Local and non-local dependency learning and emergence of rule-like
representations in speech data by Deep Convolutional Generative Adversarial
Networks
- URL: http://arxiv.org/abs/2009.12711v2
- Date: Wed, 28 Jul 2021 07:28:49 GMT
- Title: Local and non-local dependency learning and emergence of rule-like
representations in speech data by Deep Convolutional Generative Adversarial
Networks
- Authors: Gašper Beguš
- Abstract summary: This paper argues that training GANs on local and non-local dependencies in speech data offers insights into how deep neural networks discretize continuous data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper argues that training GANs on local and non-local dependencies in
speech data offers insights into how deep neural networks discretize continuous
data and how symbolic-like rule-based morphophonological processes emerge in a
deep convolutional architecture. Acquisition of speech has recently been
modeled as a dependency between latent space and data generated by GANs in
Beguš (2020b; arXiv:2006.03965), who models learning of a simple local
allophonic distribution. We extend this approach to test learning of local and
non-local phonological processes that include approximations of morphological
processes. We further parallel outputs of the model to results of a behavioral
experiment where human subjects are trained on the data used for training the
GAN network. Four main conclusions emerge: (i) the networks provide useful
information for computational models of speech acquisition even if trained on a
comparatively small dataset of an artificial grammar learning experiment; (ii)
local processes are easier to learn than non-local processes, which matches
both behavioral data in human subjects and typology in the world's languages.
This paper also proposes (iii) how we can actively observe the network's
progress in learning and explore the effect of training steps on learning
representations by keeping the latent space constant across different training
steps. Finally, this paper shows that (iv) the network learns to encode the
presence of a prefix with a single latent variable; by interpolating this
variable, we can actively observe the operation of a non-local phonological
process. The proposed technique for retrieving learning representations has
general implications for our understanding of how GANs discretize continuous
speech data and suggests that rule-like generalizations in the training data
are represented as an interaction between variables in the network's latent
space.
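Techniques (iii) and (iv) reduce to two short probing loops. Below is a minimal PyTorch sketch, not the paper's code: `Generator` is a simplified stand-in for a WaveGAN-style audio generator, the checkpoint paths are hypothetical, and latent dimension 7 is chosen arbitrarily for illustration.

```python
import torch

class Generator(torch.nn.Module):               # stand-in; the real model
    def __init__(self):                         # uses 1D transposed convolutions
        super().__init__()
        self.net = torch.nn.Linear(100, 16384)
    def forward(self, z):
        return torch.tanh(self.net(z))          # (batch, n_samples) waveform

torch.manual_seed(0)
z = torch.empty(1, 100).uniform_(-1, 1)         # one latent vector, held constant
generator = Generator()

# (iii) Decode the SAME z with weights from successive training steps, so any
# difference in the output reflects only what was learned between checkpoints.
for step in [1000, 5000, 10000, 50000]:
    state = torch.load(f"checkpoints/gen_{step}.pt")   # hypothetical paths
    generator.load_state_dict(state)
    with torch.no_grad():
        audio = generator(z)
    # inspect `audio` for the phonological process at this training step

# (iv) Interpolate a single latent variable while all others stay fixed, to
# probe whether that variable encodes the presence of the prefix.
for value in torch.linspace(-4.0, 4.0, steps=9):
    z_probe = z.clone()
    z_probe[0, 7] = value
    with torch.no_grad():
        audio = generator(z_probe)
    # track how prefix-related acoustic cues change with `value`
```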
Related papers
- A distributional simplicity bias in the learning dynamics of transformers [50.91742043564049]
We show that transformers, trained on natural language data, also display a simplicity bias.
Specifically, they sequentially learn many-body interactions among input tokens, reaching a saturation point in the prediction error for low-degree interactions.
This approach opens up the possibility of studying how interactions of different orders in the data affect learning, in natural language processing and beyond.
arXiv Detail & Related papers (2024-10-25T15:39:34Z)
- The mechanistic basis of data dependence and abrupt learning in an in-context classification task [0.3626013617212666]
We show that specific distributional properties inherent in language control the trade-off or simultaneous appearance of two forms of learning.
In-context learning is driven by the abrupt emergence of an induction head, which subsequently competes with in-weights learning.
We propose that the sharp transitions in attention-based networks arise due to a specific chain of multi-layer operations necessary to achieve ICL.
arXiv Detail & Related papers (2023-12-03T20:53:41Z)
- Approaching an unknown communication system by latent space exploration and causal inference [5.026037329977691]
This paper proposes a methodology for discovering meaningful properties in data by exploring the latent space of unsupervised deep generative models.
We combine manipulation of individual latent variables to extreme values with methods inspired by causal inference.
We show that this method yields insights for model interpretability.
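A minimal sketch of this style of latent-space exploration, with random stand-ins for the trained generator (`decode`) and the measured output property (`measure`); the interventional flavor comes from setting one variable at a time to values far outside the training range and regressing the measurement on the intervened value.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 100
W = rng.normal(size=(LATENT_DIM, 50))           # frozen stand-in "generator"

def decode(z):
    return np.tanh(z @ W)                       # stand-in for generated output

def measure(output):
    return float(np.abs(output).mean())         # stand-in acoustic property

effects = np.zeros(LATENT_DIM)
for i in range(LATENT_DIM):
    readings = []
    # Intervene on variable i only, pushing it to extreme values far
    # outside the training range of [-1, 1].
    for value in np.linspace(-15, 15, 31):
        z = rng.uniform(-1, 1, size=LATENT_DIM)
        z[i] = value                            # do(z_i = value)
        readings.append((value, measure(decode(z))))
    vals, props = map(np.array, zip(*readings))
    effects[i] = np.polyfit(vals, props, 1)[0]  # slope ~ size of the effect

print("variables ranked by effect size:", np.argsort(-np.abs(effects))[:5])
```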
arXiv Detail & Related papers (2023-03-20T08:09:13Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework, called Feed-Forward Neural-Symbolic Learner (FF-NSL)
FF-NSL integrates state-of-the-art ILP systems based on Answer Set semantics with neural networks in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- Reprogramming Language Models for Molecular Representation Learning [65.00999660425731]
We propose Representation Reprogramming via Dictionary Learning (R2DL) for adversarially reprogramming pretrained language models for molecular learning tasks.
The adversarial program learns a linear transformation between a dense source model input space (language data) and a sparse target model input space (e.g., chemical and biological molecule data) using a k-SVD solver.
R2DL matches the baseline established by state-of-the-art toxicity prediction models trained on domain-specific data and outperforms it in a limited training-data setting.
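A rough sketch of the encoding step, using scikit-learn's `SparseCoder` with OMP as a stand-in for the paper's k-SVD solver; the embedding and feature matrices below are random placeholders.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)
vocab_embeddings = rng.normal(size=(5000, 64))          # stand-in LM embeddings
vocab_embeddings /= np.linalg.norm(vocab_embeddings, axis=1, keepdims=True)
molecule_features = rng.normal(size=(32, 64))           # stand-in target inputs

coder = SparseCoder(
    dictionary=vocab_embeddings,                        # dense source space
    transform_algorithm="omp",
    transform_n_nonzero_coefs=8,                        # sparsity of target space
)
codes = coder.transform(molecule_features)              # (32, 5000) sparse codes
# Each row expresses a molecule as a sparse mixture of "token" directions
# that a frozen language model could consume in place of real token embeddings.
print(codes.shape, int((codes[0] != 0).sum()))          # (32, 5000) 8
```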
arXiv Detail & Related papers (2020-12-07T05:50:27Z)
- Network Classifiers Based on Social Learning [71.86764107527812]
We propose a new way of combining independently trained classifiers over space and time.
The proposed architecture is able to improve prediction performance over time with unlabeled data.
We show that this strategy results in consistent learning with high probability, and it yields a robust structure against poorly trained classifiers.
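An illustrative sketch of a social-learning-style combination rule, not the paper's exact algorithm: each classifier repeatedly fuses its own softmax output with a geometric average of its neighbors' current beliefs over a doubly stochastic network.

```python
import numpy as np

rng = np.random.default_rng(0)
N_CLASSIFIERS, N_CLASSES = 5, 3
# Stand-in for each classifier's softmax output on one unlabeled sample.
local = rng.dirichlet(np.ones(N_CLASSES), size=N_CLASSIFIERS)
# Doubly stochastic combination weights over a ring network.
A = np.zeros((N_CLASSIFIERS, N_CLASSIFIERS))
for k in range(N_CLASSIFIERS):
    A[k, k] = A[k, (k + 1) % N_CLASSIFIERS] = A[k, k - 1] = 1 / 3

beliefs = np.full((N_CLASSIFIERS, N_CLASSES), 1 / N_CLASSES)
for _ in range(50):  # repeated exchanges over streaming unlabeled data
    log_update = np.log(local) + A @ np.log(beliefs)  # geometric averaging
    beliefs = np.exp(log_update)
    beliefs /= beliefs.sum(axis=1, keepdims=True)

print(beliefs.round(3))  # classifiers converge toward a shared belief
```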
arXiv Detail & Related papers (2020-10-23T11:18:20Z)
- Systematic Generalization on gSCAN with Language Conditioned Embedding [19.39687991647301]
Systematic Generalization refers to a learning algorithm's ability to extrapolate learned behavior to unseen situations.
We propose a novel method that learns objects' contextualized embeddings with dynamic message passing conditioned on the input natural language.
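A toy sketch of language-conditioned message passing, with random placeholders for the object and instruction embeddings; the real model's gating and attention are more elaborate.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_objects, d = 6, 32
objects = torch.randn(n_objects, d)     # per-object embeddings from the grid
instruction = torch.randn(d)            # pooled encoding of the command
W_msg = torch.nn.Linear(d, d)

for _ in range(3):                      # a few rounds of message passing
    # Edge weights depend on both the objects and the instruction.
    scores = objects @ objects.T + (objects @ instruction).unsqueeze(0)
    attn = F.softmax(scores / d ** 0.5, dim=-1)
    messages = attn @ W_msg(objects)    # aggregate neighbours' messages
    objects = F.relu(objects + messages)

print(objects.shape)                    # (6, 32), now conditioned on the command
```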
arXiv Detail & Related papers (2020-09-11T17:35:05Z)
- Embodied Self-supervised Learning by Coordinated Sampling and Training [14.107020105091662]
We propose a novel self-supervised approach to solve inverse problems by employing the corresponding physical forward process.
The proposed approach works in an analysis-by-synthesis manner to learn an inference network by iteratively sampling and training.
We prove the feasibility of the proposed method by tackling the acoustic-to-articulatory inversion problem to infer articulatory information from speech.
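A schematic sketch of the sampling-and-training loop, with a random linear map standing in for the physical forward process (articulation to acoustics); all names and sizes are illustrative.

```python
import torch

torch.manual_seed(0)
W_phys = torch.randn(10, 40)            # stand-in physics: articulation -> acoustics

def forward_process(a):
    return torch.tanh(a @ W_phys)

speech = forward_process(torch.randn(8, 10))   # observed utterances (8, 40)

inference_net = torch.nn.Sequential(
    torch.nn.Linear(40, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10))
opt = torch.optim.Adam(inference_net.parameters(), lr=1e-3)

for step in range(200):
    # 1) Sample candidate articulatory trajectories around the current guess.
    with torch.no_grad():
        guess = inference_net(speech)                            # (8, 10)
        cand = guess.unsqueeze(1) + 0.1 * torch.randn(8, 16, 10)
        # 2) Synthesize candidates; keep the best acoustic match per item.
        err = ((forward_process(cand) - speech.unsqueeze(1)) ** 2).mean(-1)
        best = cand[torch.arange(8), err.argmin(1)]              # (8, 10)
    # 3) Train the inference network toward the selected candidates.
    loss = ((inference_net(speech) - best) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```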
arXiv Detail & Related papers (2020-06-20T14:05:47Z)
- Generative Adversarial Phonology: Modeling unsupervised phonetic and phonological learning with neural networks [0.0]
Training deep neural networks on well-understood dependencies in speech data can provide new insights into how they learn internal representations.
This paper argues that acquisition of speech can be modeled as a dependency between random space and generated speech data in the Generative Adversarial Network architecture.
We propose a methodology to uncover the network's internal representations that correspond to phonetic and phonological properties.
arXiv Detail & Related papers (2020-06-06T20:31:23Z)
- Data Augmentation for Spoken Language Understanding via Pretrained Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
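A minimal sketch of the idea using GPT-2 through Hugging Face's `pipeline` as a generic stand-in; the paper's method additionally fine-tunes the pretrained LM on the labelled utterances and filters what it generates.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
seed = "set an alarm for six thirty tomorrow morning"

# Sample several variants from the same seed utterance.
variants = generator(seed, max_new_tokens=15, num_return_sequences=3,
                     do_sample=True, temperature=0.9)
for v in variants:
    print(v["generated_text"])
# In practice the generated utterances are filtered (e.g., by an SLU model's
# confidence) before being added to the training set.
```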
arXiv Detail & Related papers (2020-04-29T04:07:12Z)