CiwGAN and fiwGAN: Encoding information in acoustic data to model
lexical learning with Generative Adversarial Networks
- URL: http://arxiv.org/abs/2006.02951v3
- Date: Wed, 28 Jul 2021 10:31:31 GMT
- Authors: Gašper Beguš
- Abstract summary: Lexical learning is modeled as emergent from an architecture that forces a deep neural network to output data such that unique information is retrievable from its acoustic outputs.
Networks trained on lexical items from TIMIT learn to encode unique information corresponding to lexical items in the form of categorical variables in their latent space.
We show that phonetic and phonological representations learned by the network can be productively recombined and directly paralleled to productivity in human speech.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How can deep neural networks encode information that corresponds to words in
human speech into raw acoustic data? This paper proposes two neural network
architectures for modeling unsupervised lexical learning from raw acoustic
inputs, ciwGAN (Categorical InfoWaveGAN) and fiwGAN (Featural InfoWaveGAN),
that combine a Deep Convolutional GAN architecture for audio data (WaveGAN;
arXiv:1705.07904) with an information-theoretic extension of GANs, InfoGAN
(arXiv:1606.03657), and propose a new latent space structure that can model
featural learning simultaneously with a higher-level classification and allows
for a very low-dimensional vector representation of lexical items. Lexical
learning is modeled as emergent from an architecture that forces a deep neural
network to output data such that unique information is retrievable from its
acoustic outputs. The networks trained on lexical items from TIMIT learn to
encode unique information corresponding to lexical items in the form of
categorical variables in their latent space. By manipulating these variables,
the network outputs specific lexical items. The network occasionally outputs
innovative lexical items that violate training data, but are linguistically
interpretable and highly informative for cognitive modeling and neural network
interpretability. Innovative outputs suggest that phonetic and phonological
representations learned by the network can be productively recombined and
directly paralleled to productivity in human speech: a fiwGAN network trained
on 'suit' and 'dark' outputs innovative 'start', even though it never saw
'start' or even a [st] sequence in the training data. We also argue that
setting latent featural codes to values well beyond training range results in
almost categorical generation of prototypical lexical items and reveals
underlying values of each latent code.
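To make the proposed latent structure concrete, below is a minimal, self-contained PyTorch sketch of the setup the abstract describes: an InfoGAN-style latent vector whose first few dimensions are a binary featural code (fiwGAN; a one-hot code gives ciwGAN), a WaveGAN-like 1-D convolutional generator, and an auxiliary Q-network trained to recover the code from the generated waveform. All names, layer sizes, and hyperparameters here (Generator, QNetwork, FEATURE_DIM, and so on) are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only: fiwGAN-style latent structure (binary featural
# code + uninformed noise), a WaveGAN-like generator, and an InfoGAN-style
# Q-network that must retrieve the code from the generated waveform.
import torch
import torch.nn as nn

FEATURE_DIM = 5     # binary featural code (fiwGAN); use one-hot for ciwGAN
NOISE_DIM = 95      # remaining uninformed noise dimensions
AUDIO_LEN = 16384   # roughly 1 s of 16 kHz audio, as in WaveGAN

def up(in_ch, out_ch):
    # transposed conv that exactly quadruples the sequence length
    return nn.ConvTranspose1d(in_ch, out_ch, kernel_size=25, stride=4,
                              padding=11, output_padding=1)

def down(in_ch, out_ch):
    # strided conv that exactly quarters the sequence length
    return nn.Conv1d(in_ch, out_ch, kernel_size=25, stride=4, padding=11)

class Generator(nn.Module):
    """Maps a latent vector (featural code + noise) to a raw waveform."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(FEATURE_DIM + NOISE_DIM, 512 * 16)
        self.net = nn.Sequential(
            up(512, 256), nn.ReLU(),
            up(256, 128), nn.ReLU(),
            up(128, 64), nn.ReLU(),
            up(64, 32), nn.ReLU(),
            up(32, 1), nn.Tanh(),       # 16 * 4**5 = 16384 samples
        )
    def forward(self, z):
        return self.net(self.fc(z).view(-1, 512, 16))

class QNetwork(nn.Module):
    """Predicts the featural code from audio; in InfoGAN-style training
    Q typically shares most convolutional layers with the discriminator."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            down(1, 32), nn.LeakyReLU(0.2),
            down(32, 64), nn.LeakyReLU(0.2),
            down(64, 128), nn.LeakyReLU(0.2),
            down(128, 256), nn.LeakyReLU(0.2),
            down(256, 512), nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(512 * 16, FEATURE_DIM),   # one logit per binary feature
        )
    def forward(self, audio):
        return self.net(audio)

# The "information" pressure: sample a random binary code, generate audio,
# and require Q to retrieve the code. This term, added to the usual GAN
# losses, is what forces unique information into the acoustic outputs.
G, Q = Generator(), QNetwork()
bce = nn.BCEWithLogitsLoss()

code = torch.randint(0, 2, (8, FEATURE_DIM)).float()
noise = torch.rand(8, NOISE_DIM) * 2 - 1      # uniform(-1, 1) noise
audio = G(torch.cat([code, noise], dim=1))    # shape (8, 1, 16384)
info_loss = bce(Q(audio), code)

# Setting code values well beyond the training range (here 0/1 pushed to
# -4/6) probes the abstract's claim that extreme codes yield near-categorical,
# prototypical lexical items.
extreme = torch.cat([code * 10 - 4, noise], dim=1)
prototypical = G(extreme)
```

Manipulating individual code bits at generation time then corresponds to the probing described in the abstract: holding the noise fixed while flipping or scaling one featural variable and inspecting which lexical item the output realizes.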
Related papers
- Deep Learning for real-time neural decoding of grasp
We present a Deep Learning-based approach to the decoding of neural signals for grasp type classification.
The main goal of the presented approach is to improve over state-of-the-art decoding accuracy without relying on any prior neuroscience knowledge.
arXiv Detail & Related papers (2023-11-02T08:26:29Z)
- Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data
We introduce, to our knowledge, the most challenging objective in unsupervised lexical learning: an unsupervised network that must learn to assign unique representations for lexical items.
Strong evidence in favor of lexical learning emerges.
The architecture that combines the production and perception principles is thus able to learn to decode unique information from raw acoustic data in an unsupervised manner without ever accessing real training data.
arXiv Detail & Related papers (2022-03-22T06:04:34Z)
- Data-driven emergence of convolutional structure in neural networks
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Dynamic Inference with Neural Interpreters
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner
This paper introduces a neural-symbolic learning framework called the Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on Answer Set semantics with neural networks in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- NSL: Hybrid Interpretable Learning From Noisy Raw Data
This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data.
NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics.
We demonstrate that NSL is able to learn robust rules from MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines.
arXiv Detail & Related papers (2020-12-09T13:02:44Z)
- Deep Sound Change: Deep and Iterative Learning, Convolutional Neural Networks, and Language Change
This paper proposes a framework for modeling sound change that combines deep learning and iterative learning.
It argues that several properties of sound change emerge from the proposed architecture.
arXiv Detail & Related papers (2020-11-10T23:49:09Z)
- Local and non-local dependency learning and emergence of rule-like representations in speech data by Deep Convolutional Generative Adversarial Networks
This paper argues that training GANs on local and non-local dependencies in speech data offers insights into how deep neural networks discretize continuous data.
arXiv Detail & Related papers (2020-09-27T00:02:34Z)
- Reservoir Memory Machines as Neural Computers
Differentiable neural computers extend artificial neural networks with an explicit memory that avoids interference.
We achieve some of the computational capabilities of differentiable neural computers with a model that can be trained very efficiently.
arXiv Detail & Related papers (2020-09-14T12:01:30Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory
We propose a novel incrementally trained recurrent architecture that explicitly targets multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
- Generative Adversarial Phonology: Modeling unsupervised phonetic and phonological learning with neural networks
Training deep neural networks on well-understood dependencies in speech data can provide new insights into how they learn internal representations.
This paper argues that acquisition of speech can be modeled as a dependency between random space and generated speech data in the Generative Adversarial Network architecture.
We propose a methodology to uncover the network's internal representations that correspond to phonetic and phonological properties.
arXiv Detail & Related papers (2020-06-06T20:31:23Z)