Generative Adversarial Phonology: Modeling unsupervised phonetic and
phonological learning with neural networks
- URL: http://arxiv.org/abs/2006.03965v1
- Date: Sat, 6 Jun 2020 20:31:23 GMT
- Title: Generative Adversarial Phonology: Modeling unsupervised phonetic and
phonological learning with neural networks
- Authors: Gašper Beguš
- Abstract summary: Training deep neural networks on well-understood dependencies in speech data can provide new insights into how they learn internal representations.
This paper argues that acquisition of speech can be modeled as a dependency between random space and generated speech data in the Generative Adversarial Network architecture.
We propose a methodology to uncover the network's internal representations that correspond to phonetic and phonological properties.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep neural networks on well-understood dependencies in speech data
can provide new insights into how they learn internal representations. This
paper argues that acquisition of speech can be modeled as a dependency between
random space and generated speech data in the Generative Adversarial Network
architecture and proposes a methodology to uncover the network's internal
representations that correspond to phonetic and phonological properties. The
Generative Adversarial architecture is uniquely appropriate for modeling
phonetic and phonological learning because the network is trained on
unannotated raw acoustic data and learning is unsupervised without any
language-specific assumptions or pre-assumed levels of abstraction. A
Generative Adversarial Network was trained on an allophonic distribution in
English. The network successfully learns the allophonic alternation: the
network's generated speech signal contains the conditional distribution of
aspiration duration. The paper proposes a technique for establishing the
network's internal representations that identifies latent variables that
correspond to, for example, presence of [s] and its spectral properties. By
manipulating these variables, we actively control the presence of [s] and its
frication amplitude in the generated outputs. This suggests that the network
learns to use latent variables as an approximation of phonetic and phonological
representations. Crucially, we observe that the dependencies learned in
training extend beyond the training interval, which allows for additional
exploration of learning representations. The paper also discusses how the
network's architecture and innovative outputs resemble and differ from
linguistic behavior in language acquisition, speech disorders, and speech
errors, and how well-understood dependencies in speech data can help us
interpret how neural networks learn their representations.
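The probing technique described in the abstract lends itself to a compact illustration. The following is a minimal sketch, not the paper's released code: it assumes a WaveGAN-style setup, and the names ToyGenerator, probe_latent, hf_energy, Z_DIM, and all hyperparameters are hypothetical stand-ins. A 1-D convolutional generator maps latent vectors to raw waveforms; one latent dimension is then held at fixed values, including values outside the uniform training interval [-1, 1], while a crude high-frequency energy measure stands in for frication amplitude.

```python
# Minimal sketch (assumptions throughout, not the paper's implementation) of
# the latent-variable probing technique: hold one latent dimension at fixed
# values -- including values beyond the training interval of the uniform
# prior -- and inspect the acoustic consequences in the generated outputs.
import torch
import torch.nn as nn

Z_DIM = 100        # latent dimensionality, as in DCGAN/WaveGAN-style models
AUDIO_LEN = 16384  # samples per generated clip (~1 s at 16 kHz)

class ToyGenerator(nn.Module):
    """Upsampling 1-D conv stack mapping z to a waveform (WaveGAN-like in spirit)."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(Z_DIM, 256 * 16)
        def up(cin, cout):  # each block upsamples the signal 4x
            return [nn.ConvTranspose1d(cin, cout, kernel_size=25, stride=4,
                                       padding=11, output_padding=1), nn.ReLU()]
        self.net = nn.Sequential(
            *up(256, 128), *up(128, 64), *up(64, 32), *up(32, 16),
            nn.ConvTranspose1d(16, 1, kernel_size=25, stride=4, padding=11,
                               output_padding=1),
            nn.Tanh(),  # waveform in [-1, 1]; length 16 * 4**5 = 16384 samples
        )

    def forward(self, z):
        return self.net(self.fc(z).view(-1, 256, 16))

@torch.no_grad()
def probe_latent(gen, dim, values, n=8):
    """Fix latent dimension `dim` at each value while sampling the rest from
    the training-time prior U(-1, 1); values outside [-1, 1] test whether the
    learned dependency extends beyond the training interval."""
    out = {}
    for v in values:
        z = torch.rand(n, Z_DIM) * 2 - 1
        z[:, dim] = v
        out[v] = gen(z)
    return out

def hf_energy(wav):
    """Crude proxy for frication amplitude: RMS of the first difference,
    which emphasizes the high frequencies where [s] noise concentrates."""
    return (wav[..., 1:] - wav[..., :-1]).pow(2).mean(dim=-1).sqrt()

if __name__ == "__main__":
    gen = ToyGenerator()  # untrained here; the paper probes a trained network
    probes = probe_latent(gen, dim=5, values=[-2.0, -1.0, 0.0, 1.0, 2.0])
    for v, audio in probes.items():
        print(f"z[5] = {v:+.1f}  mean high-freq energy = {hf_energy(audio).mean():.4f}")
```

In the paper's setting the generator is first trained adversarially on unannotated raw speech; the point of the sketch is only the probing loop, which pairs each forced latent value with a measurable acoustic property of the generated output.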
Related papers
- Explaining Spectrograms in Machine Learning: A Study on Neural Networks for Speech Classification [2.4472308031704073]
This study investigates discriminative patterns learned by neural networks for accurate speech classification.
By examining the activations and features of neural networks for vowel classification, we gain insights into what the networks "see" in spectrograms.
arXiv Detail & Related papers (2024-07-10T07:37:18Z)
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Color Overmodification Emerges from Data-Driven Learning and Pragmatic Reasoning [53.088796874029974]
We show that speakers' referential expressions depart from communicative ideals in ways that help illuminate the nature of pragmatic language use.
By adopting neural networks as learning agents, we show that overmodification is more likely with environmental features that are infrequent or salient.
arXiv Detail & Related papers (2022-05-18T18:42:43Z)
- Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition [48.56414496900755]
This work uses a neural implementation of convolutive sparse matrix factorization to decompose the articulatory data into interpretable gestures and gestural scores.
Phoneme recognition experiments were additionally performed to show that gestural scores indeed code phonological information successfully.
arXiv Detail & Related papers (2022-04-01T14:25:19Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language [148.0843278195794]
We propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning.
Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions.
arXiv Detail & Related papers (2020-11-18T20:21:19Z)
- Deep Sound Change: Deep and Iterative Learning, Convolutional Neural Networks, and Language Change [0.0]
This paper proposes a framework for modeling sound change that combines deep learning and iterative learning.
It argues that several properties of sound change emerge from the proposed architecture.
arXiv Detail & Related papers (2020-11-10T23:49:09Z)
- Local and non-local dependency learning and emergence of rule-like representations in speech data by Deep Convolutional Generative Adversarial Networks [0.0]
This paper argues that training GANs on local and non-local dependencies in speech data offers insights into how deep neural networks discretize continuous data.
arXiv Detail & Related papers (2020-09-27T00:02:34Z)
- CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks [0.0]
Lexical learning is modeled as emergent from an architecture that forces a deep neural network to output data such that unique information is retrievable from its acoustic outputs.
Networks trained on lexical items from TIMIT learn to encode unique information corresponding to lexical items in the form of categorical variables in their latent space.
We show that phonetic and phonological representations learned by the network can be productively recombined and directly paralleled to productivity in human speech.
arXiv Detail & Related papers (2020-06-04T15:33:55Z)
- Untangling in Invariant Speech Recognition [17.996356271398295]
We study how information is untangled within neural networks trained to recognize speech.
We observe speaker-specific nuisance variations are discarded by the network's hierarchy, whereas task-relevant properties are untangled in later layers.
We find that the deep representations carry out significant temporal untangling by efficiently extracting task-relevant features at each time step of the computation.
arXiv Detail & Related papers (2020-03-03T20:48:43Z)