Recurrent babbling: evaluating the acquisition of grammar from limited
input data
- URL: http://arxiv.org/abs/2010.04637v1
- Date: Fri, 9 Oct 2020 15:30:05 GMT
- Title: Recurrent babbling: evaluating the acquisition of grammar from limited
input data
- Authors: Ludovica Pannitto and Aur\'elie Herbelot
- Abstract summary: Recurrent Neural Networks (RNNs) have been shown to capture various aspects of syntax from raw linguistic input.
This paper remedies this state of affairs by training a Long Short-Term Memory network (LSTM) over a realistically sized subset of child-directed input.
- Score: 0.30458514384586405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent Neural Networks (RNNs) have been shown to capture various aspects
of syntax from raw linguistic input. In most previous experiments, however,
learning happens over unrealistic corpora, which do not reflect the type and
amount of data a child would be exposed to. This paper remedies this state of
affairs by training a Long Short-Term Memory network (LSTM) over a
realistically sized subset of child-directed input. The behaviour of the
network is analysed over time using a novel methodology which consists in
quantifying the level of grammatical abstraction in the model's generated
output (its "babbling"), compared to the language it has been exposed to. We
show that the LSTM indeed abstracts new structuresas learning proceeds.
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
arXiv Detail & Related papers (2024-11-11T16:33:25Z) - Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods like typos and word order shuffling, resonating with human cognitive patterns, and enabling perturbation to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z) - A systematic investigation of learnability from single child linguistic input [12.279543223376935]
Language models (LMs) have demonstrated remarkable proficiency in generating linguistically coherent text.
However, a significant gap exists between the training data for these models and the linguistic input a child receives.
Our research focuses on training LMs on subsets of a single child's linguistic input.
arXiv Detail & Related papers (2024-02-12T18:58:58Z) - Evaluating Neural Language Models as Cognitive Models of Language
Acquisition [4.779196219827507]
We argue that some of the most prominent benchmarks for evaluating the syntactic capacities of neural language models may not be sufficiently rigorous.
When trained on small-scale data modeling child language acquisition, the LMs can be readily matched by simple baseline models.
We conclude with suggestions for better connecting LMs with the empirical study of child language acquisition.
arXiv Detail & Related papers (2023-10-31T00:16:17Z) - Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z) - Preliminary study on using vector quantization latent spaces for TTS/VC
systems with consistent performance [55.10864476206503]
We investigate the use of quantized vectors to model the latent linguistic embedding.
By enforcing different policies over the latent spaces in the training, we are able to obtain a latent linguistic embedding.
Our experiments show that the voice cloning system built with vector quantization has only a small degradation in terms of perceptive evaluations.
arXiv Detail & Related papers (2021-06-25T07:51:35Z) - Cross-lingual Approach to Abstractive Summarization [0.0]
Cross-lingual model transfers are successfully applied in low-resource languages.
We used a pretrained English summarization model based on deep neural networks and sequence-to-sequence architecture.
We developed several models with different proportions of target language data for fine-tuning.
arXiv Detail & Related papers (2020-12-08T09:30:38Z) - Multi-timescale Representation Learning in LSTM Language Models [69.98840820213937]
Language models must capture statistical dependencies between words at timescales ranging from very short to very long.
We derived a theory for how the memory gating mechanism in long short-term memory language models can capture power law decay.
Experiments showed that LSTM language models trained on natural English text learn to approximate this theoretical distribution.
arXiv Detail & Related papers (2020-09-27T02:13:38Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language
Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z) - An Evaluation of Recent Neural Sequence Tagging Models in Turkish Named
Entity Recognition [5.161531917413708]
We propose a transformer-based network with a conditional random field layer that leads to the state-of-the-art result.
Our study contributes to the literature that quantifies the impact of transfer learning on processing morphologically rich languages.
arXiv Detail & Related papers (2020-05-14T06:54:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.