Learning Music Helps You Read: Using Transfer to Study Linguistic
Structure in Language Models
- URL: http://arxiv.org/abs/2004.14601v3
- Date: Fri, 30 Oct 2020 17:41:21 GMT
- Title: Learning Music Helps You Read: Using Transfer to Study Linguistic
Structure in Language Models
- Authors: Isabel Papadimitriou and Dan Jurafsky
- Abstract summary: Training LSTMs on latent structure (MIDI music or Java code) improves test performance on natural language.
Experiments on transfer between natural languages controlling for vocabulary overlap show that zero-shot performance on a test language is highly correlated with typological similarity to the training language.
- Score: 27.91397366776451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose transfer learning as a method for analyzing the encoding of
grammatical structure in neural language models. We train LSTMs on
non-linguistic data and evaluate their performance on natural language to
assess which kinds of data induce generalizable structural features that LSTMs
can use for natural language. We find that training on non-linguistic data with
latent structure (MIDI music or Java code) improves test performance on natural
language, despite no overlap in surface form or vocabulary. To pinpoint the
kinds of abstract structure that models may be encoding to lead to this
improvement, we run similar experiments with two artificial parentheses
languages: one which has a hierarchical recursive structure, and a control
which has paired tokens but no recursion. Surprisingly, training a model on
either of these artificial languages leads to the same substantial gains when
testing on natural language. Further experiments on transfer between natural
languages controlling for vocabulary overlap show that zero-shot performance on
a test language is highly correlated with typological syntactic similarity to
the training language, suggesting that representations induced by pre-training
correspond to cross-linguistic syntactic properties. Our results provide
insights into the ways that neural models represent abstract syntactic
structure, and into the kinds of structural inductive biases that allow for
natural language acquisition.
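As a concrete illustration of the two artificial parentheses languages, the following Python sketch generates toy corpora for the hierarchical (recursively nested) language and for a non-recursive control in which every opening token is eventually paired with a matching closing token. The vocabulary size, target length, the 0.5 close probability, and the first-in-first-out pairing scheme used for the control are illustrative assumptions, not the authors' released generation code.

```python
# Toy generators for two artificial parentheses languages:
#   - nested:  pairs close in LIFO order, so structure is recursive/hierarchical
#   - flat:    pairs close in FIFO order (an assumed non-recursive control)
import random

VOCAB_SIZE = 100   # number of distinct bracket types (illustrative)
TARGET_LEN = 64    # approximate sequence length (illustrative)

def sample_nested(rng, target_len=TARGET_LEN):
    """Hierarchical language: brackets close in last-in, first-out order."""
    seq, stack = [], []
    while len(seq) < target_len:
        must_close = len(seq) + len(stack) >= target_len
        if stack and (must_close or rng.random() < 0.5):
            seq.append(f"close_{stack.pop()}")          # close the most recent open bracket
        else:
            t = rng.randrange(VOCAB_SIZE)
            stack.append(t)
            seq.append(f"open_{t}")
    seq.extend(f"close_{t}" for t in reversed(stack))   # close anything still open
    return seq

def sample_flat(rng, target_len=TARGET_LEN):
    """Control language: every open token is eventually matched by its close
    token, but pairs close in first-in, first-out order, so nothing nests."""
    seq, queue = [], []
    while len(seq) < target_len:
        must_close = len(seq) + len(queue) >= target_len
        if queue and (must_close or rng.random() < 0.5):
            seq.append(f"close_{queue.pop(0)}")         # close the oldest open bracket
        else:
            t = rng.randrange(VOCAB_SIZE)
            queue.append(t)
            seq.append(f"open_{t}")
    seq.extend(f"close_{t}" for t in queue)
    return seq

if __name__ == "__main__":
    rng = random.Random(0)
    print(" ".join(sample_nested(rng, 16)))
    print(" ".join(sample_flat(rng, 16)))
```

In the experimental setup described above, an LSTM language model pretrained on corpora like these (or on MIDI music or Java code) would then be evaluated on natural language text to measure how much of the learned structure transfers.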
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
arXiv Detail & Related papers (2024-11-11T16:33:25Z) - Transparency at the Source: Evaluating and Interpreting Language Models
With Access to the True Distribution [4.01799362940916]
We present a setup for training, evaluating, and interpreting neural language models that uses artificial, language-like data.
The data is generated using a massive probabilistic grammar that is itself derived from a large natural language corpus.
With access to the underlying true source, our results show striking differences in learning dynamics between different classes of words.
arXiv Detail & Related papers (2023-10-23T12:03:01Z) - Benchmarking Language Models for Code Syntax Understanding [79.11525961219591]
Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding.
In this work, we perform the first thorough benchmarking of the state-of-the-art pre-trained models for identifying the syntactic structures of programs.
Our findings point out key limitations of existing pre-training methods for programming languages, and suggest the importance of modeling code syntactic structures.
arXiv Detail & Related papers (2022-10-26T04:47:18Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Is neural language acquisition similar to natural? A chronological
probing study [0.0515648410037406]
We present a chronological probing study of English transformer models such as MultiBERT and T5.
We compare the information about language that the models learn over the course of training on their corpora.
The results show that 1) linguistic information is acquired in the early stages of training, and 2) both language models demonstrate the capability to capture features from various levels of language.
arXiv Detail & Related papers (2022-07-01T17:24:11Z) - Linking Emergent and Natural Languages via Corpus Transfer [98.98724497178247]
We propose corpus transfer as a novel way to establish a link between emergent languages and natural languages.
Our approach showcases non-trivial transfer benefits for two different tasks -- language modeling and image captioning.
We also introduce a novel metric to predict the transferability of an emergent language by translating emergent messages to natural language captions grounded on the same images.
arXiv Detail & Related papers (2022-03-24T21:24:54Z) - Pretraining with Artificial Language: Studying Transferable Knowledge in
Language Models [32.27333420000134]
We investigate what kind of structural knowledge learned in neural network encoders is transferable to processing natural language.
We design artificial languages with structural properties that mimic natural language, pretrain encoders on the data, and measure how well the encoders perform on downstream natural language tasks.
arXiv Detail & Related papers (2022-03-19T13:29:48Z) - Low-Dimensional Structure in the Space of Language Representations is
Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z) - Examining the Inductive Bias of Neural Language Models with Artificial
Languages [42.699545862522214]
We propose a novel method for investigating the inductive biases of language models using artificial languages.
This constitutes a fully controlled causal framework, and demonstrates how grammar engineering can serve as a useful tool for analyzing neural models.
arXiv Detail & Related papers (2021-06-02T09:34:32Z) - Linguistic Typology Features from Text: Inferring the Sparse Features of
World Atlas of Language Structures [73.06435180872293]
We construct a recurrent neural network predictor based on byte embeddings and convolutional layers (a schematic sketch of this kind of architecture appears after this list).
We show that some features from various linguistic types can be predicted reliably.
arXiv Detail & Related papers (2020-04-30T21:00:53Z)
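For the last entry above, the following is a minimal PyTorch sketch of what a byte-level typology predictor could look like: a byte embedding, one convolutional layer, a GRU encoder, and a multi-label head over WALS-style features. The layer sizes, kernel width, feature count, and single-logit-per-feature head are illustrative assumptions and a simplification (real WALS features are categorical with several possible values), not the architecture reported in that paper.

```python
# Schematic byte-level typology predictor (illustrative, not the paper's model).
import torch
import torch.nn as nn

class TypologyPredictor(nn.Module):
    def __init__(self, n_features: int = 192, emb_dim: int = 64,
                 conv_channels: int = 128, hidden: int = 256):
        super().__init__()
        self.byte_emb = nn.Embedding(256, emb_dim)             # one row per byte value
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel_size=5, padding=2)
        self.rnn = nn.GRU(conv_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)              # one logit per feature

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        # byte_ids: (batch, seq_len) integers in [0, 255]
        x = self.byte_emb(byte_ids)                  # (batch, seq, emb)
        x = self.conv(x.transpose(1, 2)).relu()      # (batch, channels, seq)
        _, h = self.rnn(x.transpose(1, 2))           # h: (1, batch, hidden)
        return self.head(h.squeeze(0))               # (batch, n_features) logits

# Usage: predict feature probabilities for a batch of raw-text snippets.
texts = ["Der Hund schläft.", "El perro duerme."]
batch = torch.nn.utils.rnn.pad_sequence(
    [torch.tensor(list(t.encode("utf-8"))) for t in texts], batch_first=True)
probs = torch.sigmoid(TypologyPredictor()(batch))
```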