Examining the Inductive Bias of Neural Language Models with Artificial
Languages
- URL: http://arxiv.org/abs/2106.01044v1
- Date: Wed, 2 Jun 2021 09:34:32 GMT
- Title: Examining the Inductive Bias of Neural Language Models with Artificial
Languages
- Authors: Jennifer C. White and Ryan Cotterell
- Abstract summary: We propose a novel method for investigating the inductive biases of language models using artificial languages.
This constitutes a fully controlled causal framework, and demonstrates how grammar engineering can serve as a useful tool for analyzing neural models.
- Score: 42.699545862522214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since language models are used to model a wide variety of languages, it is
natural to ask whether the neural architectures used for the task have
inductive biases towards modeling particular types of languages. Investigation
of these biases has proved complicated due to the many variables that appear in
the experimental setup. Languages vary in many typological dimensions, and it
is difficult to single out one or two to investigate without the others acting
as confounders. We propose a novel method for investigating the inductive
biases of language models using artificial languages. These languages are
constructed to allow us to create parallel corpora across languages that differ
only in the typological feature being investigated, such as word order. We then
use them to train and test language models. This constitutes a fully controlled
causal framework, and demonstrates how grammar engineering can serve as a
useful tool for analyzing neural models. Using this method, we find that
commonly used neural architectures exhibit different inductive biases: LSTMs
display little preference with respect to word ordering, while transformers
display a clear preference for some orderings over others. Further, we find
that neither the inductive bias of the LSTM nor that of the transformer appears
to reflect any tendencies that we see in attested natural languages.
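The idea of parallel corpora that differ only in word order can be illustrated with a small sketch. This is not the authors' grammar: the lexicon, the phrase types, and the two switches (`VP`, `PP`) below are hypothetical, chosen only to show how one set of order-free sentence structures can be linearized under every combination of head-direction settings.

```python
import random
from itertools import product

# Hypothetical toy lexicon; not the vocabulary used in the paper.
LEXICON = {
    "N": ["dog", "cat", "bird"],
    "V": ["sees", "likes"],
    "P": ["near", "with"],
}

# Assumed phrase types whose head/dependent order is controlled by a switch.
SWITCHES = ("VP", "PP")


def sample_np(rng, depth):
    """Sample a noun phrase, optionally modified by a prepositional phrase."""
    noun = rng.choice(LEXICON["N"])
    if depth < 2 and rng.random() < 0.3:
        pp = ("PP", rng.choice(LEXICON["P"]), sample_np(rng, depth + 1))
        return ("NP", noun, pp)
    return ("NP", noun, None)


def sample_tree(rng):
    """Sample an order-free sentence structure as nested tuples."""
    subj = sample_np(rng, 0)
    obj = sample_np(rng, 0)
    verb = rng.choice(LEXICON["V"])
    return ("S", subj, ("VP", verb, obj))


def linearize(node, head_first):
    """Flatten a tree into a word list under the given switch settings."""
    label = node[0]
    if label == "S":
        _, subj, vp = node
        return linearize(subj, head_first) + linearize(vp, head_first)
    if label == "NP":
        _, noun, pp = node
        words = [noun]
        if pp is not None:
            words += linearize(pp, head_first)
        return words
    # VP and PP nodes have the shape (label, head_word, dependent NP).
    _, head, dep = node
    dep_words = linearize(dep, head_first)
    return [head] + dep_words if head_first[label] else dep_words + [head]


def parallel_corpora(n_sentences, seed=0):
    """Build one corpus per switch setting from the same sampled trees."""
    rng = random.Random(seed)
    trees = [sample_tree(rng) for _ in range(n_sentences)]
    corpora = {}
    for values in product([True, False], repeat=len(SWITCHES)):
        setting = dict(zip(SWITCHES, values))
        corpora[values] = [" ".join(linearize(t, setting)) for t in trees]
    return corpora


if __name__ == "__main__":
    # Print the first sentence of each artificial language side by side.
    for setting, corpus in parallel_corpora(3, seed=0).items():
        print(dict(zip(SWITCHES, setting)), "->", corpus[0])
```

Because every corpus is produced from the same sampled trees, sentence i contains the same words in each language and only their order changes. In this toy setting, that is what lets a difference in language-model performance be attributed to word order alone.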
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
(2024-11-11T16:33:25Z)
- Injecting structural hints: Using language models to study inductive
biases in language learning [40.8902073270634]
We inject inductive bias into language models by pretraining on formally-structured data.
We then evaluate the biased learners' ability to learn typologically-diverse natural languages.
We show that non-context-free relationships form the best inductive biases.
(2023-04-25T18:00:08Z)
- Language Models as Inductive Reasoners [125.99461874008703]
We propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts.
We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
We provide the first comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts.
(2022-12-21T11:12:14Z)
- Universal and Independent: Multilingual Probing Framework for Exhaustive
Model Interpretation and Evaluation [0.04199844472131922]
We present and apply a GUI-assisted framework that allows us to easily probe a massive number of languages.
Most of the regularities revealed in the mBERT model are typical of the western-European languages.
Our framework can be integrated with the existing probing toolboxes, model cards, and leaderboards.
(2022-10-24T13:41:17Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
(2022-10-14T02:35:19Z)
- Is neural language acquisition similar to natural? A chronological
probing study [0.0515648410037406]
We present a chronological probing study of transformer English models such as MultiBERT and T5.
We compare the information about the language learned by the models in the process of training on corpora.
The results show that 1) linguistic information is acquired in the early stages of training, and 2) both language models demonstrate capabilities to capture various features from various levels of language.
(2022-07-01T17:24:11Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
(2021-08-06T23:49:18Z)
- Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
(2020-12-30T20:40:48Z)
- Universal linguistic inductive biases via meta-learning [36.43388942327124]
It is unclear which inductive biases can explain observed patterns in language acquisition.
We introduce a framework for giving linguistic inductive biases to a neural network model.
We demonstrate this framework with a case study based on syllable structure.
(2020-06-29T19:15:10Z)
- Learning Music Helps You Read: Using Transfer to Study Linguistic
Structure in Language Models [27.91397366776451]
Training LSTMs on latent structure (MIDI music or Java code) improves test performance on natural language.
Experiments on transfer between natural languages controlling for vocabulary overlap show that zero-shot performance on a test language is highly correlated with typological similarity to the training language.
(2020-04-30T06:24:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.