Universal linguistic inductive biases via meta-learning
- URL: http://arxiv.org/abs/2006.16324v1
- Date: Mon, 29 Jun 2020 19:15:10 GMT
- Title: Universal linguistic inductive biases via meta-learning
- Authors: R. Thomas McCoy, Erin Grant, Paul Smolensky, Thomas L. Griffiths, Tal Linzen
- Abstract summary: It is unclear which inductive biases can explain observed patterns in language acquisition.
We introduce a framework for giving linguistic inductive biases to a neural network model.
We demonstrate this framework with a case study based on syllable structure.
- Score: 36.43388942327124
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How do learners acquire languages from the limited data available to them?
This process must involve some inductive biases - factors that affect how a
learner generalizes - but it is unclear which inductive biases can explain
observed patterns in language acquisition. To facilitate computational modeling
aimed at addressing this question, we introduce a framework for giving
particular linguistic inductive biases to a neural network model; such a model
can then be used to empirically explore the effects of those inductive biases.
This framework disentangles universal inductive biases, which are encoded in
the initial values of a neural network's parameters, from non-universal
factors, which the neural network must learn from data in a given language. The
initial state that encodes the inductive biases is found with meta-learning, a
technique through which a model discovers how to acquire new languages more
easily via exposure to many possible languages. By controlling the properties
of the languages that are used during meta-learning, we can control the
inductive biases that meta-learning imparts. We demonstrate this framework with
a case study based on syllable structure. First, we specify the inductive
biases that we intend to give our model, and then we translate those inductive
biases into a space of languages from which a model can meta-learn. Finally,
using existing analysis techniques, we verify that our approach has imparted
the linguistic inductive biases that it was intended to impart.
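The abstract's core idea, encoding inductive biases in a network's initial parameters by meta-learning across many possible languages, can be sketched in miniature. The toy below uses Reptile (a first-order relative of MAML) on a family of one-parameter regression "languages" y = a*x; the task family, function names, and hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
import random

# A minimal sketch: meta-learn an initial parameter value that encodes a
# bias over a family of toy "languages". Each language is a regression
# task y = a * x with a task-specific slope a. We use Reptile, a
# first-order relative of MAML; everything here is illustrative.

XS = [i / 5 for i in range(-5, 6) if i != 0]  # shared inputs

def loss_grad(w, a):
    # d/dw of the mean squared error between w*x and the target a*x
    return sum(2 * (w - a) * x * x for x in XS) / len(XS)

def adapt(w, a, lr=0.05, steps=5):
    # "Language acquisition": a few gradient steps on one task's data
    for _ in range(steps):
        w -= lr * loss_grad(w, a)
    return w

def meta_learn(meta_iters=2000, meta_lr=0.1, seed=0):
    rng = random.Random(seed)
    w = rng.uniform(-5.0, 5.0)  # arbitrary starting initialization
    for _ in range(meta_iters):
        a = rng.uniform(-2.0, 2.0)      # sample a possible language
        w_adapted = adapt(w, a)         # acquire it from limited data
        w += meta_lr * (w_adapted - w)  # nudge the init toward success
    return w

w0 = meta_learn()
```

Because the slope family is symmetric about zero, the meta-learned initialization settles near the centre of the task distribution, and adapting from it gets closer to a held-out task's solution in the same number of steps than adapting from a distant random initialization. Controlling the sampled task family controls the bias the initialization encodes, which is the framework's central move.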
Related papers
- Modeling rapid language learning by distilling Bayesian priors into
artificial neural networks [18.752638142258668]
We show that learning from limited naturalistic data is possible with an approach that combines the strong inductive biases of a Bayesian model with the flexible representations of a neural network.
The resulting system can learn formal linguistic patterns from a small number of examples.
It can also learn aspects of English syntax from a corpus of natural language.
arXiv Detail & Related papers (2023-05-24T04:11:59Z)
- Injecting structural hints: Using language models to study inductive biases in language learning [40.8902073270634]
We inject inductive bias into language models by pretraining on formally-structured data.
We then evaluate the biased learners' ability to learn typologically-diverse natural languages.
We show that non-context-free relationships form the best inductive biases.
arXiv Detail & Related papers (2023-04-25T18:00:08Z) - Language Models as Inductive Reasoners [125.99461874008703]
We propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts.
We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
We provide the first comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts.
arXiv Detail & Related papers (2022-12-21T11:12:14Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Is neural language acquisition similar to natural? A chronological
probing study [0.0515648410037406]
We present the chronological probing study of transformer English models such as MultiBERT and T5.
We compare the information about the language learned by the models in the process of training on corpora.
The results show that (1) linguistic information is acquired in the early stages of training, and (2) both language models are able to capture features from multiple levels of language.
arXiv Detail & Related papers (2022-07-01T17:24:11Z) - Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z) - Examining the Inductive Bias of Neural Language Models with Artificial
Languages [42.699545862522214]
We propose a novel method for investigating the inductive biases of language models using artificial languages.
This constitutes a fully controlled causal framework, and demonstrates how grammar engineering can serve as a useful tool for analyzing neural models.
arXiv Detail & Related papers (2021-06-02T09:34:32Z) - Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason
Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control.
We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.
Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z) - Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
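The last entry's framing of probing as mutual-information estimation can be made concrete with a toy plug-in estimator. The paper's actual estimator is based on trained probes; the discrete counting version below is only a hedged sketch of the quantity being estimated, with invented example labels.

```python
from collections import Counter
from math import log2

# Plug-in estimate of the mutual information I(Z; Y) between a
# discretised representation feature Z and a linguistic label Y,
# computed from empirical co-occurrence counts. Illustrative only:
# the cited paper estimates this quantity with trained probes.

def mutual_information(pairs):
    n = len(pairs)
    z_counts = Counter(z for z, _ in pairs)
    y_counts = Counter(y for _, y in pairs)
    joint_counts = Counter(pairs)
    mi = 0.0
    for (z, y), c in joint_counts.items():
        # p(z,y) * log2( p(z,y) / (p(z) p(y)) ), in count form
        mi += (c / n) * log2(c * n / (z_counts[z] * y_counts[y]))
    return mi

# A feature that fully determines a binary label carries H(Y) = 1 bit;
# a feature independent of the label carries 0 bits.
informative = [(0, "noun"), (1, "verb")] * 50
useless = [(0, "noun"), (0, "verb"), (1, "noun"), (1, "verb")] * 25
```

Under this view, a probe's job is to lower-bound I(Z; Y): the more information the representation encodes about the linguistic property, the higher the achievable estimate.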
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.