Do Language Models Learn Position-Role Mappings?
- URL: http://arxiv.org/abs/2202.03611v1
- Date: Tue, 8 Feb 2022 02:50:53 GMT
- Title: Do Language Models Learn Position-Role Mappings?
- Authors: Jackson Petty, Michael Wilson, Robert Frank
- Abstract summary: We test whether well-performing pertained language models (BERT, RoBERTa, and DistilBERT) exhibit knowledge of position-role mappings.
In Experiment 1, we show that these neural models do indeed recognize distinctions between theme and recipient roles.
In Experiment 2, we show that fine-tuning these language models on novel theme- and recipient-like tokens in one paradigm allows the models to make correct predictions about their placement in other paradigms.
- Score: 1.4548651568912523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How is knowledge of position-role mappings in natural language learned? We
explore this question in a computational setting, testing whether a variety of
well-performing pertained language models (BERT, RoBERTa, and DistilBERT)
exhibit knowledge of these mappings, and whether this knowledge persists across
alternations in syntactic, structural, and lexical alternations. In Experiment
1, we show that these neural models do indeed recognize distinctions between
theme and recipient roles in ditransitive constructions, and that these
distinct patterns are shared across construction type. We strengthen this
finding in Experiment 2 by showing that fine-tuning these language models on
novel theme- and recipient-like tokens in one paradigm allows the models to
make correct predictions about their placement in other paradigms, suggesting
that the knowledge of these mappings is shared rather than independently
learned. We do, however, observe some limitations of this generalization when
tasks involve constructions with novel ditransitive verbs, hinting at a degree
of lexical specificity which underlies model performance.
Related papers
- Feature Interactions Reveal Linguistic Structure in Language Models [2.0178765779788495]
We study feature interactions in the context of feature attribution methods for post-hoc interpretability.
We work out a grey box methodology, in which we train models to perfection on a formal language classification task.
We show that under specific configurations, some methods are indeed able to uncover the grammatical rules acquired by a model.
arXiv Detail & Related papers (2023-06-21T11:24:41Z) - Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z) - Topics in Contextualised Attention Embeddings [7.6650522284905565]
Recent work has demonstrated that conducting clustering on the word-level contextual representations from a language model emulates word clusters that are discovered in latent topics of words from Latent Dirichlet Allocation.
The important question is how such topical word clusters are automatically formed, through clustering, in the language model when it has not been explicitly designed to model latent topics.
Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters.
arXiv Detail & Related papers (2023-01-11T07:26:19Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Interpreting Language Models with Contrastive Explanations [99.7035899290924]
Language models must consider various features to predict a token, such as its part of speech, number, tense, or semantics.
Existing explanation methods conflate evidence for all these features into a single explanation, which is less interpretable for human understanding.
We show that contrastive explanations are quantifiably better than non-contrastive explanations in verifying major grammatical phenomena.
arXiv Detail & Related papers (2022-02-21T18:32:24Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Schr\"odinger's Tree -- On Syntax and Neural Language Models [10.296219074343785]
Language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities.
We observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form.
We outline the implications of the different types of research questions exhibited in studies on syntax.
arXiv Detail & Related papers (2021-10-17T18:25:23Z) - Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z) - Modelling Compositionality and Structure Dependence in Natural Language [0.12183405753834563]
Drawing on linguistics and set theory, a formalisation of these ideas is presented in the first half of this thesis.
We see how cognitive systems that process language need to have certain functional constraints.
Using the advances of word embedding techniques, a model of relational learning is simulated.
arXiv Detail & Related papers (2020-11-22T17:28:50Z) - Temporal Embeddings and Transformer Models for Narrative Text
Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling.
The temporal evolution of these relations is described by dynamic word embeddings, that are designed to learn semantic changes over time.
A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.