Syntactic Persistence in Language Models: Priming as a Window into
Abstract Language Representations
- URL: http://arxiv.org/abs/2109.14989v1
- Date: Thu, 30 Sep 2021 10:38:38 GMT
- Title: Syntactic Persistence in Language Models: Priming as a Window into
Abstract Language Representations
- Authors: Arabella Sinclair, Jaap Jumelet, Willem Zuidema, Raquel Fernández
- Abstract summary: We investigate the extent to which modern, neural language models are susceptible to syntactic priming.
We introduce a novel metric and release Prime-LM, a large corpus where we control for various linguistic factors which interact with priming strength.
We report surprisingly strong priming effects when priming with multiple sentences, each with different words and meaning but with identical syntactic structure.
- Score: 0.38498574327875945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the extent to which modern, neural language models are
susceptible to syntactic priming, the phenomenon where the syntactic structure
of a sentence makes the same structure more probable in a follow-up sentence.
We explore how priming can be used to study the nature of the syntactic
knowledge acquired by these models. We introduce a novel metric and release
Prime-LM, a large corpus where we control for various linguistic factors which
interact with priming strength. We find that recent large Transformer models
indeed show evidence of syntactic priming, but also that the syntactic
generalisations learned by these models are to some extent modulated by
semantic information. We report surprisingly strong priming effects when
priming with multiple sentences, each with different words and meaning but with
identical syntactic structure. We conclude that the syntactic priming paradigm
is a highly useful, additional tool for gaining insights into the capacities of
language models.
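Concretely, the priming paradigm reduces to a probability comparison over target sentences under different primes. The following is a minimal sketch of how such a priming effect could be measured with an off-the-shelf causal language model; the dative prime/target sentences and the congruent-minus-incongruent log-probability difference are illustrative assumptions, not the paper's exact Prime-LM stimuli or its released metric.

```python
# Minimal sketch: measuring a syntactic priming effect with an off-the-shelf
# causal LM (GPT-2 via the HuggingFace `transformers` library). The sentences
# and the metric below are illustrative, not the paper's Prime-LM materials.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def target_logprob(prime: str, target: str) -> float:
    """Summed log-probability of the target sentence, conditioned on the prime."""
    prime_ids = tokenizer(prime, return_tensors="pt").input_ids
    target_ids = tokenizer(" " + target, return_tensors="pt").input_ids
    input_ids = torch.cat([prime_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position i predict the token at position i + 1.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    next_tokens = input_ids[0, 1:]
    token_lps = log_probs[torch.arange(next_tokens.numel()), next_tokens]
    return token_lps[prime_ids.shape[1] - 1:].sum().item()  # target tokens only

# Dative alternation: prepositional-object (PO) vs. double-object (DO) primes.
prime_po = "The teacher gave a book to the student."
prime_do = "The teacher gave the student a book."
target_po = "The chef handed a plate to the waiter."

# Priming effect: how much a structurally congruent prime (PO) raises the
# probability of a PO target relative to an incongruent (DO) prime.
priming_effect = target_logprob(prime_po, target_po) - target_logprob(prime_do, target_po)
print(f"Priming effect for the PO target: {priming_effect:.3f}")
```

A positive difference indicates that the model assigns higher probability to the target when it follows a prime with the same syntactic structure, which is the behavioural signature of priming the paper probes.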
Related papers
- Analyzing The Language of Visual Tokens [48.62180485759458]
We take a natural-language-centric approach to analyzing discrete visual languages.
We show that higher token innovation drives greater entropy and lower compression, with tokens predominantly representing object parts.
We also show that visual languages lack cohesive grammatical structures, leading to higher perplexity and weaker hierarchical organization compared to natural languages.
arXiv Detail & Related papers (2024-11-07T18:59:28Z) - Reframing linguistic bootstrapping as joint inference using visually-grounded grammar induction models [31.006803764376475]
Semantic and syntactic bootstrapping posit that children use their prior knowledge of one linguistic domain, say syntactic relations, to help later acquire another, such as the meanings of new words.
Here, we argue that they are instead both contingent on a more general learning strategy for language acquisition: joint learning.
Using a series of neural visually-grounded grammar induction models, we demonstrate that both syntactic and semantic bootstrapping effects are strongest when syntax and semantics are learnt simultaneously.
arXiv Detail & Related papers (2024-06-17T18:01:06Z) - Probabilistic Transformer: A Probabilistic Dependency Model for
Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively with transformers on small to medium-sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z) - Large Language Models for Scientific Synthesis, Inference and
Explanation [56.41963802804953]
We show how large language models can perform scientific synthesis, inference, and explanation.
We show that the large language model can augment the knowledge it infers from scientific datasets by synthesizing from the scientific literature.
This approach has the further advantage that the large language model can explain the machine learning system's predictions.
arXiv Detail & Related papers (2023-10-12T02:17:59Z) - Robustness of the Random Language Model [0.0]
The model suggests a simple picture of first language learning as a type of annealing in the vast space of potential languages.
It implies a single continuous transition to grammatical syntax, at which the symmetry among potential words and categories is spontaneously broken.
Results are discussed in light of the theory of first-language acquisition in linguistics and of recent successes in machine learning.
arXiv Detail & Related papers (2023-09-26T13:14:35Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Oracle Linguistic Graphs Complement a Pretrained Transformer Language
Model: A Cross-formalism Comparison [13.31232311913236]
We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling.
We find that, overall, semantic constituency structures are most useful to language modeling performance.
arXiv Detail & Related papers (2021-12-15T04:29:02Z) - Causal Analysis of Syntactic Agreement Mechanisms in Neural Language
Models [40.83377935276978]
This study applies causal mediation analysis to pre-trained neural language models.
We investigate the magnitude of models' preferences for grammatical inflections.
We observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure.
arXiv Detail & Related papers (2021-06-10T23:50:51Z) - Structural Supervision Improves Few-Shot Learning and Syntactic
Generalization in Neural Language Models [47.42249565529833]
Humans can learn structural properties about a word from minimal experience.
We assess the ability of modern neural language models to reproduce this behavior in English.
arXiv Detail & Related papers (2020-10-12T14:12:37Z) - Exploiting Syntactic Structure for Better Language Modeling: A Syntactic
Distance Approach [78.77265671634454]
We make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground-truth parse trees in a form called "syntactic distances" (a toy sketch of such a combined objective follows this list).
Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.
arXiv Detail & Related papers (2020-05-12T15:35:00Z)
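The multi-task idea in the last entry lends itself to a compact illustration. The sketch below, assuming a simple recurrent language model with two output heads, shows one way word prediction and syntactic-distance regression could be trained jointly; the module names, shapes, and loss weighting are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a multi-task objective in the spirit of the syntactic
# distance approach: shared hidden states feed both a next-word head and a head
# that regresses a per-position "syntactic distance" derived from the gold
# parse tree. All names and the loss weighting are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistanceAugmentedLM(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.lm_head = nn.Linear(hidden, vocab_size)  # next-word prediction
        self.dist_head = nn.Linear(hidden, 1)         # syntactic-distance regression

    def forward(self, tokens):                         # tokens: (batch, seq_len)
        states, _ = self.rnn(self.embed(tokens))
        return self.lm_head(states), self.dist_head(states).squeeze(-1)

def multitask_loss(model, tokens, next_tokens, gold_distances, alpha=0.5):
    """Language-modeling loss plus a weighted syntactic-distance loss."""
    logits, pred_distances = model(tokens)
    lm_loss = F.cross_entropy(logits.flatten(0, 1), next_tokens.flatten())
    dist_loss = F.mse_loss(pred_distances, gold_distances)
    return lm_loss + alpha * dist_loss
```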
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.