Structural Priming Demonstrates Abstract Grammatical Representations in
Multilingual Language Models
- URL: http://arxiv.org/abs/2311.09194v1
- Date: Wed, 15 Nov 2023 18:39:56 GMT
- Title: Structural Priming Demonstrates Abstract Grammatical Representations in
Multilingual Language Models
- Authors: James A. Michaelov, Catherine Arnett, Tyler A. Chang, Benjamin K.
Bergen
- Abstract summary: We find evidence for abstract monolingual and crosslingual grammatical representations in large language models.
Results demonstrate that grammatical representations in multilingual language models are not only similar across languages, but they can causally influence text produced in different languages.
- Score: 6.845954748361076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abstract grammatical knowledge - of parts of speech and grammatical patterns - is key to the capacity for linguistic generalization in humans. But how
abstract is grammatical knowledge in large language models? In the human
literature, compelling evidence for grammatical abstraction comes from
structural priming. A sentence that shares the same grammatical structure as a
preceding sentence is processed and produced more readily. Because confounds
exist when using stimuli in a single language, evidence of abstraction is even
more compelling from crosslingual structural priming, where use of a syntactic
structure in one language primes an analogous structure in another language. We
measure crosslingual structural priming in large language models, comparing
model behavior to human experimental results from eight crosslingual
experiments covering six languages, and four monolingual structural priming
experiments in three non-English languages. We find evidence for abstract
monolingual and crosslingual grammatical representations in the models that
function similarly to those found in humans. These results demonstrate that
grammatical representations in multilingual language models are not only
similar across languages, but they can causally influence text produced in
different languages.
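To make the measurement concrete: under the usual formulation, a language model shows a structural priming effect if a target sentence with a given structure (for example, a prepositional-object dative) receives higher probability after a prime sentence with the same structure than after a prime with the alternative structure. The sketch below is a minimal illustration of that comparison, assuming the Hugging Face transformers library, a multilingual causal model such as facebook/xglm-564M, and made-up Dutch prime / English target sentences rather than the paper's actual stimuli.

```python
# Minimal sketch of a crosslingual structural priming measurement.
# Assumptions (illustrative, not the paper's setup): the model name
# "facebook/xglm-564M" and the Dutch/English dative sentences below.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/xglm-564M"  # any multilingual causal LM could be substituted
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def target_logprob(prime: str, target: str) -> float:
    """Sum of log-probabilities of the target's tokens, conditioned on the prime."""
    prime_len = tokenizer(prime, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prime + " " + target, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    total = 0.0
    # Score only the target tokens (those after the prime).
    for pos in range(prime_len, full_ids.shape[1]):
        # log P(token at pos | all preceding tokens)
        total += log_probs[0, pos - 1, full_ids[0, pos]].item()
    return total

# Illustrative Dutch primes (prepositional-object vs. double-object dative)
# and an English prepositional-object target.
po_prime = "De kok geeft een boek aan de leraar."  # PO structure
do_prime = "De kok geeft de leraar een boek."      # DO structure
po_target = "The chef shows a hat to the swimmer."  # PO structure

# A priming effect appears if the PO target is more likely after the PO prime
# than after the DO prime.
effect = target_logprob(po_prime, po_target) - target_logprob(do_prime, po_target)
print(f"PO-target log-probability advantage after PO vs. DO prime: {effect:.3f}")
```

Averaging such differences over many prime-target pairs, and contrasting structurally congruent with incongruent primes, yields the kind of priming effect that can be compared against human experimental results.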
Related papers
- Analyzing The Language of Visual Tokens [48.62180485759458]
We take a natural-language-centric approach to analyzing discrete visual languages.
We show that higher token innovation drives greater entropy and lower compression, with tokens predominantly representing object parts.
We also show that visual languages lack cohesive grammatical structures, leading to higher perplexity and weaker hierarchical organization compared to natural languages.
arXiv Detail & Related papers (2024-11-07T18:59:28Z)
- Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement [1.4335183427838039]
Our approach is to develop curated synthetic data at large scale, with specific properties.
We use a new multiple-choice task and datasets, Blackbird Language Matrices, to focus on a specific grammatical structural phenomenon.
We show that despite having been trained on multilingual texts in a consistent manner, multilingual pretrained language models have language-specific differences.
arXiv Detail & Related papers (2024-09-10T14:58:55Z)
- The Role of Language Imbalance in Cross-lingual Generalisation: Insights from Cloned Language Experiments [57.273662221547056]
In this study, we investigate a counterintuitive, novel driver of cross-lingual generalisation: language imbalance.
We observe that the presence of a predominant language during training boosts the performance of less frequent languages.
When we extend our analysis to real languages, we find that infrequent languages still benefit from frequent ones, but whether language imbalance causes cross-lingual generalisation in that setting remains inconclusive.
arXiv Detail & Related papers (2024-04-11T17:58:05Z)
- Crosslingual Structural Priming and the Pre-Training Dynamics of Bilingual Language Models [6.845954748361076]
We use structural priming to test for abstract grammatical representations with causal effects on model outputs.
We extend the approach to a Dutch-English bilingual setting, and we evaluate a Dutch-English language model during pre-training.
We find that crosslingual structural priming effects emerge early after exposure to the second language, with less than 1M tokens of data in that language.
arXiv Detail & Related papers (2023-10-11T22:57:03Z)
- Multilingual Multi-Figurative Language Detection [14.799109368073548]
Figurative language understanding is highly understudied in a multilingual setting.
We introduce multilingual multi-figurative language modelling, and provide a benchmark for sentence-level figurative language detection.
We develop a framework for figurative language detection based on template-based prompt learning.
arXiv Detail & Related papers (2023-05-31T18:52:41Z)
- Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models [84.86942006830772]
We conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar.
We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe.
arXiv Detail & Related papers (2022-05-04T12:22:31Z)
- Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure [54.01613740115601]
We study three language properties: constituent order, composition and word co-occurrence.
Our main conclusion is that the contributions of constituent order and word co-occurrence are limited, while composition is more crucial to the success of cross-lingual transfer.
arXiv Detail & Related papers (2022-03-16T07:09:35Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representations from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models [96.32118305166412]
We study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual downstream tasks.
We find that languages which are adequately represented in the multilingual model's vocabulary exhibit negligible performance decreases over their monolingual counterparts.
arXiv Detail & Related papers (2020-12-31T14:11:00Z)
- Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models [27.91397366776451]
Training LSTMs on latent structure (MIDI music or Java code) improves test performance on natural language.
Experiments on transfer between natural languages controlling for vocabulary overlap show that zero-shot performance on a test language is highly correlated with typological similarity to the training language.
arXiv Detail & Related papers (2020-04-30T06:24:03Z)
- An Empirical Study of Factors Affecting Language-Independent Models [11.976665726887733]
We show that language-independent models can be comparable to, or even outperform, models trained on monolingual data.
We experiment with language-independent models across many different languages and show that they are more suitable for typologically similar languages.
arXiv Detail & Related papers (2019-12-30T22:41:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.