Cross-Linguistic Syntactic Evaluation of Word Prediction Models
- URL: http://arxiv.org/abs/2005.00187v2
- Date: Thu, 21 May 2020 14:19:52 GMT
- Title: Cross-Linguistic Syntactic Evaluation of Word Prediction Models
- Authors: Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia
Talmina, Tal Linzen
- Abstract summary: We investigate how neural word prediction models' ability to learn syntax varies by language.
CLAMS includes subject-verb agreement challenge sets for English, French, German, Hebrew and Russian.
We use CLAMS to evaluate LSTM language models as well as monolingual and multilingual BERT.
- Score: 25.39896327641704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A range of studies have concluded that neural word prediction models can
distinguish grammatical from ungrammatical sentences with high accuracy.
However, these studies are based primarily on monolingual evidence from
English. To investigate how these models' ability to learn syntax varies by
language, we introduce CLAMS (Cross-Linguistic Assessment of Models on Syntax),
a syntactic evaluation suite for monolingual and multilingual models. CLAMS
includes subject-verb agreement challenge sets for English, French, German,
Hebrew and Russian, generated from grammars we develop. We use CLAMS to
evaluate LSTM language models as well as monolingual and multilingual BERT.
Across languages, monolingual LSTMs achieved high accuracy on dependencies
without attractors, and generally poor accuracy on agreement across object
relative clauses. On other constructions, agreement accuracy was generally
higher in languages with richer morphology. Multilingual models generally
underperformed monolingual models. Multilingual BERT showed high syntactic
accuracy on English, but noticeable deficiencies in other languages.
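To make the evaluation setup concrete, the sketch below shows one common way to score a subject-verb agreement minimal pair of the kind CLAMS contains: a language model assigns a probability to the grammatical and the ungrammatical variant of a sentence, and it counts as correct if it prefers the grammatical one. This is an illustrative approximation rather than the CLAMS code; the Hugging Face transformers API, the GPT-2 checkpoint, and the example pair are placeholders (the paper itself evaluates LSTM language models and monolingual/multilingual BERT).

    # Illustrative sketch only (not from the CLAMS release): score a
    # subject-verb agreement minimal pair with an off-the-shelf causal LM
    # and check whether the grammatical variant gets the higher probability.
    # The model checkpoint and the example pair are placeholders.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def sentence_logprob(sentence: str) -> float:
        """Sum of log P(token | left context) over the sentence."""
        ids = tokenizer(sentence, return_tensors="pt")["input_ids"]
        with torch.no_grad():
            logits = model(ids).logits
        # Log-probability of each token given its left context.
        log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
        targets = ids[:, 1:]
        return log_probs.gather(2, targets.unsqueeze(-1)).sum().item()

    # One hypothetical minimal pair with an intervening attractor noun.
    grammatical = "The authors near the senator are happy."
    ungrammatical = "The authors near the senator is happy."

    print("prefers grammatical:",
          sentence_logprob(grammatical) > sentence_logprob(ungrammatical))

Under this kind of setup, accuracy is the fraction of minimal pairs for which the model prefers the grammatical member, reported per construction (e.g., agreement across an object relative clause) and per language.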
Related papers
- Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly created English and multilingual prompts.
We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.
We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z)
- Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models [2.6626950367610402]
We study the cross-lingual consistency (CLC) of factual knowledge in various multilingual PLMs.
We propose a Ranking-based Consistency (RankC) metric to evaluate knowledge consistency across languages independently of accuracy.
arXiv Detail & Related papers (2023-10-16T13:19:17Z)
- Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models [28.036233760742125]
We causally probe multilingual language models (XGLM and multilingual BERT) across various languages.
We find significant neuron overlap across languages in autoregressive multilingual language models, but not masked language models.
arXiv Detail & Related papers (2022-10-25T20:43:36Z)
- Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models [23.62852626011989]
We show that grammatical structures in higher-resource languages bleed into lower-resource languages.
We show this bias via a novel method for comparing the fluency of multilingual models to the fluency of monolingual Spanish and Greek models.
arXiv Detail & Related papers (2022-10-11T17:06:38Z)
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
arXiv Detail & Related papers (2022-05-24T03:35:00Z)
- Factual Consistency of Multilingual Pretrained Language Models [0.0]
We investigate whether multilingual language models are more consistent than their monolingual counterparts.
We find that mBERT is as inconsistent as English BERT in English paraphrases.
Both mBERT and XLM-R exhibit a high degree of inconsistency in English and even more so for all the other 45 languages.
arXiv Detail & Related papers (2022-03-22T09:15:53Z)
- Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM that operates on linguistic units such as syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
- Language Models are Few-shot Multilingual Learners [66.11011385895195]
We evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages.
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones.
arXiv Detail & Related papers (2021-09-16T03:08:22Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representations from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models [96.32118305166412]
We study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual downstream tasks.
We find that languages which are adequately represented in the multilingual model's vocabulary exhibit negligible performance decreases over their monolingual counterparts.
arXiv Detail & Related papers (2020-12-31T14:11:00Z)
- Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment [17.995905582226463]
We compare model performance in English and Spanish to show that non-linguistic biases in RNN LMs advantageously overlap with syntactic structure in English but not Spanish.
English models may appear to acquire human-like syntactic preferences, while models trained on Spanish fail to acquire comparable human-like preferences.
arXiv Detail & Related papers (2020-05-01T01:21:47Z)