Assessing the Syntactic Capabilities of Transformer-based Multilingual
Language Models
- URL: http://arxiv.org/abs/2105.04688v1
- Date: Mon, 10 May 2021 22:06:53 GMT
- Title: Assessing the Syntactic Capabilities of Transformer-based Multilingual
Language Models
- Authors: Laura Pérez-Mayos, Alba Táboas García, Simon Mille, Leo Wanner
- Abstract summary: We explore the syntactic generalization capabilities of the monolingual and multilingual versions of BERT and RoBERTa.
More specifically, we evaluate the syntactic generalization potential of the models on English and Spanish tests.
For English, we use the available SyntaxGym test suite; for Spanish, we introduce SyntaxGymES, a novel ensemble of targeted syntactic tests in Spanish.
- Score: 4.590895888766304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual Transformer-based language models, usually pretrained on more
than 100 languages, have been shown to achieve outstanding results in a wide
range of cross-lingual transfer tasks. However, it remains unknown whether the
optimization for different languages conditions the capacity of the models to
generalize over syntactic structures, and how languages with syntactic
phenomena of different complexity are affected. In this work, we explore the
syntactic generalization capabilities of the monolingual and multilingual
versions of BERT and RoBERTa. More specifically, we evaluate the syntactic
generalization potential of the models on English and Spanish tests, comparing
the syntactic abilities of monolingual and multilingual models on the same
language (English), and of multilingual models on two different languages
(English and Spanish). For English, we use the available SyntaxGym test suite;
for Spanish, we introduce SyntaxGymES, a novel ensemble of targeted syntactic
tests in Spanish, designed to evaluate the syntactic generalization
capabilities of language models through the SyntaxGym online platform.
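As a rough illustration of the kind of targeted syntactic evaluation such test suites perform, the sketch below scores a grammatical and an ungrammatical variant of a sentence with a multilingual masked language model and checks that the grammatical one is preferred. This is a minimal sketch, not the authors' evaluation code: the checkpoint name, the Spanish minimal pair, and the use of pseudo-log-likelihood scoring (SyntaxGym proper compares surprisal at critical regions) are assumptions made here for illustration.

```python
# Minimal sketch (not the authors' code): compare a grammatical vs. an
# ungrammatical sentence with a multilingual masked LM via
# pseudo-log-likelihood (mask each token in turn and sum its log-probability).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "bert-base-multilingual-cased"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL).eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of log P(token | rest of sentence), masking one position at a time."""
    enc = tokenizer(sentence, return_tensors="pt")
    input_ids = enc["input_ids"][0]
    total = 0.0
    # Skip the [CLS] (first) and [SEP] (last) special tokens.
    for i in range(1, input_ids.size(0) - 1):
        masked = input_ids.clone().unsqueeze(0)
        masked[0, i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked,
                           attention_mask=enc["attention_mask"]).logits
        log_probs = torch.log_softmax(logits[0, i], dim=-1)
        total += log_probs[input_ids[i]].item()
    return total

# Hypothetical Spanish subject-verb agreement minimal pair (plural subject,
# singular attractor "del estudio").
good = "Los autores del estudio presentan sus resultados."
bad = "Los autores del estudio presenta sus resultados."
# A model with good agreement knowledge should prefer the grammatical variant.
print(pseudo_log_likelihood(good) > pseudo_log_likelihood(bad))
```

Test suites of this kind typically aggregate many such items per syntactic phenomenon and report the proportion of items for which the expected preference holds.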
Related papers
- Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement [1.4335183427838039]
We take the approach of developing curated synthetic data on a large scale, with controlled properties.
We use a new multiple-choice task and datasets, Blackbird Language Matrices, to focus on a specific grammatical structural phenomenon.
We show that despite having been trained on multilingual texts in a consistent manner, multilingual pretrained language models have language-specific differences.
arXiv Detail & Related papers (2024-09-10T14:58:55Z)
- Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models [23.62852626011989]
We show that grammatical structures in higher-resource languages bleed into lower-resource languages.
We show this bias via a novel method for comparing the fluency of multilingual models to the fluency of monolingual Spanish and Greek models.
arXiv Detail & Related papers (2022-10-11T17:06:38Z)
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
arXiv Detail & Related papers (2022-05-24T03:35:00Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- Specializing Multilingual Language Models: An Empirical Study [50.7526245872855]
Contextualized word representations from pretrained multilingual language models have become the de facto standard for addressing natural language tasks.
For languages rarely or never seen by these models, directly using such models often results in suboptimal representations or poor use of the available data.
arXiv Detail & Related papers (2021-06-16T18:13:55Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages; a toy sketch of such a cross-attention block appears after this list.
This helps the model avoid degenerating into predicting masked words conditioned only on context from the same language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- Cross-Linguistic Syntactic Evaluation of Word Prediction Models [25.39896327641704]
We investigate how neural word prediction models' ability to learn syntax varies by language.
CLAMS includes subject-verb agreement challenge sets for English, French, German, Hebrew and Russian.
We use CLAMS to evaluate LSTM language models as well as monolingual and multilingual BERT.
arXiv Detail & Related papers (2020-05-01T02:51:20Z)
- Linguistic Typology Features from Text: Inferring the Sparse Features of World Atlas of Language Structures [73.06435180872293]
We construct a recurrent neural network predictor based on byte embeddings and convolutional layers.
We show that some features from various linguistic types can be predicted reliably.
arXiv Detail & Related papers (2020-04-30T21:00:53Z)
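As referenced in the VECO entry above, a cross-attention module lets token representations of one language attend over the encoder states of another. The snippet below is a toy sketch under assumptions made here (plain PyTorch, standard multi-head attention with a residual connection and layer normalization); it is not VECO's actual module or configuration.

```python
# Toy sketch (not the VECO implementation): a generic cross-attention block
# in which hidden states for language X attend over encoder states for
# language Y; dimensions are illustrative.
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 12):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x_states: torch.Tensor, y_states: torch.Tensor) -> torch.Tensor:
        # Queries come from language X; keys and values from language Y.
        attended, _ = self.cross_attn(query=x_states, key=y_states, value=y_states)
        # Residual connection followed by layer normalization.
        return self.norm(x_states + attended)

# Usage with random hidden states standing in for two parallel sentences.
x = torch.randn(2, 10, 768)  # batch of 2, 10 tokens, hidden size 768
y = torch.randn(2, 12, 768)  # the parallel sentence may have a different length
out = CrossAttentionBlock()(x, y)
print(out.shape)  # torch.Size([2, 10, 768])
```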
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.