On the ability of monolingual models to learn language-agnostic
representations
- URL: http://arxiv.org/abs/2109.01942v1
- Date: Sat, 4 Sep 2021 22:09:44 GMT
- Title: On the ability of monolingual models to learn language-agnostic
representations
- Authors: Leandro Rodrigues de Souza, Rodrigo Nogueira, Roberto Lotufo
- Abstract summary: We show that monolingual models pretrained and finetuned on different languages achieve competitive performance.
For example, models pretrained on distant languages such as German and Portuguese perform similarly on English tasks.
- Score: 2.604227467422371
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pretrained multilingual models have become a de facto default approach for
zero-shot cross-lingual transfer. Previous work has shown that these models are
able to achieve cross-lingual representations when pretrained on two or more
languages with shared parameters. In this work, we provide evidence that a
model can achieve language-agnostic representations even when pretrained on a
single language. That is, we find that monolingual models pretrained and
finetuned on different languages achieve competitive performance compared to
the ones that use the same target language. Surprisingly, the models show a
similar performance on a same task regardless of the pretraining language. For
example, models pretrained on distant languages such as German and Portuguese
perform similarly on English tasks.
Related papers
- The Less the Merrier? Investigating Language Representation in
Multilingual Models [8.632506864465501]
We investigate the linguistic representation of different languages in multilingual models.
We observe from our experiments that community-centered models perform better at distinguishing between languages in the same family for low-resource languages.
arXiv Detail & Related papers (2023-10-20T02:26:34Z) - Learning Cross-lingual Visual Speech Representations [108.68531445641769]
Cross-lingual self-supervised visual representation learning has been a growing research topic in the last few years.
We use the recently-proposed Raw Audio-Visual Speechs (RAVEn) framework to pre-train an audio-visual model with unlabelled data.
Our experiments show that: (1) multi-lingual models with more data outperform monolingual ones, but, when keeping the amount of data fixed, monolingual models tend to reach better performance.
arXiv Detail & Related papers (2023-03-14T17:05:08Z) - Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of
Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
arXiv Detail & Related papers (2022-05-24T03:35:00Z) - Lifting the Curse of Multilinguality by Pre-training Modular
Transformers [72.46919537293068]
multilingual pre-trained models suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.
We introduce language-specific modules, which allows us to grow the total capacity of the model, while keeping the total number of trainable parameters per language constant.
Our approach enables adding languages post-hoc with no measurable drop in performance, no longer limiting the model usage to the set of pre-trained languages.
arXiv Detail & Related papers (2022-05-12T17:59:56Z) - Language Models are Few-shot Multilingual Learners [66.11011385895195]
We evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages.
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones.
arXiv Detail & Related papers (2021-09-16T03:08:22Z) - Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z) - How Good is Your Tokenizer? On the Monolingual Performance of
Multilingual Language Models [96.32118305166412]
We study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual downstream tasks.
We find that languages which are adequately represented in the multilingual model's vocabulary exhibit negligible performance decreases over their monolingual counterparts.
arXiv Detail & Related papers (2020-12-31T14:11:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.