XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
- URL: http://arxiv.org/abs/2106.16138v1
- Date: Wed, 30 Jun 2021 15:45:07 GMT
- Title: XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
- Authors: Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Saksham Singhal, Payal
Bajaj, Xia Song, Furu Wei
- Abstract summary: We pretrain the model, named XLM-E, on both multilingual and parallel corpora.
Our model outperforms the baseline models on various cross-lingual understanding tasks at much lower computation cost.
- Score: 46.80613153602189
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce ELECTRA-style tasks to cross-lingual language
model pre-training. Specifically, we present two pre-training tasks, namely
multilingual replaced token detection and translation replaced token
detection. In addition, we pretrain the model, named XLM-E, on both multilingual
and parallel corpora. Our model outperforms the baseline models on various
cross-lingual understanding tasks with much less computation cost. Moreover,
analysis shows that XLM-E tends to obtain better cross-lingual transferability.
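The two pre-training tasks follow the ELECTRA recipe: a small generator fills in masked positions, and a discriminator predicts, for every position, whether the token was replaced; multilingual replaced token detection applies this to monolingual text in many languages, while translation replaced token detection applies it to a concatenated translation pair. The snippet below is a minimal, hypothetical PyTorch sketch of a replaced-token-detection loss, not the authors' implementation; the generator/discriminator call signatures, the 15% corruption rate, and the masking scheme are assumptions.

```python
# Minimal sketch of an ELECTRA-style replaced token detection (RTD) loss.
# Illustration only: the generator/discriminator interfaces and the
# corruption rate are assumptions, not XLM-E's actual code.
import torch
import torch.nn.functional as F

def rtd_loss(generator, discriminator, input_ids, attention_mask,
             mask_token_id, mask_prob=0.15):
    # 1) Corrupt the input: mask a random subset of non-padding positions.
    masked = torch.rand(input_ids.shape, device=input_ids.device) < mask_prob
    masked &= attention_mask.bool()
    gen_input = input_ids.masked_fill(masked, mask_token_id)

    # 2) The generator proposes replacements for the masked positions
    #    (sampled rather than argmax, as in ELECTRA).
    with torch.no_grad():
        gen_logits = generator(gen_input, attention_mask)    # (B, T, V)
        sampled = torch.distributions.Categorical(logits=gen_logits).sample()
    corrupted = torch.where(masked, sampled, input_ids)

    # 3) The discriminator predicts, per position, whether the token differs
    #    from the original; padding positions are zero-weighted.
    disc_logits = discriminator(corrupted, attention_mask)   # (B, T)
    is_replaced = (corrupted != input_ids).float()
    return F.binary_cross_entropy_with_logits(
        disc_logits, is_replaced, weight=attention_mask.float())

# Multilingual RTD: input_ids come from monolingual text in many languages.
# Translation RTD: input_ids hold a translation pair concatenated into one
# sequence, so the discriminator can use the other language as context.
```

Only the discriminator loss is sketched here; a full ELECTRA-style setup also trains the generator with a masked-language-modeling loss.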
Related papers
- Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models [110.10545153845051]
Cross-lingual Expert Language Models (X-ELM) are produced by specializing separate models to different languages while keeping them effective as a multilingual ensemble.
X-ELM provides benefits beyond raw performance improvements: new experts can be added iteratively, adapting X-ELM to new languages without catastrophic forgetting.
arXiv Detail & Related papers (2024-01-19T01:07:50Z) - KBioXLM: A Knowledge-anchored Biomedical Multilingual Pretrained
Language Model [37.69464822182714]
Most biomedical pretrained language models are monolingual and cannot handle the growing cross-lingual requirements.
We propose KBioXLM, a model that adapts the multilingual pretrained model XLM-R to the biomedical domain using a knowledge-anchored approach.
arXiv Detail & Related papers (2023-11-20T07:02:35Z) - VECO 2.0: Cross-lingual Language Model Pre-training with
Multi-granularity Contrastive Learning [56.47303426167584]
We propose VECO 2.0, a cross-lingual pre-trained model based on contrastive learning with multi-granularity alignments.
Specifically, sequence-to-sequence alignment is induced to maximize the similarity of parallel pairs and minimize that of non-parallel pairs.
Token-to-token alignment is integrated to bridge the gap between synonymous tokens, mined via a thesaurus dictionary, and the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z) - Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of
Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
arXiv Detail & Related papers (2022-05-24T03:35:00Z) - Cross-lingual Adaption Model-Agnostic Meta-Learning for Natural Language
Understanding [24.66203356497508]
We propose XLA-MAML, which performs direct cross-lingual adaptation in the meta-learning stage.
We conduct zero-shot and few-shot experiments on Natural Language Inference and Question Answering.
arXiv Detail & Related papers (2021-11-10T16:53:50Z) - XLM-K: Improving Cross-Lingual Language Model Pre-Training with
Multilingual Knowledge [31.765178013933134]
Cross-lingual pre-training has achieved great success using monolingual and bilingual plain-text corpora.
We propose XLM-K, a cross-lingual language model incorporating multilingual knowledge in pre-training.
arXiv Detail & Related papers (2021-09-26T11:46:20Z) - X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural
Language Understanding and Question Answering [55.57776147848929]
We propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for Natural Language Understanding (NLU).
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.
We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.
arXiv Detail & Related papers (2021-04-20T00:13:35Z) - InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language
Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning; a generic sketch of such a contrastive objective follows this list.
By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models.
arXiv Detail & Related papers (2020-07-15T16:58:01Z)
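Two of the entries above (InfoXLM and VECO 2.0) rely on contrastive objectives over parallel data to align representations across languages. As a generic illustration only, and not either paper's exact formulation, the sketch below shows an InfoNCE-style sequence-level contrastive loss over pooled sentence embeddings; the pooling, temperature, and use of in-batch negatives are assumptions.

```python
# Generic InfoNCE-style contrastive loss over parallel sentence embeddings.
# Illustration only: pooling, temperature, and in-batch negatives are assumed
# and do not reproduce InfoXLM's or VECO 2.0's exact objectives.
import torch
import torch.nn.functional as F

def parallel_contrastive_loss(src_emb, tgt_emb, temperature=0.05):
    # src_emb, tgt_emb: (B, D) pooled encoder outputs, where row i of each
    # tensor comes from the two sides of the same translation pair.
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.t() / temperature            # (B, B) scaled cosine sims
    labels = torch.arange(src.size(0), device=src.device)
    # Each sentence should be closest to its own translation; every other
    # sentence in the batch serves as an in-batch negative, symmetrically.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```

Minimizing this loss pulls translation pairs together and pushes non-parallel sentences apart, which is the sequence-level alignment both summaries describe.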
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.