XLM-K: Improving Cross-Lingual Language Model Pre-Training with
Multilingual Knowledge
- URL: http://arxiv.org/abs/2109.12573v1
- Date: Sun, 26 Sep 2021 11:46:20 GMT
- Title: XLM-K: Improving Cross-Lingual Language Model Pre-Training with
Multilingual Knowledge
- Authors: Xiaoze Jiang, Yaobo Liang, Weizhu Chen, Nan Duan
- Abstract summary: Cross-lingual pre-training has achieved great successes using monolingual and bilingual plain text corpora.
We propose XLM-K, a cross-lingual language model incorporating multilingual knowledge in pre-training.
- Score: 31.765178013933134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-lingual pre-training has achieved great successes using monolingual and
bilingual plain text corpora. However, existing pre-trained models neglect
multilingual knowledge, which is language-agnostic but contains abundant
cross-lingual structural alignment. In this paper, we propose XLM-K, a
cross-lingual language model incorporating multilingual knowledge in
pre-training. XLM-K augments existing multilingual pre-training with two
knowledge tasks, namely Masked Entity Prediction Task and Object Entailment
Task. We evaluate XLM-K on MLQA, NER and XNLI. Experimental results clearly
demonstrate significant improvements over existing multilingual language
models. The results on MLQA and NER exhibit the superiority of XLM-K in
knowledge-related tasks. The success on XNLI shows the better cross-lingual
transferability obtained by XLM-K. Moreover, we provide a detailed probing
analysis to confirm the desired knowledge captured in our pre-training regimen.
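The abstract describes the two knowledge objectives only at a high level. Below is a minimal, hypothetical sketch of how such auxiliary heads could be attached to a shared encoder during pre-training; the hidden size, entity-vocabulary size, binary entailment formulation, and equal loss weighting are illustrative assumptions, not details taken from the paper.
```python
# Hypothetical sketch of two auxiliary knowledge heads over a shared encoder.
# All sizes, the binary entailment formulation, and the equal loss weighting
# are illustrative assumptions; they are not the configuration used in XLM-K.
import torch
import torch.nn as nn
import torch.nn.functional as F

HIDDEN, ENTITY_VOCAB, NUM_ENTAILMENT_LABELS = 768, 50_000, 2

class KnowledgeHeads(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # Masked Entity Prediction: predict the entity id behind a masked mention.
        self.entity_head = nn.Linear(HIDDEN, ENTITY_VOCAB)
        # Object Entailment (assumed here as binary classification over a
        # subject/object description pair).
        self.entailment_head = nn.Linear(2 * HIDDEN, NUM_ENTAILMENT_LABELS)

    def forward(self, mention_states, entity_ids, subj_pooled, obj_pooled, labels):
        mep_loss = F.cross_entropy(self.entity_head(mention_states), entity_ids)
        pair = torch.cat([subj_pooled, obj_pooled], dim=-1)
        oet_loss = F.cross_entropy(self.entailment_head(pair), labels)
        # During pre-training these would be added to the usual MLM loss;
        # equal weighting here is an assumption.
        return mep_loss + oet_loss

# Toy usage with random tensors standing in for encoder outputs.
heads = KnowledgeHeads()
loss = heads(
    mention_states=torch.randn(4, HIDDEN),
    entity_ids=torch.randint(0, ENTITY_VOCAB, (4,)),
    subj_pooled=torch.randn(4, HIDDEN),
    obj_pooled=torch.randn(4, HIDDEN),
    labels=torch.randint(0, NUM_ENTAILMENT_LABELS, (4,)),
)
loss.backward()
```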
Related papers
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
- Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models [110.10545153845051]
Cross-lingual Expert Language Models (X-ELMs) are specialized to different languages while remaining effective as a multilingual ensemble.
X-ELM provides benefits beyond performance improvements: new experts can be iteratively added, adapting X-ELM to new languages without catastrophic forgetting.
arXiv Detail & Related papers (2024-01-19T01:07:50Z)
- KBioXLM: A Knowledge-anchored Biomedical Multilingual Pretrained Language Model [37.69464822182714]
Most biomedical pretrained language models are monolingual and cannot handle the growing cross-lingual requirements.
We propose a model called KBioXLM, which transforms the multilingual pretrained model XLM-R into the biomedical domain using a knowledge-anchored approach.
arXiv Detail & Related papers (2023-11-20T07:02:35Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models [100.29953199404905]
We introduce a new approach for scaling to very large multilingual vocabularies by de-emphasizing token sharing between languages with little lexical overlap.
We train XLM-V, a multilingual language model with a one million token vocabulary.
XLM-V is particularly effective on low-resource language tasks and outperforms XLM-R by 11.2% and 5.8% absolute on MasakhaNER and Americas NLI, respectively.
arXiv Detail & Related papers (2023-01-25T09:15:17Z)
- A Primer on Pretrained Multilingual Language Models [18.943173499882885]
Multilingual Language Models (MLLMs) have emerged as a viable option for bringing the power of pretraining to a large number of languages.
We review the existing literature on the broad areas of research pertaining to MLLMs.
arXiv Detail & Related papers (2021-07-01T18:01:46Z)
- XLM-E: Cross-lingual Language Model Pre-training via ELECTRA [46.80613153602189]
We pretrain the model, named XLM-E, on both multilingual and parallel corpora.
Our model outperforms the baseline models on various cross-lingual understanding tasks with much less cost.
arXiv Detail & Related papers (2021-06-30T15:45:07Z)
- X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering [55.57776147848929]
We propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for Natural Language Understanding (NLU).
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.
We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.
arXiv Detail & Related papers (2021-04-20T00:13:35Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for the translated text in the target language (a minimal sketch of such a loss appears after this list).
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
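The FILTER entry above mentions a KL-divergence self-teaching loss computed against auto-generated soft pseudo-labels. The snippet below is a minimal, hypothetical sketch of such a loss; the teacher/student split, the temperature, and the tensor shapes are assumptions for illustration, not FILTER's exact training recipe.
```python
# Minimal sketch of a KL-divergence self-teaching loss: a frozen "teacher" pass
# produces soft pseudo-labels (here, for translated target-language text), and
# the student is trained to match them. Shapes and temperature are assumptions.
import torch
import torch.nn.functional as F

def self_teaching_kl_loss(student_logits, teacher_logits, temperature=1.0):
    # Soft pseudo-labels from the teacher; no gradient flows into the teacher.
    soft_targets = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL(soft_targets || student); "batchmean" averages over the batch dimension.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean")

# Toy usage: logits for a 3-way classification task on a batch of 4 examples.
student = torch.randn(4, 3, requires_grad=True)
teacher = torch.randn(4, 3)
loss = self_teaching_kl_loss(student, teacher)
loss.backward()
```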
This list is automatically generated from the titles and abstracts of the papers on this site.