Improving the Lexical Ability of Pretrained Language Models for
Unsupervised Neural Machine Translation
- URL: http://arxiv.org/abs/2103.10531v1
- Date: Thu, 18 Mar 2021 21:17:58 GMT
- Title: Improving the Lexical Ability of Pretrained Language Models for
Unsupervised Neural Machine Translation
- Authors: Alexandra Chronopoulou, Dario Stojanovski and Alexander Fraser
- Abstract summary: Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that pretraining performs poorly on low-resource, distant language pairs because these representations are not sufficiently aligned.
In this paper, we enhance bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
- Score: 127.81351683335143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Successful methods for unsupervised neural machine translation (UNMT) employ
cross-lingual pretraining via self-supervision, often in the form of a masked
language modeling or a sequence generation task, which requires the model to
align the lexical- and high-level representations of the two languages. While
cross-lingual pretraining works for similar languages with abundant corpora, it
performs poorly in low-resource, distant languages. Previous research has shown
that this is because the representations are not sufficiently aligned. In this
paper, we enhance the bilingual masked language model pretraining with
lexical-level information by using type-level cross-lingual subword embeddings.
Empirical results demonstrate improved performance both on UNMT (up to 4.5
BLEU) and bilingual lexicon induction using our method compared to an
established UNMT baseline.
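The core idea above is to initialize the shared subword embedding layer of the bilingual masked LM with type-level cross-lingual subword embeddings before pretraining begins. A minimal sketch of that initialization step, assuming subword vectors already mapped into one cross-lingual space (the vocabulary, toy vectors, and function name are illustrative, not taken from the paper):

```python
import numpy as np

def init_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Build an embedding matrix for `vocab`, copying pretrained
    cross-lingual subword vectors where available and falling back
    to small random vectors for subwords without a pretrained entry."""
    rng = np.random.default_rng(seed)
    matrix = rng.normal(0.0, 0.02, size=(len(vocab), dim))
    hits = 0
    for idx, token in enumerate(vocab):
        vec = pretrained.get(token)
        if vec is not None:
            matrix[idx] = vec
            hits += 1
    return matrix, hits

# Toy shared En/De subword vocabulary; translation pairs already
# share a vector because the embeddings live in one space.
vocab = ["_the", "_der", "_house", "_haus", "_xyz"]
pretrained = {
    "_the":   np.array([0.1, 0.2]),
    "_der":   np.array([0.1, 0.2]),   # aligned with "_the"
    "_house": np.array([0.9, 0.1]),
    "_haus":  np.array([0.9, 0.1]),   # aligned with "_house"
}
matrix, hits = init_embedding_matrix(vocab, pretrained, dim=2)
```

The MLM then starts pretraining from lexically aligned input representations rather than having to discover the alignment from monolingual data alone.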
Related papers
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Learning to translate by learning to communicate [11.43638897327485]
We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems.
In our approach, we embed a multilingual model into an EC image-reference game, in which the model is incentivized to use multilingual generations to accomplish a vision-grounded task.
We present two variants of EC Fine-Tuning (Steinert-Threlkeld et al., 2022), one of which outperforms a backtranslation-only baseline in all four languages investigated.
arXiv Detail & Related papers (2022-07-14T15:58:06Z)
- Bilingual Alignment Pre-training for Zero-shot Cross-lingual Transfer [33.680292990007366]
In this paper, we aim to improve the zero-shot cross-lingual transfer performance by aligning the embeddings better.
We propose a pre-training task named Alignment Language Model (AlignLM) which uses the statistical alignment information as the prior knowledge to guide bilingual word prediction.
The results show AlignLM can improve the zero-shot performance significantly on MLQA and XNLI datasets.
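A statistical alignment prior of this kind can be folded directly into training data construction: an alignment link ties a source position to a target word, and the model is asked to predict the aligned word when that source position is masked. A toy sketch of such data construction (the helper name and the `(src_idx, tgt_idx)` alignment format are assumptions, not AlignLM's actual code):

```python
def bilingual_prediction_examples(src_tokens, tgt_tokens, alignments):
    """Turn statistical word alignments, given as (src_idx, tgt_idx)
    pairs (e.g. produced by an aligner such as fast_align), into
    masked-prediction examples: the source position is masked and
    the label is the word it aligns to in the other language."""
    examples = []
    for s, t in alignments:
        masked = list(src_tokens)
        masked[s] = "[MASK]"
        examples.append((masked, tgt_tokens[t]))
    return examples

# Toy En-De sentence pair with two alignment links.
src = ["the", "house", "is", "old"]
tgt = ["das", "haus", "ist", "alt"]
examples = bilingual_prediction_examples(src, tgt, [(1, 1), (3, 3)])
```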
arXiv Detail & Related papers (2021-06-03T10:18:43Z)
- ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora [21.78571365050787]
ERNIE-M is a new training method that encourages the model to align the representation of multiple languages with monolingual corpora.
We generate pseudo-parallel sentence pairs on a monolingual corpus to enable the learning of semantic alignment between different languages.
Experimental results show that ERNIE-M outperforms existing cross-lingual models and delivers new state-of-the-art results on various cross-lingual downstream tasks.
arXiv Detail & Related papers (2020-12-31T15:52:27Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- Cross-lingual Spoken Language Understanding with Regularized Representation Alignment [71.53159402053392]
We propose a regularization approach to align word-level and sentence-level representations across languages without any external resource.
Experiments on the cross-lingual spoken language understanding task show that our model outperforms current state-of-the-art methods in both few-shot and zero-shot scenarios.
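One simple instance of such a sentence-level alignment regularizer is a penalty on the distance between pooled hidden states of utterances that should share a representation space (the mean-pooling and squared-error form here are a generic illustration, not the paper's exact loss):

```python
import numpy as np

def alignment_penalty(src_states, tgt_states):
    """Alignment regularizer: squared Euclidean distance between the
    mean-pooled hidden states (seq_len x dim) of two utterances that
    are expected to have similar sentence representations."""
    diff = src_states.mean(axis=0) - tgt_states.mean(axis=0)
    return float((diff ** 2).sum())

src = np.array([[1.0, 0.0], [0.0, 1.0]])  # 2 tokens, dim 2
tgt = np.array([[0.5, 0.5], [0.5, 0.5]])
loss = alignment_penalty(src, tgt)  # 0.0: both pool to [0.5, 0.5]
```

Adding such a term to the task loss pushes representations of the two languages toward a common space without requiring any parallel supervision signal beyond the pairing itself.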
arXiv Detail & Related papers (2020-09-30T08:56:53Z)
- Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT [129.99918589405675]
We present an effective approach that reuses an LM that is pretrained only on the high-resource language.
The monolingual LM is fine-tuned on both languages and is then used to initialize a UNMT model.
Our approach, RE-LM, outperforms a competitive cross-lingual pretraining model (XLM) on English-Macedonian (En-Mk) and English-Albanian (En-Sq).
arXiv Detail & Related papers (2020-09-16T11:37:10Z)
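Reusing a monolingual LM for a second language typically requires growing its subword embedding table before fine-tuning on both languages. A hedged sketch of that extension step (initializing new rows at the scale of the existing ones is a common choice, not necessarily RE-LM's exact scheme):

```python
import numpy as np

def extend_embeddings(old_matrix, n_new, seed=0):
    """Append rows for the new language's subwords to a pretrained
    embedding matrix, drawn at the mean/scale of the existing rows
    so fine-tuning starts from a reasonable range."""
    rng = np.random.default_rng(seed)
    new_rows = rng.normal(old_matrix.mean(), old_matrix.std(),
                          size=(n_new, old_matrix.shape[1]))
    return np.vstack([old_matrix, new_rows])

# Pretend pretrained English table, then add rows for the
# low-resource language's subwords (sizes are illustrative).
en_emb = np.random.default_rng(1).normal(size=(100, 8))
joint_emb = extend_embeddings(en_emb, n_new=40)
```

After extension, the whole model is fine-tuned on monolingual data from both languages and then used to initialize the UNMT encoder and decoder.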
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.