Contrastive Learning for Many-to-many Multilingual Neural Machine
Translation
- URL: http://arxiv.org/abs/2105.09501v1
- Date: Thu, 20 May 2021 03:59:45 GMT
- Title: Contrastive Learning for Many-to-many Multilingual Neural Machine
Translation
- Authors: Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- Abstract summary: Existing multilingual machine translation approaches mainly focus on English-centric directions.
We aim to build a many-to-many translation system with an emphasis on the quality of non-English language directions.
- Score: 16.59039088482523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing multilingual machine translation approaches mainly focus on
English-centric directions, while the non-English directions still lag behind.
In this work, we aim to build a many-to-many translation system with an
emphasis on the quality of non-English language directions. Our intuition is
based on the hypothesis that a universal cross-language representation leads to
better multilingual translation performance. To this end, we propose mCOLT, a
training method to obtain a single unified multilingual translation model.
mCOLT is empowered by two techniques: (i) a contrastive learning scheme to
close the gap among representations of different languages, and (ii) data
augmentation on both multiple parallel and monolingual data to further align
token representations. For English-centric directions, mCOLT achieves
performance competitive with, or even better than, the strong pre-trained model
mBART on tens of WMT benchmarks. For non-English directions, mCOLT achieves an
average improvement of 10+ BLEU over the multilingual baseline.
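The contrastive learning scheme described above can be illustrated with a minimal sketch. This is not the authors' released code: the pooling of encoder states, the temperature value, and the function names are assumptions for illustration. The core idea is to treat aligned sentence pairs in a batch as positives and every other sentence as an in-batch negative, so the encoder maps semantically equivalent sentences from different languages close together.

```python
# Minimal sketch of an InfoNCE-style contrastive loss over pooled encoder
# representations of parallel sentences (illustrative; not the paper's code).
import torch
import torch.nn.functional as F

def multilingual_contrastive_loss(src_repr, tgt_repr, temperature=0.1):
    """src_repr, tgt_repr: [batch, dim] pooled encoder outputs of aligned
    sentence pairs in two (possibly different) languages."""
    src = F.normalize(src_repr, dim=-1)  # unit-normalize so dot products are cosine similarities
    tgt = F.normalize(tgt_repr, dim=-1)
    # Similarity of every source sentence to every target sentence in the batch.
    logits = src @ tgt.t() / temperature                      # [batch, batch]
    labels = torch.arange(src.size(0), device=src.device)
    # The aligned pair (the diagonal) is the positive; the rest are in-batch negatives.
    return F.cross_entropy(logits, labels)
```

During training, a loss of this form would typically be added to the usual translation cross-entropy, optionally with a weighting factor.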
Related papers
- Revisiting Machine Translation for Cross-lingual Classification [91.43729067874503]
Most research in the area focuses on the multilingual models rather than the Machine Translation component.
We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed.
arXiv Detail & Related papers (2023-05-23T16:56:10Z)
- Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations [75.73028056136778]
We show how to practically build MNMT systems that serve arbitrary X-Y translation directions.
We also examine our proposed approach in an extremely large-scale data setting to accommodate practical deployment scenarios.
arXiv Detail & Related papers (2022-06-30T02:18:15Z)
- Towards the Next 1000 Languages in Multilingual Machine Translation: Exploring the Synergy Between Supervised and Self-Supervised Learning [48.15259834021655]
We present a pragmatic approach towards building a multilingual machine translation model that covers hundreds of languages.
We use a mixture of supervised and self-supervised objectives, depending on the data availability for different language pairs.
We demonstrate that the synergy between these two training paradigms enables the model to produce high-quality translations in the zero-resource setting.
arXiv Detail & Related papers (2022-01-09T23:36:44Z)
- Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation [53.22775597051498]
We present a continual pre-training framework on mBART to effectively adapt it to unseen languages.
Results show that our method can consistently improve the fine-tuning performance upon the mBART baseline.
Our approach also boosts the performance on translation pairs where both languages are seen in the original mBART's pre-training.
arXiv Detail & Related papers (2021-05-09T14:49:07Z)
- Multi-task Learning for Multilingual Neural Machine Translation [32.81785430242313]
We propose a multi-task learning framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data.
We show that the proposed approach can effectively improve the translation quality for both high-resource and low-resource languages.
arXiv Detail & Related papers (2020-10-06T06:54:12Z)
- InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train the pretext to improve the cross-lingual transferability of pre-trained models.
arXiv Detail & Related papers (2020-07-15T16:58:01Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs (see the sketch after this list).
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
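The last entry above mentions random online backtranslation for unseen (zero-shot) directions. The sketch below is only an illustration of the general idea, not code from the cited paper: `model.translate` and `train_step` are hypothetical placeholders for a multilingual NMT model and its parameter-update routine. The target sentence of an observed pair is translated into a randomly sampled third language with the current model, and the resulting synthetic pair gives the otherwise unseen direction a direct training signal.

```python
import random

# Illustrative sketch only: `model.translate` and `train_step` are hypothetical
# stand-ins for a multilingual NMT model and its update step.
def random_online_backtranslation_step(model, train_step, x, x_lang, y, y_lang, languages):
    # 1) Ordinary supervised update on the observed direction X -> Y.
    train_step(model, src=x, src_lang=x_lang, tgt=y, tgt_lang=y_lang)

    # 2) Sample a third language Z and back-translate the target y into Z
    #    with the current model, online during training.
    z_lang = random.choice([l for l in languages if l not in (x_lang, y_lang)])
    z = model.translate(y, src_lang=y_lang, tgt_lang=z_lang)

    # 3) Update on the synthetic pair Z -> Y so the otherwise unseen
    #    direction Z -> Y receives a direct training signal.
    train_step(model, src=z, src_lang=z_lang, tgt=y, tgt_lang=y_lang)
```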