High-resource Language-specific Training for Multilingual Neural Machine
Translation
- URL: http://arxiv.org/abs/2207.04906v1
- Date: Mon, 11 Jul 2022 14:33:13 GMT
- Authors: Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Zhoujun Li, Furu Wei
- Abstract summary: We propose a multilingual translation model with high-resource language-specific training (HLT-MT) to alleviate negative interference.
Specifically, we first train the multilingual model only with the high-resource pairs and select the language-specific modules at the top of the decoder.
HLT-MT is further trained on all available corpora to transfer knowledge from high-resource languages to low-resource languages.
- Score: 109.31892935605192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual neural machine translation (MNMT) trained on multiple language
pairs has attracted considerable attention because sharing knowledge among
languages reduces the number of model parameters and the training cost.
Nonetheless, multilingual training suffers from degradation in the shared
parameters caused by negative interference among different translation
directions, especially for high-resource languages. In
this paper, we propose a multilingual translation model with
high-resource language-specific training (HLT-MT) to alleviate this negative
interference; it adopts two-stage training with a language-specific
selection mechanism. Specifically, we first train the multilingual model only
with the high-resource pairs and select the language-specific modules at the
top of the decoder to enhance the translation quality of high-resource
directions. Next, the model is further trained on all available corpora to
transfer knowledge from high-resource languages (HRLs) to low-resource
languages (LRLs). Experimental results show that HLT-MT outperforms various
strong baselines on WMT-10 and OPUS-100 benchmarks. Furthermore, the analytic
experiments validate the effectiveness of our method in mitigating the negative
interference in multilingual training.
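
The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `PairCorpus` type, the size `threshold` used to decide what counts as high-resource, and the `select_top_module` routing helper are all hypothetical names introduced here, and the abstract does not specify how the language-specific modules are selected.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PairCorpus:
    """Hypothetical record for one translation direction and its corpus size."""
    src: str
    tgt: str
    num_sentences: int


def split_by_resource(
    corpora: List[PairCorpus], threshold: int
) -> Tuple[List[PairCorpus], List[PairCorpus]]:
    """Partition pairs into high- and low-resource by corpus size.

    The threshold is an assumption; the abstract does not define one.
    """
    high = [c for c in corpora if c.num_sentences >= threshold]
    low = [c for c in corpora if c.num_sentences < threshold]
    return high, low


def two_stage_schedule(
    corpora: List[PairCorpus], threshold: int
) -> List[Tuple[str, List[PairCorpus]]]:
    """Stage 1 uses only high-resource pairs; stage 2 uses all available pairs."""
    high, _ = split_by_resource(corpora, threshold)
    return [
        ("stage1_high_resource_only", high),
        ("stage2_all_pairs", list(corpora)),
    ]


def select_top_module(
    target_lang: str, lang_specific: Dict[str, str], shared: str = "shared"
) -> str:
    """Pick a language-specific top-of-decoder module if one exists.

    Falling back to a shared module is an illustrative assumption.
    """
    return lang_specific.get(target_lang, shared)
```

For example, with three pairs of sizes 10M, 4M, and 80K sentences and a threshold of 1M, stage 1 would see only the first two pairs and stage 2 would see all three.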
Related papers
- Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models [104.96990850774566]
We propose a Multi-lingual Ability Extraction and Transfer approach, named MAET.
Our key idea is to decompose and extract language-agnostic ability-related weights from large language models.
Experimental results show that MAET can effectively and efficiently extract and transfer the advanced abilities, and outperforms training-based baseline methods.
arXiv Detail & Related papers (2024-10-10T11:23:18Z)
- Targeted Multilingual Adaptation for Low-resource Language Families [17.212424929235624]
We study best practices for adapting a pre-trained model to a language family.
Our adapted models significantly outperform mono- and multilingual baselines.
Low-resource languages can be aggressively up-sampled during training at little detriment to performance in high-resource languages.
arXiv Detail & Related papers (2024-05-20T23:38:06Z)
- Revisiting Machine Translation for Cross-lingual Classification [91.43729067874503]
Most research in the area focuses on the multilingual models rather than the Machine Translation component.
We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed.
arXiv Detail & Related papers (2023-05-23T16:56:10Z)
- Towards the Next 1000 Languages in Multilingual Machine Translation: Exploring the Synergy Between Supervised and Self-Supervised Learning [48.15259834021655]
We present a pragmatic approach towards building a multilingual machine translation model that covers hundreds of languages.
We use a mixture of supervised and self-supervised objectives, depending on the data availability for different language pairs.
We demonstrate that the synergy between these two training paradigms enables the model to produce high-quality translations in the zero-resource setting.
arXiv Detail & Related papers (2022-01-09T23:36:44Z)
- Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help? [29.01386302441015]
Multilingual Neural Machine Translation (MNMT) trains a single NMT model that supports translation between multiple languages.
The performance of an MNMT model is highly dependent on the type of languages used in training, as transferring knowledge from a diverse set of languages degrades the translation performance due to negative transfer.
We propose a Hierarchical Knowledge Distillation (HKD) approach for MNMT which capitalises on language groups generated according to typological features and phylogeny of languages to overcome the issue of negative transfer.
arXiv Detail & Related papers (2021-10-15T02:31:48Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
arXiv Detail & Related papers (2020-04-24T17:21:32Z)