Improving Zero-shot Multilingual Neural Machine Translation for
Low-Resource Languages
- URL: http://arxiv.org/abs/2110.00712v1
- Date: Sat, 2 Oct 2021 02:50:53 GMT
- Title: Improving Zero-shot Multilingual Neural Machine Translation for
Low-Resource Languages
- Authors: Chenyang Li, Gongxu Luo
- Abstract summary: We propose the tagged-multilingual NMT model and improve the self-learning algorithm to address two problems of zero-shot translation.
Experimental results on IWSLT show that the adjusted tagged-multilingual NMT gains 9.41 and 7.85 BLEU over the multilingual NMT baseline.
- Score: 1.0965065178451106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although the multilingual Neural Machine Translation (NMT) model,
which extends Google's multilingual NMT, can perform zero-shot translation,
and the iterative self-learning algorithm can improve the quality of zero-shot
translation, the approach faces two problems: the multilingual NMT model is
prone to generating the wrong target language when performing zero-shot
translation, and the self-learning algorithm, which uses beam search to
generate synthetic parallel data, reduces the diversity of the generated
source sentences and amplifies the impact of the same noise during the
iterative learning process. In this paper, we propose the tagged-multilingual
NMT model and improve the self-learning algorithm to handle these two
problems. First, we extend Google's multilingual NMT model by adding
target-language tokens to the target side, which associates the start tag with
the target language and ensures that the source sentence is translated into
the required target language. Second, we improve the self-learning algorithm
by replacing beam search with random sampling, which increases the diversity
of the generated data and makes it better cover the true data distribution.
Experimental results on IWSLT show that the adjusted tagged-multilingual NMT
gains 9.41 and 7.85 BLEU over the multilingual NMT on the 2010 and 2017
Romanian-Italian test sets, respectively. Similarly, it gains 9.08 and 7.99
BLEU on Italian-Romanian zero-shot translation. Furthermore, the improved
self-learning algorithm outperforms the conventional self-learning algorithm
on zero-shot translation.
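The target-side tagging described in the abstract can be illustrated with a
minimal sketch. The tag format (<2xx>) and the toy sentences below are
illustrative assumptions, not taken from the paper: the target-language token
is attached both to the source sentence, as in Google's multilingual NMT, and
to the target sentence, so that the decoder's start symbol is tied to the
required output language.

```python
# Minimal sketch of the tagged-multilingual idea: besides the usual
# source-side language token (Google-style multilingual NMT), a target-language
# tag is also attached to the target side, conditioning the decoder's start
# symbol on the required output language. Tag names and the example sentences
# are illustrative, not from the paper.

def tag_example(src_tokens, tgt_tokens, tgt_lang):
    """Return (source, target) token sequences with language tags attached."""
    lang_tag = f"<2{tgt_lang}>"
    # Source side: prepend the target-language token (standard multilingual NMT).
    tagged_src = [lang_tag] + src_tokens
    # Target side: the same tag also starts the target sequence, so the decoder
    # is explicitly told which language it must generate.
    tagged_tgt = [lang_tag] + tgt_tokens
    return tagged_src, tagged_tgt

src, tgt = tag_example(["Bună", "ziua"], ["Buongiorno"], tgt_lang="it")
print(src)  # ['<2it>', 'Bună', 'ziua']
print(tgt)  # ['<2it>', 'Buongiorno']
```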
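The change to the self-learning algorithm, replacing beam search with random
sampling when generating synthetic parallel data, can likewise be sketched.
The toy vocabulary and per-step distribution below are stand-ins for a real
decoder; only the contrast between argmax/beam decoding and multinomial
sampling is meant to carry over.

```python
# Sketch of the modified self-learning step: synthetic source sentences are
# generated by multinomial sampling from the model's output distribution
# instead of beam search, keeping the synthetic data diverse. The tiny "model"
# here is a stand-in for any seq2seq decoder exposing per-step probabilities.

import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["<eos>", "la", "casa", "una", "bella"]

def toy_step_probs(prefix):
    """Stand-in for one decoder step: a fixed distribution over VOCAB.
    A real model would condition on the prefix and the input sentence."""
    return np.array([0.15, 0.30, 0.25, 0.20, 0.10])

def generate(sample=True, max_len=5):
    """Greedy decoding (a beam-size-1 stand-in) vs. multinomial sampling."""
    out = []
    for _ in range(max_len):
        probs = toy_step_probs(out)
        idx = rng.choice(len(VOCAB), p=probs) if sample else int(np.argmax(probs))
        token = VOCAB[idx]
        if token == "<eos>":
            break
        out.append(token)
    return out

# Greedy decoding always returns the mode of the distribution, so every
# synthetic sentence looks the same; sampling yields varied outputs that
# better cover the model's distribution, which is the point of the change.
print(generate(sample=False))                      # identical on every call
print([generate(sample=True) for _ in range(3)])   # varied
```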
Related papers
- Optimizing the Training Schedule of Multilingual NMT using Reinforcement Learning [0.3277163122167433]
We propose two algorithms that use reinforcement learning to optimize the training schedule of Multilingual NMT.
On an 8-to-1 translation dataset with low-resource and high-resource languages (LRLs and HRLs), our second method improves BLEU and COMET scores compared with both random selection of monolingual batches and shuffled multilingual batches.
arXiv Detail & Related papers (2024-10-08T15:20:13Z) - Language-Informed Beam Search Decoding for Multilingual Machine Translation [24.044315362087687]
Language-informed Beam Search (LiBS) is a general decoding algorithm incorporating an off-the-shelf Language Identification (LiD) model into beam search decoding to reduce off-target translations.
Results show that our proposed LiBS algorithm on average improves +1.1 BLEU and +0.9 BLEU on WMT and OPUS datasets, and reduces off-target rates from 22.9% to 7.7% and 65.8% to 25.3% respectively.
arXiv Detail & Related papers (2024-08-11T09:57:46Z) - Towards Making the Most of Multilingual Pretraining for Zero-Shot Neural
Machine Translation [74.158365847236]
SixT++ is a strong many-to-English NMT model that supports 100 source languages but is trained once with a parallel dataset from only six source languages.
It significantly outperforms CRISS and m2m-100, two strong multilingual NMT systems, with an average gain of 7.2 and 5.0 BLEU respectively.
arXiv Detail & Related papers (2021-10-16T10:59:39Z) - Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural
Machine Translation [53.22775597051498]
We present a continual pre-training framework on mBART to effectively adapt it to unseen languages.
Results show that our method can consistently improve the fine-tuning performance upon the mBART baseline.
Our approach also boosts the performance on translation pairs where both languages are seen in the original mBART's pre-training.
arXiv Detail & Related papers (2021-05-09T14:49:07Z) - Zero-shot Cross-lingual Transfer of Neural Machine Translation with
Multilingual Pretrained Encoders [74.89326277221072]
How to improve the cross-lingual transfer of an NMT model with a multilingual pretrained encoder is under-explored.
We propose SixT, a simple yet effective model for this task.
Our model achieves better performance on many-to-English testsets than CRISS and m2m-100.
arXiv Detail & Related papers (2021-04-18T07:42:45Z) - Self-Learning for Zero Shot Neural Machine Translation [13.551731309506874]
This work proposes a novel zero-shot NMT modeling approach that learns without the now-standard assumption of a pivot language sharing parallel data.
Compared to unsupervised NMT, consistent improvements are observed even in a domain-mismatch setting.
arXiv Detail & Related papers (2021-03-10T09:15:19Z) - Cross-lingual Machine Reading Comprehension with Language Branch
Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z) - Improving Target-side Lexical Transfer in Multilingual Neural Machine
Translation [104.10726545151043]
Multilingual data has been found to be more beneficial for NMT models that translate from a low-resource language (LRL) into a target language than for those that translate into the LRL.
Our experiments show that DecSDE leads to consistent gains of up to 1.8 BLEU on translation from English to four different languages.
arXiv Detail & Related papers (2020-10-04T19:42:40Z) - Improving Massively Multilingual Neural Machine Translation and
Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.