Self-Learning for Zero Shot Neural Machine Translation
- URL: http://arxiv.org/abs/2103.05951v1
- Date: Wed, 10 Mar 2021 09:15:19 GMT
- Title: Self-Learning for Zero Shot Neural Machine Translation
- Authors: Surafel M. Lakew, Matteo Negri, Marco Turchi
- Abstract summary: This work proposes a novel zero-shot NMT modeling approach that learns without the now-standard assumption of a pivot language sharing parallel data.
Compared to unsupervised NMT, consistent improvements are observed even in a domain-mismatch setting.
- Score: 13.551731309506874
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Machine Translation (NMT) approaches employing monolingual data are
showing steady improvements in resource rich conditions. However, evaluations
using real-world low-resource languages still result in unsatisfactory
performance. This work proposes a novel zero-shot NMT modeling approach that
learns without the now-standard assumption of a pivot language sharing parallel
data with the zero-shot source and target languages. Our approach is based on
three stages: initialization from any pre-trained NMT model observing at least
the target language, augmentation of source sides leveraging target monolingual
data, and learning to optimize the initial model to the zero-shot pair, where
the latter two constitute a self-learning cycle. Empirical findings involving
four diverse (in terms of language family, script, and relatedness) zero-shot
pairs show the effectiveness of our approach with up to +5.93 BLEU improvement
against a supervised bilingual baseline. Compared to unsupervised NMT,
consistent improvements are observed even in a domain-mismatch setting,
attesting to the usability of our method.
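To make the three-stage recipe in the abstract more concrete, the sketch below shows one way the self-learning cycle could be wired up: initialize from a pre-trained model that has seen the target language, synthesize source sides by translating target monolingual data, fine-tune on the resulting (synthetic source, real target) pairs, and repeat. This is a minimal, hypothetical illustration, not the authors' implementation; the ToyNMT class, its translate/fine_tune methods, and the language codes are placeholder assumptions.

```python
# Hedged sketch of a back-translation-style self-learning cycle for a zero-shot pair.
# Model interface and data are illustrative stand-ins, not the paper's actual code.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyNMT:
    """Placeholder for a pre-trained multilingual NMT model that has observed the target language."""
    name: str

    def translate(self, sentences: List[str], src_lang: str, tgt_lang: str) -> List[str]:
        # Stand-in: a real system would run beam search with the current model parameters.
        return [f"<{tgt_lang}> {s}" for s in sentences]

    def fine_tune(self, pairs: List[Tuple[str, str]]) -> None:
        # Stand-in: a real system would run gradient updates on the synthetic parallel data.
        print(f"fine-tuning {self.name} on {len(pairs)} synthetic pairs")


def self_learning_cycle(model: ToyNMT, tgt_monolingual: List[str],
                        src_lang: str, tgt_lang: str, rounds: int = 3) -> ToyNMT:
    """Repeat: (1) synthesize source sides from target monolingual data,
    (2) optimize the model on the (synthetic source, real target) pairs."""
    for _ in range(rounds):
        synthetic_src = model.translate(tgt_monolingual, src_lang=tgt_lang, tgt_lang=src_lang)
        pairs = list(zip(synthetic_src, tgt_monolingual))
        model.fine_tune(pairs)
    return model


if __name__ == "__main__":
    # Hypothetical zero-shot pair; only the target-language monolingual data is assumed available.
    model = ToyNMT(name="pretrained-multilingual-nmt")
    tgt_mono = ["ein Beispielsatz", "noch ein Satz"]  # target-language monolingual sentences
    self_learning_cycle(model, tgt_mono, src_lang="sw", tgt_lang="de", rounds=2)
```

In this reading, each round uses the model's own outputs as synthetic source sentences, so translation quality on the zero-shot pair can improve without any real parallel data for that pair.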
Related papers
- Machine Translation for Ge'ez Language [0.0]
Machine translation for low-resource languages such as Ge'ez faces challenges such as out-of-vocabulary words, domain mismatches, and lack of labeled training data.
We develop a multilingual neural machine translation (MNMT) model based on language relatedness.
We also experiment with using GPT-3.5, a state-of-the-art LLM, for few-shot translation with fuzzy matches.
arXiv Detail & Related papers (2023-11-24T14:55:23Z)
- Exploiting Multilingualism in Low-resource Neural Machine Translation via Adversarial Learning [3.2258463207097017]
Generative Adversarial Networks (GAN) offer a promising approach for Neural Machine Translation (NMT)
In the GAN setting, as in bilingual models, multilingual NMT considers only one reference translation per sentence during model training.
This article proposes a Denoising Adversarial Auto-encoder-based Sentence Interpolation (DAASI) approach to perform sentence interpolation.
arXiv Detail & Related papers (2023-03-31T12:34:14Z)
- Improving Zero-shot Multilingual Neural Machine Translation for Low-Resource Languages [1.0965065178451106]
We propose the tagged-multilingual NMT model and improve the self-learning algorithm to handle these two problems.
Experimental results on IWSLT show that the adjusted tagged-multilingual NMT obtains improvements of 9.41 and 7.85 BLEU, respectively, over the multilingual NMT baseline.
arXiv Detail & Related papers (2021-10-02T02:50:53Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Distributionally Robust Multilingual Machine Translation [94.51866646879337]
We propose a new learning objective for multilingual neural machine translation (MNMT) based on distributionally robust optimization.
We show how to practically optimize this objective for large translation corpora using an iterated best response scheme.
Our method consistently outperforms strong baseline methods in terms of average and per-language performance under both many-to-one and one-to-many translation settings.
arXiv Detail & Related papers (2021-09-09T03:48:35Z)
- AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages [75.08199398141744]
We present AmericasNLI, an extension of XNLI (Conneau et al.) to 10 indigenous languages of the Americas.
We conduct experiments with XLM-R, testing multiple zero-shot and translation-based approaches.
We find that XLM-R's zero-shot performance is poor for all 10 languages, with an average performance of 38.62%.
arXiv Detail & Related papers (2021-04-18T05:32:28Z)
- Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages [48.28540903568198]
We show that multilinguality is critical to making unsupervised systems practical for low-resource settings.
We present a single model for 5 low-resource languages (Gujarati, Kazakh, Nepali, Sinhala, and Turkish) translating to and from English.
We outperform all current state-of-the-art unsupervised baselines for these languages, achieving gains of up to 14.4 BLEU.
arXiv Detail & Related papers (2020-09-23T15:07:33Z)
- Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation [54.52971020087777]
Using monolingual data significantly boosts the translation quality of low-resource languages in multilingual models.
Self-supervision improves zero-shot translation quality in multilingual models.
We get up to 33 BLEU on ro-en translation without any parallel data or back-translation.
arXiv Detail & Related papers (2020-05-11T00:20:33Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
arXiv Detail & Related papers (2020-04-24T17:21:32Z)