Robust Domain Adaptation for Pre-trained Multilingual Neural Machine
Translation Models
- URL: http://arxiv.org/abs/2210.14979v1
- Date: Wed, 26 Oct 2022 18:47:45 GMT
- Title: Robust Domain Adaptation for Pre-trained Multilingual Neural Machine
Translation Models
- Authors: Mathieu Grosso, Pirashanth Ratnamogan, Alexis Mathey, William
Vanhuffel, Michael Fotso Fotso
- Abstract summary: We propose a fine-tuning procedure for a generic mNMT model that combines embedding freezing and an adversarial loss.
Experiments demonstrate that the procedure improves performance on specialized data with minimal loss of the initial performance on the generic domain for all language pairs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent literature has demonstrated the potential of multilingual Neural
Machine Translation (mNMT) models. However, the most efficient models are not
well suited to specialized industries, where internal data is scarce and
expensive to obtain across all language pairs. Fine-tuning an mNMT model on a
specialized domain is therefore hard. In this context, we focus on a new task:
domain adaptation of a pre-trained mNMT model on a single language pair while
maintaining model quality on generic-domain data for all language pairs. The
risk of degradation on the generic domain and on the other pairs is high. This
task is key for mNMT model adoption in the industry and borders many other
tasks. We propose a fine-tuning procedure for the generic mNMT model that
combines embedding freezing and an adversarial loss. Our experiments show that
the procedure improves performance on specialized data with minimal loss of
initial performance on the generic domain for all language pairs, compared to a
naive standard approach (+10.0 BLEU on specialized data, -0.01 to -0.5 BLEU on
the WMT and Tatoeba datasets for the other pairs with M2M100).
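A minimal sketch of what such a procedure can look like, assuming a Hugging Face-style seq2seq checkpoint (e.g. M2M100) whose forward pass returns a loss and encoder states. The adversarial term is realized here as a domain discriminator trained through a gradient-reversal layer, which is one common instantiation of an adversarial loss; the paper's exact formulation may differ, and names such as `training_step`, `lambda_adv`, and the `"embed"` parameter-name convention are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def freeze_embeddings(model: nn.Module) -> None:
    # Freeze every parameter whose name marks it as an embedding table
    # (naming convention assumed; adapt it to the actual checkpoint).
    for name, param in model.named_parameters():
        if "embed" in name:
            param.requires_grad = False

def training_step(model, discriminator, batch, adv_criterion, lambda_adv=0.1):
    # Standard translation loss on the specialized in-domain batch.
    out = model(input_ids=batch["src"], labels=batch["tgt"])
    # A pooled encoder state feeds a domain discriminator through gradient
    # reversal, pushing the encoder toward domain-invariant representations.
    pooled = out.encoder_last_hidden_state.mean(dim=1)
    domain_logits = discriminator(GradReverse.apply(pooled, lambda_adv))
    loss = out.loss + adv_criterion(domain_logits, batch["domain_label"])
    loss.backward()
    return loss
```

In this sketch only the non-embedding parameters receive gradients, which is what limits drift away from the generic pre-trained representation space.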
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT yields substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Exploiting Language Relatedness in Machine Translation Through Domain Adaptation Techniques [3.257358540764261]
We present a novel approach that uses a scaled similarity score between sentences, especially for related languages, based on a 5-gram KenLM language model.
Our approach yields gains of 2 BLEU points with the multi-domain approach, 3 BLEU points with fine-tuning for NMT, and 2 BLEU points with the iterative back-translation approach.
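The summary does not give the exact scaling formula, so the snippet below only illustrates, under that caveat, how a 5-gram KenLM model can score sentences and how a length-normalized score could be mapped to a (0, 1] similarity weight for selecting or weighting related-language sentences; the model path and `alpha` are placeholders.

```python
import math
import kenlm  # Python bindings of the KenLM library

def load_lm(path: str = "related_language.5gram.arpa"):
    # Hypothetical path to a 5-gram model trained on the related language.
    return kenlm.Model(path)

def normalized_logprob(lm, sentence: str) -> float:
    # KenLM returns a total log10 probability; normalize by sentence length.
    n = max(len(sentence.split()), 1)
    return lm.score(sentence, bos=True, eos=True) / n

def scaled_similarity(lm, sentence: str, alpha: float = 1.0) -> float:
    # Map the per-word log10 probability into (0, 1]; higher means the sentence
    # looks more like the language modeled by the 5-gram LM.
    return math.exp(alpha * normalized_logprob(lm, sentence) * math.log(10))
```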
arXiv Detail & Related papers (2023-03-03T09:07:30Z)
- Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neural Machine Translation [48.58899349349702]
Nearest Neighbor Machine Translation (kNN-MT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest-neighbor retrieval mechanism.
In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT.
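For context, here is a minimal sketch of the token-level retrieval-and-interpolation step that kNN-MT relies on and that PRED's datastores plug into. The toy datastore, shapes, and hyperparameters (`k`, `temperature`, `lam`) are illustrative; a real system searches the datastore with a library such as FAISS rather than computing brute-force distances.

```python
import torch

def knn_mt_probs(decoder_state, nmt_probs, keys, values, vocab_size,
                 k=8, temperature=10.0, lam=0.5):
    """decoder_state: (d,); keys: (N, d) datastore states; values: (N,) token ids."""
    # Retrieve the k nearest datastore entries by L2 distance.
    dists = torch.cdist(decoder_state[None, :], keys).squeeze(0)   # (N,)
    neg_d, idx = torch.topk(-dists, k)                             # nearest = largest -dist
    neigh_tokens = values[idx]                                     # (k,)
    # Turn distances into a distribution over the retrieved target tokens.
    weights = torch.softmax(neg_d / temperature, dim=0)
    knn_probs = torch.zeros(vocab_size)
    knn_probs.index_add_(0, neigh_tokens, weights)
    # Interpolate the retrieval and NMT distributions.
    return lam * knn_probs + (1.0 - lam) * nmt_probs

# Toy usage with a random datastore of 100 entries, hidden size 16, vocab 32.
d, V = 16, 32
keys, values = torch.randn(100, d), torch.randint(0, V, (100,))
state, p_nmt = torch.randn(d), torch.softmax(torch.randn(V), dim=0)
p_final = knn_mt_probs(state, p_nmt, keys, values, V)
```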
arXiv Detail & Related papers (2022-12-17T08:34:20Z)
- Data Selection Curriculum for Neural Machine Translation [31.55953464971441]
We introduce a two-stage curriculum training framework for NMT models.
We fine-tune a base NMT model on subsets of the data, selected both by deterministic scoring using pre-trained methods and by online scoring.
We show that our curriculum strategies consistently yield better quality (up to +2.2 BLEU improvement) and faster convergence.
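The summary does not specify the scoring functions, so the sketch below only shows the two-stage shape of such a curriculum: a deterministic pre-selection followed by training on a smaller, online-rescored subset. Function names, stage ratios, and epoch counts are placeholders.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)

def select_subset(data: List[Pair], score: Callable[[Pair], float],
                  keep_ratio: float) -> List[Pair]:
    # Keep the top fraction of sentence pairs by score.
    ranked = sorted(data, key=score, reverse=True)
    return ranked[: max(1, int(keep_ratio * len(ranked)))]

def curriculum_training(data, offline_score, online_score, train_one_epoch,
                        stages=((0.5, 3), (0.2, 3))):
    # Stage 1: train on a broad subset chosen by a fixed (e.g. pre-trained) scorer.
    subset = select_subset(data, offline_score, stages[0][0])
    for _ in range(stages[0][1]):
        train_one_epoch(subset)
    # Stage 2: narrow the data further using scores computed online,
    # e.g. the current model's own loss on each pair.
    subset = select_subset(subset, online_score, stages[1][0])
    for _ in range(stages[1][1]):
        train_one_epoch(subset)
```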
arXiv Detail & Related papers (2022-03-25T19:08:30Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
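A toy illustration of the two input-corruption styles being contrasted: masking tokens (so the input contains artificial <mask> symbols) versus producing a full-looking sentence by reordering and replacing words. Here replacements are drawn at random for brevity, whereas the compared objectives choose them based on context; rates and token names are placeholders.

```python
import random

def mask_tokens(tokens, mask_token="<mask>", ratio=0.35):
    # MLM-style corruption: the decoder must reconstruct the masked words.
    return [mask_token if random.random() < ratio else t for t in tokens]

def reorder_and_replace(tokens, replacement_vocab, swap_prob=0.15, repl_prob=0.15):
    # No mask symbols, so the input still resembles a real sentence:
    # neighbouring words are occasionally swapped, others replaced.
    out = list(tokens)
    for i in range(len(out) - 1):
        if random.random() < swap_prob:
            out[i], out[i + 1] = out[i + 1], out[i]
    return [random.choice(replacement_vocab) if random.random() < repl_prob else t
            for t in out]

sent = "the committee approved the new trade agreement".split()
print(mask_tokens(sent))
print(reorder_and_replace(sent, replacement_vocab=["tariff", "council", "draft"]))
```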
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Synthesizing Monolingual Data for Neural Machine Translation [22.031658738184166]
In neural machine translation (NMT), monolingual data in the target language are usually exploited to synthesize additional training parallel data.
However, large monolingual data in the target domain or language are not always available for generating large synthetic parallel data.
We propose a new method to generate large synthetic parallel data leveraging very small monolingual data in a specific domain.
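The summary does not describe the generation procedure itself; for orientation only, the sketch below shows the generic back-translation recipe that synthetic-parallel-data methods build on, with `backward_translate` as a placeholder for a reverse (target-to-source) translation model.

```python
from typing import Callable, Iterable, List, Tuple

def synthesize_parallel(target_monolingual: Iterable[str],
                        backward_translate: Callable[[str], str]
                        ) -> List[Tuple[str, str]]:
    # Each in-domain target sentence gets a synthetic source side; the genuine
    # target sentence stays on the target side of the pair.
    synthetic = []
    for tgt in target_monolingual:
        synthetic.append((backward_translate(tgt), tgt))
    return synthetic

# Usage: mix the synthetic pairs with the (small) genuine parallel data and fine-tune.
```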
arXiv Detail & Related papers (2021-01-29T08:17:40Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs.
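A toy sketch of the kind of alignment-based pre-training signal mRASP is associated with, random aligned substitution, in which source words are swapped for dictionary translations so that related words across languages end up close in representation space. The dictionary and substitution rate below are illustrative, not the paper's settings.

```python
import random

def random_aligned_substitution(tokens, bilingual_dict, sub_prob=0.3):
    # Replace a word with one of its dictionary translations with probability
    # sub_prob, producing a code-switched source sentence for pre-training.
    out = []
    for tok in tokens:
        translations = bilingual_dict.get(tok)
        if translations and random.random() < sub_prob:
            out.append(random.choice(translations))
        else:
            out.append(tok)
    return out

toy_dict = {"house": ["maison", "casa"], "red": ["rouge", "roja"]}
print(random_aligned_substitution("the red house is old".split(), toy_dict))
```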
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
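A schematic sketch of random online backtranslation: during training, the current model translates the target side of each example into a randomly sampled language, creating synthetic pairs for directions that have no parallel data. The language list is illustrative and `translate` stands in for the model's own decoding.

```python
import random
from typing import Callable, List, Tuple

LANGS = ["de", "fr", "zh", "ru", "es"]  # illustrative language set

def robt_batch(batch: List[Tuple[str, str, str, str]],
               translate: Callable[[str, str, str], str]):
    """batch items are (src_lang, src_text, tgt_lang, tgt_text) tuples."""
    synthetic = []
    for src_lang, src, tgt_lang, tgt in batch:
        pivot = random.choice([l for l in LANGS if l not in (src_lang, tgt_lang)])
        # Online back-translation with the model currently being trained:
        pivot_text = translate(tgt, tgt_lang, pivot)
        # Add a training example for the (pivot -> tgt_lang) direction,
        # which may never appear in the parallel training data.
        synthetic.append((pivot, pivot_text, tgt_lang, tgt))
    return batch + synthetic
```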
arXiv Detail & Related papers (2020-04-24T17:21:32Z)