Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural
Machine Translation Training
- URL: http://arxiv.org/abs/2109.02284v1
- Date: Mon, 6 Sep 2021 08:30:33 GMT
- Title: Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural
Machine Translation Training
- Authors: Minghao Wu, Yitong Li, Meng Zhang, Liangyou Li, Gholamreza Haffari,
Qun Liu
- Abstract summary: MultiUAT dynamically adjusts the training data usage based on the model's uncertainty.
We analyze cross-domain transfer and show the deficiency of static and similarity-based methods.
- Score: 58.72619374790418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning a multilingual and multi-domain translation model is challenging,
as the heterogeneous and imbalanced data make the model converge inconsistently
over the different corpora in real-world settings. One common practice is to adjust the
share of each corpus in training, so that the learning process is balanced
and low-resource cases can benefit from the high-resource ones. However,
automatic balancing methods usually depend on intra- and inter-dataset
characteristics, to which they are often agnostic or for which they require human priors. In this
work, we propose an approach, MultiUAT, that dynamically adjusts the training
data usage based on the model's uncertainty on a small set of trusted clean
data for multi-corpus machine translation. We experiment with two classes of
uncertainty measures in multilingual (16 languages with 4 settings) and
multi-domain settings (4 in-domain and 2 out-of-domain corpora on
English-German translation) and demonstrate that our approach MultiUAT substantially
outperforms its baselines, including both static and dynamic strategies. We
analyze cross-domain transfer and show the deficiency of static and
similarity-based methods.
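To make the core idea concrete, here is a minimal sketch of uncertainty-aware corpus balancing, assuming uncertainty is summarized as a single score per corpus (e.g., average predictive entropy of the decoder) computed on a small trusted clean set. The function names, softmax weighting, and temperature below are illustrative assumptions, not the paper's exact formulation, which studies two classes of uncertainty measures.

```python
import math
import random
from typing import Callable, Dict, Sequence

def corpus_uncertainty(
    uncertainty_fn: Callable[[Sequence], float],
    trusted_sets: Dict[str, Sequence],
) -> Dict[str, float]:
    """Score each corpus by the model's uncertainty on its trusted clean set.

    `uncertainty_fn` is a stand-in for whatever measure the model exposes,
    e.g. mean token-level predictive entropy over the trusted examples.
    """
    return {name: uncertainty_fn(data) for name, data in trusted_sets.items()}

def sampling_distribution(uncertainty: Dict[str, float],
                          temperature: float = 1.0) -> Dict[str, float]:
    """Turn per-corpus uncertainty scores into corpus sampling probabilities.

    Corpora the model is more uncertain about receive a larger share of
    training batches; the temperature controls how aggressive the shift is.
    """
    names = list(uncertainty)
    scores = [uncertainty[n] / temperature for n in names]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return {n: e / z for n, e in zip(names, exps)}

def sample_corpus(probs: Dict[str, float]) -> str:
    """Pick which corpus the next training batch is drawn from."""
    names, weights = zip(*probs.items())
    return random.choices(names, weights=weights, k=1)[0]
```

In a training loop one would typically re-estimate the per-corpus uncertainties every few thousand updates and refresh the sampling distribution, so that data usage tracks the model's current weaknesses rather than a fixed prior.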
Related papers
- Relevance-guided Neural Machine Translation [5.691028372215281]
We propose an explainability-based training approach for Neural Machine Translation (NMT).
Our results show that our method is promising, particularly when training in low-resource conditions.
arXiv Detail & Related papers (2023-11-30T21:52:02Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation [103.90963418039473]
Bi-ACL is a framework that uses only target-side monolingual data and a bilingual dictionary to improve the performance of the MNMT model.
We show that Bi-ACL is more effective both in long-tail languages and in high-resource languages.
arXiv Detail & Related papers (2023-05-22T07:31:08Z)
- Finding the Right Recipe for Low Resource Domain Adaptation in Neural Machine Translation [7.2283509416724465]
General translation models often struggle to generate accurate translations in specialized domains.
We conduct an in-depth empirical exploration of monolingual and parallel data approaches to domain adaptation.
Our work covers three domains: consumer electronics, clinical, and biomedical.
arXiv Detail & Related papers (2022-06-02T16:38:33Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- Distributionally Robust Multilingual Machine Translation [94.51866646879337]
We propose a new learning objective for multilingual neural machine translation (MNMT) based on distributionally robust optimization.
We show how to practically optimize this objective for large translation corpora using an iterated best response scheme.
Our method consistently outperforms strong baseline methods in terms of average and per-language performance under both many-to-one and one-to-many translation settings.
arXiv Detail & Related papers (2021-09-09T03:48:35Z)
- Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
- Balancing Training for Multilingual Neural Machine Translation [130.54253367251738]
Multilingual machine translation (MT) models can translate to/from multiple languages.
Standard practice is to up-sample less resourced languages to increase their representation (a temperature-based variant of this rule is sketched below).
We propose a method that instead automatically learns how to weight training data through a data scorer.
arXiv Detail & Related papers (2020-04-14T18:23:28Z)
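For reference, the "standard practice" of up-sampling less resourced languages mentioned in the last entry is commonly implemented as temperature-based sampling; the learned data scorer proposed there, like MultiUAT above, replaces this static rule with a dynamic one. A minimal sketch of the static baseline follows, with illustrative corpus sizes and temperature value.

```python
from typing import Dict

def temperature_sampling(corpus_sizes: Dict[str, int], T: float = 5.0) -> Dict[str, float]:
    """Static up-sampling rule: p_i proportional to (n_i / sum_j n_j) ** (1 / T).

    T = 1 reproduces size-proportional sampling; larger T flattens the
    distribution, giving low-resource languages a larger share of batches.
    """
    total = sum(corpus_sizes.values())
    scaled = {lang: (n / total) ** (1.0 / T) for lang, n in corpus_sizes.items()}
    z = sum(scaled.values())
    return {lang: s / z for lang, s in scaled.items()}

# Illustrative sizes: one high-resource and one low-resource language pair.
print(temperature_sampling({"en-de": 4_500_000, "en-gl": 10_000}))
```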
This list is automatically generated from the titles and abstracts of the papers on this site.