Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models
- URL: http://arxiv.org/abs/2311.07439v3
- Date: Sun, 28 Apr 2024 08:26:11 GMT
- Title: Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models
- Authors: Alireza Mohammadshahi, Jannis Vamvas, Rico Sennrich
- Abstract summary: We revisit ways of pivoting through multiple languages.
We propose MaxEns, a novel combination strategy that biases the output towards the most confident predictions.
On average, multi-pivot strategies still lag behind using English as a single pivot language.
- Score: 47.91306228406407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Massively multilingual machine translation models allow for the translation of a large number of languages with a single model, but have limited performance on low- and very-low-resource translation directions. Pivoting via high-resource languages remains a strong strategy for low-resource directions, and in this paper we revisit ways of pivoting through multiple languages. Previous work has used a simple averaging of probability distributions from multiple paths, but we find that this performs worse than using a single pivot, and exacerbates the hallucination problem because the same hallucinations can be probable across different paths. We also propose MaxEns, a novel combination strategy that makes the output biased towards the most confident predictions, hypothesising that confident predictions are less prone to be hallucinations. We evaluate different strategies on the FLORES benchmark for 20 low-resource language directions, demonstrating that MaxEns improves translation quality for low-resource languages while reducing hallucination in translations, compared to both direct translation and an averaging approach. On average, multi-pivot strategies still lag behind using English as a single pivot language, raising the question of how to identify the best pivoting strategy for a given translation direction.
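To make the two combination strategies in the abstract concrete, here is a minimal numpy sketch of per-step ensembling over pivot paths. It assumes each path's next-token distribution has already been aligned to a shared target vocabulary, and the confidence rule for MaxEns (preferring the path with the highest peak probability at each step) is our assumption, not necessarily the paper's exact formulation.

```python
import numpy as np

def average_ensemble(path_probs: np.ndarray) -> np.ndarray:
    """Simple averaging: mean of the next-token distributions across paths.

    path_probs: shape (num_paths, vocab_size), one distribution per pivot path.
    """
    return path_probs.mean(axis=0)

def maxens(path_probs: np.ndarray) -> np.ndarray:
    """Bias the output towards the most confident prediction: follow the path
    whose distribution has the highest peak probability at this step.
    (Assumed scoring rule; the paper's exact formulation may differ.)"""
    most_confident_path = int(np.argmax(path_probs.max(axis=1)))
    return path_probs[most_confident_path]

# Toy example with two pivot paths over a 4-token vocabulary.
paths = np.array([
    [0.70, 0.10, 0.10, 0.10],  # confident path, favours token 0
    [0.02, 0.68, 0.15, 0.15],  # slightly less confident path, favours token 1
])
print(average_ensemble(paths))  # argmax flips to token 1: the average is pulled over
print(maxens(paths))            # MaxEns keeps the more confident path, token 0 wins
```

The toy example shows the failure mode the abstract describes: a token that is probable on another path can dominate the average, whereas the max-confidence rule lets the single sharpest path decide.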
Related papers
- A Single Model Ensemble Framework for Neural Machine Translation using Pivot Translation [1.3791394805787949]
We present a pivot-based single model ensemble for low-resource language pairs.
In the first step, we generate candidates through pivot translation.
Next, in the aggregation step, we select k high-quality candidates from the generated candidates and merge them to generate a final translation (see the sketch below).
arXiv Detail & Related papers (2025-02-03T09:17:45Z)
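As a rough illustration of the two steps in the entry above, the sketch below generates candidates through pivot translation and then aggregates them. `translate` and `quality_score` are hypothetical stand-ins for real models, and the final "merge" is a minimum-Bayes-risk-style consensus pick used purely for illustration, since the paper's merge procedure is not detailed here.

```python
from difflib import SequenceMatcher

PIVOTS = ["en", "fr", "de"]  # hypothetical pivot languages

def translate(text: str, src: str, tgt: str) -> str:
    """Stand-in for an NMT model call (hypothetical)."""
    return f"<{src}->{tgt}>{text}"

def quality_score(candidate: str) -> float:
    """Stand-in for a quality-estimation model (hypothetical)."""
    return float(len(candidate))

def pivot_ensemble(source: str, src: str, tgt: str, k: int = 2) -> str:
    # Step 1: generate candidates through pivot translation.
    candidates = [translate(translate(source, src, p), p, tgt) for p in PIVOTS]
    # Step 2: keep the k highest-quality candidates.
    top_k = sorted(candidates, key=quality_score, reverse=True)[:k]
    # Step 3 (illustrative merge): pick the candidate most similar to the
    # others, a minimum-Bayes-risk-style consensus choice.
    def consensus(c: str) -> float:
        return sum(SequenceMatcher(None, c, o).ratio() for o in top_k if o != c)
    return max(top_k, key=consensus)

print(pivot_ensemble("saluton mondo", "eo", "pt"))
```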
- Towards Neural No-Resource Language Translation: A Comparative Evaluation of Approaches [0.0]
No-resource languages - those with minimal or no digital representation - pose unique challenges for machine translation (MT).
Unlike low-resource languages, which rely on limited but existent corpora, no-resource languages often have fewer than 100 sentences available for training.
This work explores the problem of no-resource translation through three distinct approaches.
arXiv Detail & Related papers (2024-12-29T21:12:39Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport (see the sketch below).
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
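For intuition about the objective in the entry above, here is a generic numpy sketch of entropy-regularised optimal transport (Sinkhorn iterations) between two sets of latent vectors. This is a standard OT computation, not the paper's training objective; the uniform weights, squared-Euclidean cost, and toy latents are illustrative assumptions.

```python
import numpy as np

def sinkhorn_ot(x: np.ndarray, y: np.ndarray, eps: float = 1.0, iters: int = 200) -> float:
    """Entropy-regularised OT cost between two point clouds with uniform weights."""
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # squared Euclidean cost
    K = np.exp(-cost / eps)                                # Gibbs kernel
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    u = np.ones_like(a)
    for _ in range(iters):                                 # Sinkhorn scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    transport_plan = u[:, None] * K * v[None, :]
    return float((transport_plan * cost).sum())

rng = np.random.default_rng(0)
latents_en = rng.normal(size=(8, 4))        # toy latent vectors for English
latents_xx = rng.normal(size=(8, 4)) + 0.5  # toy latents for a low-resource language
print(sinkhorn_ot(latents_en, latents_xx))  # divergence one could minimise in training
```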
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Distributionally Robust Multilingual Machine Translation [94.51866646879337]
We propose a new learning objective for multilingual neural machine translation (MNMT) based on distributionally robust optimization.
We show how to practically optimize this objective for large translation corpora using an iterated best response scheme (see the sketch below).
Our method consistently outperforms strong baseline methods in terms of average and per-language performance under both many-to-one and one-to-many translation settings.
arXiv Detail & Related papers (2021-09-09T03:48:35Z)
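A minimal sketch of the iterated-best-response idea from the entry above: the adversary re-weights language pairs toward the highest losses, and the model then minimises the weighted objective. The exponentiated-loss update is a common DRO heuristic and an assumption here, not necessarily the paper's exact scheme.

```python
import numpy as np

def adversary_best_response(losses: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Adversary's best response: shift weight toward the worst-performing
    language pairs via an exponentiated-loss update (assumed heuristic)."""
    w = np.exp(losses / temperature)
    return w / w.sum()

# Toy per-language-pair losses for an MNMT model (hypothetical numbers).
pair_losses = np.array([1.2, 0.4, 2.5, 0.9])   # e.g. en-de, en-fr, en-sw, en-zh
weights = adversary_best_response(pair_losses)  # adversary's move
weighted_objective = float(weights @ pair_losses)  # what the model then minimises
print(weights.round(3), weighted_objective)
```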
- Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model (see the sketch below).
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
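One way to picture the single-model pipeline from the entry above is to treat the translation as a latent variable and marginalise the classifier's output over a few sampled translations. `sample_translations` and `classify` below are hypothetical stand-ins for the real components.

```python
import numpy as np

def sample_translations(text: str, k: int) -> list[str]:
    """Stand-in for sampling k candidate English translations (hypothetical)."""
    return [f"translation-{i} of {text}" for i in range(k)]

def classify(text: str) -> np.ndarray:
    """Stand-in for an English classifier returning class probabilities (hypothetical)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    p = rng.random(3)
    return p / p.sum()

def latent_translation_predict(source_text: str, k: int = 4) -> int:
    # Marginalise the class distribution over k latent translations.
    probs = np.mean([classify(t) for t in sample_translations(source_text, k)], axis=0)
    return int(np.argmax(probs))

print(latent_translation_predict("ein Beispieltext"))
```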
- Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation [98.83925811122795]
We propose to improve the sampling procedure by selecting the most informative monolingual sentences to complement the parallel data.
We compute the uncertainty of monolingual sentences using the bilingual dictionary extracted from the parallel data (see the sketch below).
Experimental results on large-scale WMT English⇒German and English⇒Chinese datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2021-06-02T05:01:36Z)
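The last entry ranks monolingual sentences by uncertainty derived from a bilingual dictionary. Below is a simplified sketch of one such computation, using word-level translation entropy; the dictionary, probabilities, and helper names are illustrative, not the paper's implementation.

```python
import math

def word_entropy(translation_probs: dict[str, float]) -> float:
    """Entropy of a word's translation distribution from the bilingual dictionary."""
    return -sum(p * math.log(p) for p in translation_probs.values() if p > 0)

def sentence_uncertainty(sentence: list[str], dictionary: dict[str, dict[str, float]]) -> float:
    """Average translation entropy over a monolingual sentence's words.
    Words missing from the dictionary are skipped in this simplified sketch."""
    entropies = [word_entropy(dictionary[w]) for w in sentence if w in dictionary]
    return sum(entropies) / len(entropies) if entropies else 0.0

# Hypothetical dictionary: source word -> {target word: probability}.
bilingual_dict = {
    "bank": {"Bank": 0.5, "Ufer": 0.5},     # ambiguous word, high entropy
    "house": {"Haus": 0.95, "Heim": 0.05},  # unambiguous word, low entropy
}
corpus = [["bank", "house"], ["house", "house"]]
# Rank sentences by uncertainty; the most informative ones complement the parallel data.
ranked = sorted(corpus, key=lambda s: sentence_uncertainty(s, bilingual_dict), reverse=True)
print(ranked[0])  # the sentence containing the ambiguous word ranks first
```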
This list is automatically generated from the titles and abstracts of the papers on this site.