Simulated Multiple Reference Training Improves Low-Resource Machine
Translation
- URL: http://arxiv.org/abs/2004.14524v2
- Date: Tue, 13 Oct 2020 15:43:57 GMT
- Authors: Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn
- Abstract summary: We introduce Simulated Multiple Reference Training (SMRT), a novel MT training method that approximates the full space of possible translations.
We demonstrate the effectiveness of SMRT in low-resource settings when translating to English, with improvements of 1.2 to 7.0 BLEU.
- Score: 22.404646693366054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many valid translations exist for a given sentence, yet machine translation
(MT) is trained with a single reference translation, exacerbating data sparsity
in low-resource settings. We introduce Simulated Multiple Reference Training
(SMRT), a novel MT training method that approximates the full space of possible
translations by sampling a paraphrase of the reference sentence from a
paraphraser and training the MT model to predict the paraphraser's distribution
over possible tokens. We demonstrate the effectiveness of SMRT in low-resource
settings when translating to English, with improvements of 1.2 to 7.0 BLEU. We
also find SMRT is complementary to back-translation.
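To make the training objective described in the abstract concrete, below is a minimal sketch of an SMRT-style update in PyTorch. It assumes the paraphraser and the MT model share one target vocabulary and expose a teacher-forced interface `model(conditioning_ids, prev_target_ids) -> logits`; the sampling helper, interface names, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def sample_paraphrase(paraphraser, ref_ids, bos_id, max_len=64, temperature=1.0):
    """Sample one paraphrase per reference sentence, token by token.

    Assumed interface: paraphraser(ref_ids, prev_ids) -> logits of shape
    (batch, prev_len, vocab); only the last position is used at each step.
    """
    batch = ref_ids.size(0)
    out = torch.full((batch, 1), bos_id, dtype=torch.long, device=ref_ids.device)
    for _ in range(max_len):
        next_logits = paraphraser(ref_ids, out)[:, -1, :] / temperature
        next_tok = torch.multinomial(F.softmax(next_logits, dim=-1), num_samples=1)
        out = torch.cat([out, next_tok], dim=1)
    return out


def smrt_loss(mt_model, paraphraser, src_ids, ref_ids, bos_id):
    """SMRT-style loss: teacher-force the MT model on a sampled paraphrase of
    the reference and train it to match the paraphraser's token distribution."""
    with torch.no_grad():
        para = sample_paraphrase(paraphraser, ref_ids, bos_id)
        # Soft targets: the paraphraser's distribution at every position of
        # its own sampled paraphrase.
        teacher_probs = F.softmax(paraphraser(ref_ids, para[:, :-1]), dim=-1)

    # The MT model is conditioned on the source sentence, teacher-forced on the
    # same sampled paraphrase prefix, and pulled toward the soft targets.
    student_log_probs = F.log_softmax(mt_model(src_ids, para[:, :-1]), dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
```

Training against the paraphraser's full distribution, rather than a one-hot reference, is what lets a single reference stand in for the larger space of acceptable translations.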
Related papers
- IntGrad MT: Eliciting LLMs' Machine Translation Capabilities with Sentence Interpolation and Gradual MT [5.323504404265276]
Large Language Models (LLMs) have demonstrated strong performance in translation without needing to be fine-tuned on additional parallel corpora, yet they still underperform on certain language pairs, particularly low-resource ones.
Previous works have focused on mitigating this issue by leveraging relevant few-shot examples or external resources such as dictionaries or grammar books.
We propose a novel method named IntGrad MT that focuses on fully exploiting an LLM's inherent translation capability.
arXiv Detail & Related papers (2024-10-15T15:26:28Z)
- Choose the Final Translation from NMT and LLM hypotheses Using MBR Decoding: HW-TSC's Submission to the WMT24 General MT Shared Task [9.819139035652137]
This paper presents the submission of Huawei Translate Services Center (HW-TSC) to the WMT24 general machine translation (MT) shared task.
We use training strategies such as regularized dropout, bidirectional training, data diversification, forward translation, back translation, alternated training, curriculum learning, and transductive ensemble learning to train the neural machine translation (NMT) model.
The final translation is then chosen from the NMT and LLM hypotheses with Minimum Bayes Risk (MBR) decoding (a generic MBR sketch follows this list).
arXiv Detail & Related papers (2024-09-23T08:25:37Z)
- Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
- Boosting Unsupervised Machine Translation with Pseudo-Parallel Data [2.900810893770134]
We propose a training strategy that relies on pseudo-parallel sentence pairs mined from monolingual corpora and synthetic sentence pairs back-translated from monolingual corpora.
We reach an improvement of up to 14.5 BLEU points (English to Ukrainian) over a baseline trained on back-translated data only.
arXiv Detail & Related papers (2023-10-22T10:57:12Z)
- Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation [91.57514888410205]
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting.
LLMs can struggle to translate inputs with rare words, which are common in low-resource or domain-transfer scenarios.
We show that LLM prompting can provide an effective solution for rare words as well, by using prior knowledge from bilingual dictionaries to provide control hints in the prompts (a hypothetical prompt-construction sketch follows this list).
arXiv Detail & Related papers (2023-02-15T18:46:42Z)
- BitextEdit: Automatic Bitext Editing for Improved Low-Resource Machine Translation [53.55009917938002]
We propose to refine the mined bitexts via automatic editing.
Experiments demonstrate that our approach successfully improves the quality of CCMatrix mined bitext for 5 low-resource language-pairs and 10 translation directions by up to 8 BLEU points.
arXiv Detail & Related papers (2021-11-12T16:00:39Z)
- Self-supervised and Supervised Joint Training for Resource-rich Machine Translation [30.502625878505732]
Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT).
We propose a joint training approach, $F^2$-XEnDec, that combines self-supervised and supervised learning to optimize NMT models.
arXiv Detail & Related papers (2021-06-08T02:35:40Z)
- Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation [53.22775597051498]
We present a continual pre-training framework on mBART to effectively adapt it to unseen languages.
Results show that our method can consistently improve the fine-tuning performance upon the mBART baseline.
Our approach also boosts the performance on translation pairs where both languages are seen in the original mBART's pre-training.
arXiv Detail & Related papers (2021-05-09T14:49:07Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
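The HW-TSC WMT24 entry above selects its final output from NMT and LLM hypotheses with Minimum Bayes Risk (MBR) decoding. Below is a generic MBR sketch, assuming sentence-level BLEU (via sacrebleu) as the utility function; the submission's actual utility metric and candidate pool are not specified here, so treat this purely as an illustration of the selection rule.

```python
from typing import Callable, List, Optional

import sacrebleu


def mbr_select(candidates: List[str],
               utility: Optional[Callable[[str, str], float]] = None) -> str:
    """Return the candidate with the highest average utility when every other
    candidate is treated as a pseudo-reference (generic MBR decoding)."""
    if utility is None:
        def utility(hyp: str, ref: str) -> float:
            # Sentence-level BLEU as a stand-in utility; an assumption here.
            return sacrebleu.sentence_bleu(hyp, [ref]).score

    best_hyp, best_score = candidates[0], float("-inf")
    for i, hyp in enumerate(candidates):
        others = [ref for j, ref in enumerate(candidates) if j != i]
        score = sum(utility(hyp, ref) for ref in others) / max(len(others), 1)
        if score > best_score:
            best_hyp, best_score = hyp, score
    return best_hyp


# Toy usage: pool hypotheses from different systems and keep the consensus one.
print(mbr_select([
    "The cat sat on the mat.",
    "The cat is sitting on the mat.",
    "A cat sat on a mat.",
]))
```

With a symmetric utility, this selection favors the hypothesis closest to the consensus of the candidate pool.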
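The dictionary-based prompting entry above supplies control hints from a bilingual dictionary for rare source words. The sketch below shows one hypothetical way to assemble such a prompt; the template, hint wording, word-matching rule, and the German-English toy dictionary are all assumptions rather than the paper's exact design.

```python
from typing import Dict, List


def build_dict_prompt(src_sentence: str,
                      dictionary: Dict[str, List[str]],
                      src_lang: str = "German",
                      tgt_lang: str = "English") -> str:
    """Build a translation prompt with dictionary hints for any source word
    found in the bilingual dictionary (hypothetical template)."""
    hints = []
    for token in src_sentence.split():
        key = token.strip(".,;:!?\"'").lower()
        if key in dictionary:
            hints.append(f'- "{key}" may be translated as: {", ".join(dictionary[key])}')
    hint_block = "\n".join(hints) if hints else "- (no dictionary entries found)"
    return (
        f"Translate the following {src_lang} sentence into {tgt_lang}.\n"
        f"Dictionary hints:\n{hint_block}\n"
        f"{src_lang}: {src_sentence}\n"
        f"{tgt_lang}:"
    )


# Toy usage with a two-entry dictionary for rare words.
toy_dict = {"zaunkoenig": ["wren"], "gartenschere": ["pruning shears", "secateurs"]}
print(build_dict_prompt("Der Zaunkoenig sass auf der Gartenschere.", toy_dict))
```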
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and is not responsible for any consequences of its use.