Decoding Time Lexical Domain Adaptation for Neural Machine Translation
- URL: http://arxiv.org/abs/2101.00421v1
- Date: Sat, 2 Jan 2021 11:06:15 GMT
- Title: Decoding Time Lexical Domain Adaptation for Neural Machine Translation
- Authors: Nikolay Bogoychev and Pinzhen Chen
- Abstract summary: Machine translation systems are vulnerable to domain mismatch, especially when the task is low-resource.
We present two simple methods for improving translation quality in this particular setting.
- Score: 7.628949147902029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine translation systems are vulnerable to domain mismatch, especially
when the task is low-resource. In this setting, out-of-domain translations are
often of poor quality and prone to hallucinations, because the translation model
prefers to predict common words it has seen during training over the more
uncommon ones from a different domain. We present two simple methods for
improving translation quality in this setting: First, we use lexical
shortlisting to restrict the neural network's predictions to tokens supported by
IBM-model-computed alignments. Second, we reorder the $n$-best list by
reranking all translations based on how much they overlap with each other.
Our methods are computationally simpler and faster than alternative approaches,
and show moderate success in low-resource settings with explicit out-of-domain
test sets. However, our methods lose their effectiveness when the domain
mismatch is too great, or in high-resource settings.
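The lexical shortlisting idea can be made concrete with a small sketch. The snippet below is a minimal illustration rather than the authors' implementation: it masks a decoder's output logits so that only target tokens aligned to the source sentence (plus a set of always-allowed tokens) remain predictable. The `align_table` format and all names are hypothetical assumptions.

```python
import torch

def shortlist_logits(logits, source_ids, align_table, always_allowed):
    """Mask a 1-D vector of decoder logits so only shortlisted tokens survive.

    align_table: hypothetical dict mapping a source token id to the target
                 token ids that IBM alignment models associate with it.
    always_allowed: token ids that stay unmasked regardless of the source,
                    e.g. EOS, punctuation, and the most frequent target tokens.
    """
    allowed = set(always_allowed)
    for s in source_ids:
        allowed.update(align_table.get(s, ()))
    mask = torch.full_like(logits, float("-inf"))
    mask[list(allowed)] = 0.0
    # After softmax, tokens outside the shortlist receive ~zero probability.
    return logits + mask
```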
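The second method, $n$-best reordering, can likewise be sketched. This is a consensus-style reranking under the assumption that overlap is measured with clipped $n$-gram counts; the paper's exact overlap measure may differ, and all names here are illustrative.

```python
from collections import Counter

def ngram_counts(tokens, n=2):
    """Count the n-grams of a tokenized hypothesis."""
    return Counter(zip(*(tokens[i:] for i in range(n))))

def overlap(hyp_a, hyp_b, n=2):
    """Clipped n-gram overlap of hyp_a against hyp_b, normalized by hyp_a's length."""
    ca, cb = ngram_counts(hyp_a, n), ngram_counts(hyp_b, n)
    shared = sum((ca & cb).values())  # intersection takes the min count per n-gram
    return shared / max(sum(ca.values()), 1)

def rerank_by_consensus(nbest):
    """Reorder an n-best list so hypotheses that agree most with the rest come first."""
    scored = [(sum(overlap(h, o) for o in nbest if o is not h), h) for h in nbest]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [h for _, h in scored]
```

The intuition: a hypothesis that hallucinates out-of-domain content tends to share few $n$-grams with its competitors, so consensus reranking pushes it down the list.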
Related papers
- Plug, Play, and Fuse: Zero-Shot Joint Decoding via Word-Level Re-ranking Across Diverse Vocabularies [12.843274390224853]
Real-world tasks, like multimodal translation, often require a combination of strengths from different models, such as handling both translation and image processing.
We propose a novel zero-shot ensembling strategy that allows for the integration of different models during the decoding phase without the need for additional training.
Our approach re-ranks beams during decoding by combining scores at the word level, predicting when a word is completed.
arXiv Detail & Related papers (2024-08-21T04:20:55Z) - Non-Parametric Domain Adaptation for End-to-End Speech Translation [72.37869362559212]
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
arXiv Detail & Related papers (2022-05-23T11:41:02Z) - DEEP: DEnoising Entity Pre-training for Neural Machine Translation [123.6686940355937]
It has been shown that machine translation models usually generate poor translations for named entities that are infrequent in the training corpus.
We propose DEEP, a DEnoising Entity Pre-training method that leverages large amounts of monolingual data and a knowledge base to improve named entity translation accuracy within sentences.
arXiv Detail & Related papers (2021-11-14T17:28:09Z) - BitextEdit: Automatic Bitext Editing for Improved Low-Resource Machine Translation [53.55009917938002]
We propose to refine the mined bitexts via automatic editing.
Experiments demonstrate that our approach successfully improves the quality of CCMatrix mined bitext for 5 low-resource language-pairs and 10 translation directions by up to 8 BLEU points.
arXiv Detail & Related papers (2021-11-12T16:00:39Z) - Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown promising capability by directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval (a minimal sketch of the datastore idea appears after this list).
arXiv Detail & Related papers (2021-09-14T11:50:01Z) - Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z) - Phrase-level Active Learning for Neural Machine Translation [107.28450614074002]
We propose an active learning setting where we can spend a given budget on translating in-domain data.
We select both full sentences and individual phrases from unlabelled data in the new domain for routing to human translators.
In a German-English translation task, our active learning approach achieves consistent improvements over uncertainty-based sentence selection methods.
arXiv Detail & Related papers (2021-06-21T19:20:42Z) - Improving Lexically Constrained Neural Machine Translation with Source-Conditioned Masked Span Prediction [6.46964825569749]
In this paper, we tackle a more challenging setup consisting of domain-specific corpora with much longer n-grams and highly specialized terms.
To encourage span-level representations in generation, we additionally impose a source-sentence conditioned masked span prediction loss in the decoder.
Experimental results on three domain-specific corpora in two language pairs demonstrate that the proposed training scheme can improve the performance of existing lexically constrained methods.
arXiv Detail & Related papers (2021-05-12T08:11:33Z) - Rapid Domain Adaptation for Machine Translation with Monolingual Data [31.70276147485463]
One challenge of machine translation is how to quickly adapt to unseen domains in the face of surging events like COVID-19.
In this paper, we propose an approach that enables rapid domain adaptation from the perspective of unsupervised translation.
arXiv Detail & Related papers (2020-10-23T20:31:37Z)
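As referenced in the Non-Parametric Unsupervised Domain Adaptation entry above, the $k$NN-MT datastore maps decoder hidden states to the target tokens that followed them. Below is a minimal retrieval sketch, assuming precomputed keys and values and brute-force search (a real system would use an approximate index such as FAISS); the class name and parameters are illustrative, not that paper's API.

```python
import numpy as np

class KNNDatastore:
    """Token-level datastore: decoder states (keys) -> next-token ids (values)."""

    def __init__(self, keys, values):
        self.keys = keys      # (N, d) float array of decoder hidden states
        self.values = values  # (N,)  int array of the tokens that followed them

    def retrieve(self, query, vocab_size, k=8, temperature=10.0):
        """Return a kNN distribution over the vocabulary for one decoder state."""
        dists = np.linalg.norm(self.keys - query, axis=1)  # brute-force L2 search
        idx = np.argpartition(dists, k)[:k]                # indices of the k nearest keys
        weights = np.exp(-dists[idx] / temperature)
        probs = np.zeros(vocab_size)
        np.add.at(probs, self.values[idx], weights / weights.sum())
        # In kNN-MT this distribution is interpolated with the NMT model's softmax.
        return probs
```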