Decoding Time Lexical Domain Adaptation for Neural Machine Translation
- URL: http://arxiv.org/abs/2101.00421v1
- Date: Sat, 2 Jan 2021 11:06:15 GMT
- Title: Decoding Time Lexical Domain Adaptation for Neural Machine Translation
- Authors: Nikolay Bogoychev and Pinzhen Chen
- Abstract summary: Machine translation systems are vulnerable to domain mismatch, especially when the task is low-resource.
We present two simple methods for improving translation quality in this particular setting.
- Score: 7.628949147902029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine translation systems are vulnerable to domain mismatch, especially
when the task is low-resource. In this setting, out-of-domain translations are
often of poor quality and prone to hallucinations, because the translation model
prefers to predict common words it has seen during training over the more
uncommon ones from a different domain. We present two simple methods for
improving translation quality in this setting: First, we use lexical
shortlisting to restrict the neural network's predictions to tokens supported by
IBM-model-computed alignments. Second, we reorder the $n$-best list by
reranking all translations based on how much they overlap with each other.
Our methods are computationally simpler and faster than alternative approaches,
and show moderate success in low-resource settings with explicit out-of-domain
test sets. However, our methods lose their effectiveness when the domain
mismatch is too great, or in high-resource settings.
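The lexical shortlisting idea can be made concrete with a small sketch. The snippet below is a minimal illustration rather than the authors' implementation: it masks a decoder's output logits so that only target tokens aligned to the source sentence (plus a set of always-allowed tokens) remain predictable. The `align_table` format and all names are hypothetical assumptions.

```python
import torch

def shortlist_logits(logits, source_ids, align_table, always_allowed):
    """Mask a 1-D vector of decoder logits so only shortlisted tokens survive.

    align_table: hypothetical dict mapping a source token id to the target
                 token ids that IBM alignment models associate with it.
    always_allowed: token ids that stay unmasked regardless of the source,
                    e.g. EOS, punctuation, and the most frequent target tokens.
    """
    allowed = set(always_allowed)
    for s in source_ids:
        allowed.update(align_table.get(s, ()))
    mask = torch.full_like(logits, float("-inf"))
    mask[list(allowed)] = 0.0
    # After softmax, tokens outside the shortlist receive ~zero probability.
    return logits + mask
```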
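The second method, $n$-best reordering, can likewise be sketched. This is a consensus-style reranking under the assumption that overlap is measured with clipped $n$-gram counts; the paper's exact overlap measure may differ, and all names here are illustrative.

```python
from collections import Counter

def ngram_counts(tokens, n=2):
    """Count the n-grams of a tokenized hypothesis."""
    return Counter(zip(*(tokens[i:] for i in range(n))))

def overlap(hyp_a, hyp_b, n=2):
    """Clipped n-gram overlap of hyp_a against hyp_b, normalized by hyp_a's length."""
    ca, cb = ngram_counts(hyp_a, n), ngram_counts(hyp_b, n)
    shared = sum((ca & cb).values())  # intersection takes the min count per n-gram
    return shared / max(sum(ca.values()), 1)

def rerank_by_consensus(nbest):
    """Reorder an n-best list so hypotheses that agree most with the rest come first."""
    scored = [(sum(overlap(h, o) for o in nbest if o is not h), h) for h in nbest]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [h for _, h in scored]
```

The intuition: a hypothesis that hallucinates out-of-domain content tends to share few $n$-grams with its competitors, so consensus reranking pushes it down the list.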
Related papers
- Plug, Play, and Fuse: Zero-Shot Joint Decoding via Word-Level Re-ranking Across Diverse Vocabularies [12.843274390224853]
Real-world tasks, like multimodal translation, often require a combination of strengths from different models, such as handling both translation and image processing.
We propose a novel zero-shot ensembling strategy that allows for the integration of different models during the decoding phase without the need for additional training.
Our approach re-ranks beams during decoding by combining scores at the word level, predicting when a word is completed.
arXiv Detail & Related papers (2024-08-21T04:20:55Z) - Non-Parametric Domain Adaptation for End-to-End Speech Translation [72.37869362559212]
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
arXiv Detail & Related papers (2022-05-23T11:41:02Z) - DEEP: DEnoising Entity Pre-training for Neural Machine Translation [123.6686940355937]
It has been shown that machine translation models usually generate poor translations for named entities that are infrequent in the training corpus.
We propose DEEP, a DEnoising Entity Pre-training method that leverages large amounts of monolingual data and a knowledge base to improve named entity translation accuracy within sentences.
arXiv Detail & Related papers (2021-11-14T17:28:09Z) - BitextEdit: Automatic Bitext Editing for Improved Low-Resource Machine Translation [53.55009917938002]
We propose to refine the mined bitexts via automatic editing.
Experiments demonstrate that our approach successfully improves the quality of CCMatrix mined bitext for 5 low-resource language-pairs and 10 translation directions by up to 8 BLEU points.
arXiv Detail & Related papers (2021-11-12T16:00:39Z) - Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown promising capability by directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval (a minimal sketch of the datastore idea appears after this list).
arXiv Detail & Related papers (2021-09-14T11:50:01Z) - Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z) - Phrase-level Active Learning for Neural Machine Translation [107.28450614074002]
We propose an active learning setting where we can spend a given budget on translating in-domain data.
We select both full sentences and individual phrases from unlabelled data in the new domain for routing to human translators.
In a German-English translation task, our active learning approach achieves consistent improvements over uncertainty-based sentence selection methods.
arXiv Detail & Related papers (2021-06-21T19:20:42Z) - Improving Lexically Constrained Neural Machine Translation with Source-Conditioned Masked Span Prediction [6.46964825569749]
In this paper, we tackle a more challenging setup consisting of domain-specific corpora with much longer n-grams and highly specialized terms.
To encourage span-level representations in generation, we additionally impose a source-sentence conditioned masked span prediction loss in the decoder.
Experimental results on three domain-specific corpora in two language pairs demonstrate that the proposed training scheme can improve the performance of existing lexically constrained methods.
arXiv Detail & Related papers (2021-05-12T08:11:33Z) - Rapid Domain Adaptation for Machine Translation with Monolingual Data [31.70276147485463]
One challenge of machine translation is how to quickly adapt to unseen domains in the face of surging events like COVID-19.
In this paper, we propose an approach that enables rapid domain adaptation from the perspective of unsupervised translation.
arXiv Detail & Related papers (2020-10-23T20:31:37Z)
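As referenced in the Non-Parametric Unsupervised Domain Adaptation entry above, the $k$NN-MT datastore maps decoder hidden states to the target tokens that followed them. Below is a minimal retrieval sketch, assuming precomputed keys and values and brute-force search (a real system would use an approximate index such as FAISS); the class name and parameters are illustrative, not that paper's API.

```python
import numpy as np

class KNNDatastore:
    """Token-level datastore: decoder states (keys) -> next-token ids (values)."""

    def __init__(self, keys, values):
        self.keys = keys      # (N, d) float array of decoder hidden states
        self.values = values  # (N,)  int array of the tokens that followed them

    def retrieve(self, query, vocab_size, k=8, temperature=10.0):
        """Return a kNN distribution over the vocabulary for one decoder state."""
        dists = np.linalg.norm(self.keys - query, axis=1)  # brute-force L2 search
        idx = np.argpartition(dists, k)[:k]                # indices of the k nearest keys
        weights = np.exp(-dists[idx] / temperature)
        probs = np.zeros(vocab_size)
        np.add.at(probs, self.values[idx], weights / weights.sum())
        # In kNN-MT this distribution is interpolated with the NMT model's softmax.
        return probs
```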