In-context Examples Selection for Machine Translation
- URL: http://arxiv.org/abs/2212.02437v1
- Date: Mon, 5 Dec 2022 17:25:15 GMT
- Title: In-context Examples Selection for Machine Translation
- Authors: Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan
Ghazvininejad
- Abstract summary: Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning.
For Machine Translation (MT), these examples are typically randomly sampled from the development dataset, which has a distribution similar to that of the evaluation set.
We show that the translation quality and the domain of the in-context examples matter, and that a single noisy, unrelated 1-shot example can have a catastrophic impact on output quality.
- Score: 101.50473468507697
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale generative models show an impressive ability to perform a wide
range of Natural Language Processing (NLP) tasks using in-context learning,
where a few examples are used to describe a task to the model. For Machine
Translation (MT), these examples are typically randomly sampled from the
development dataset with a similar distribution as the evaluation set. However,
it is unclear how the choice of these in-context examples and their ordering
impacts the output translation quality. In this work, we aim to understand the
properties of good in-context examples for MT in both in-domain and
out-of-domain settings. We show that the translation quality and the domain of
the in-context examples matter and that a single noisy, unrelated 1-shot example can have
a catastrophic impact on output quality. While concatenating multiple random
examples reduces the effect of noise, a single good prompt optimized to
maximize translation quality on the development dataset can elicit learned
information from the pre-trained language model. Adding similar examples based
on an n-gram overlap with the test source significantly and consistently
improves the translation quality of the outputs, outperforming a strong kNN-MT
baseline in 2 out of 4 out-of-domain datasets.
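The abstract's key recipe, retrieving demonstrations whose source side overlaps with the test source in n-grams and prepending them to the prompt, can be sketched as follows. This is a minimal, hypothetical illustration: the overlap measure, the prompt template, and all names are assumptions for exposition, not the authors' released code.

```python
from collections import Counter

def ngrams(tokens, n):
    # Multiset of n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def overlap_score(test_src, cand_src, max_n=4):
    # Recall-oriented n-gram overlap between a candidate source and the test source,
    # averaged over n-gram orders 1..max_n (an assumed stand-in for the paper's measure).
    test_toks, cand_toks = test_src.split(), cand_src.split()
    score = 0.0
    for n in range(1, max_n + 1):
        test_ng, cand_ng = ngrams(test_toks, n), ngrams(cand_toks, n)
        if not test_ng:
            continue
        matched = sum(min(count, cand_ng[g]) for g, count in test_ng.items())
        score += matched / sum(test_ng.values())
    return score / max_n

def select_examples(test_src, pool, k=4):
    # Pick the k (source, target) pairs whose source side overlaps most with the test source.
    return sorted(pool, key=lambda ex: overlap_score(test_src, ex[0]), reverse=True)[:k]

def build_prompt(examples, test_src):
    # Concatenate the retrieved demonstrations, then append the test source
    # (the "src = tgt" template is illustrative, not the paper's exact format).
    lines = [f"{src} = {tgt}" for src, tgt in examples]
    lines.append(f"{test_src} =")
    return "\n".join(lines)

if __name__ == "__main__":
    pool = [
        ("The patient was given 5 mg of the drug.",
         "Dem Patienten wurden 5 mg des Medikaments verabreicht."),
        ("The weather is nice today.",
         "Das Wetter ist heute schön."),
    ]
    test_src = "The patient received 10 mg of the drug daily."
    print(build_prompt(select_examples(test_src, pool, k=1), test_src))
```

In practice the completion of the final prompt line would be requested from the language model; the point is only that demonstrations are chosen by source-side similarity rather than sampled at random.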
Related papers
- Effective Self-Mining of In-Context Examples for Unsupervised Machine Translation with LLMs [16.98133269527045]
We propose an unsupervised approach to mine in-context examples for machine translation (MT).
We introduce a filtering criterion to select the optimal in-context examples from a pool of unsupervised parallel sentences.
Our findings demonstrate the effectiveness of our unsupervised approach in mining in-context examples for MT.
arXiv Detail & Related papers (2024-10-14T18:47:04Z)
- Context-Aware Machine Translation with Source Coreference Explanation [26.336947440529713]
We propose a model that explains the decisions made for translation by predicting coreference features in the input.
We evaluate our method in the WMT document-level translation task of English-German dataset, the English-Russian dataset, and the multilingual TED talk dataset.
arXiv Detail & Related papers (2024-04-30T12:41:00Z)
- CTQScorer: Combining Multiple Features for In-context Example Selection for Machine Translation [22.700587969696933]
We learn a regression model, CTQ Scorer, that selects examples based on multiple features in order to maximize the translation quality.
On multiple language pairs and language models, we show that CTQ Scorer helps significantly outperform random selection.
We also see an improvement of over 2.5 COMET points on average with respect to a strong BM25 retrieval-based baseline.
arXiv Detail & Related papers (2023-05-23T14:26:17Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT yields substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Finding Support Examples for In-Context Learning [73.90376920653507]
We propose LENS, a fiLter-thEN-Search method to tackle this challenge in two stages.
First we filter the dataset to obtain informative in-context examples individually.
Then we propose a diversity-guided example search that iteratively refines and evaluates the selected example permutations; a rough sketch of this filter-then-search idea appears after this list.
arXiv Detail & Related papers (2023-02-27T06:32:45Z)
- Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation [2.990760778216954]
Influence functions (IF) have been shown to be effective in finding relevant training examples for classification tasks.
We propose two effective extensions to a state-of-the-art influence function and demonstrate them on the sub-problem of copied training examples.
arXiv Detail & Related papers (2022-10-24T14:22:20Z)
- Domain-Specific Text Generation for Machine Translation [7.803471587734353]
We propose a novel approach to domain adaptation leveraging state-of-the-art pretrained language models (LMs) for domain-specific data augmentation.
We employ mixed fine-tuning to train models that significantly improve translation of in-domain texts.
arXiv Detail & Related papers (2022-08-11T16:22:16Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean-data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on a real-world (noisy) corpus but also enhances robustness, producing high-quality results in a noisy environment.
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
- Informative Sample Mining Network for Multi-Domain Image-to-Image Translation [101.01649070998532]
We show that improving the sample selection strategy is an effective solution for image-to-image translation tasks.
We propose a novel multi-stage sample training scheme to reduce sample hardness while preserving sample informativeness.
arXiv Detail & Related papers (2020-01-05T05:48:02Z)
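To make the filter-then-search idea from the LENS entry above more concrete, here is a rough sketch in the same spirit. The informativeness score, the similarity measure, the validation metric, and the simple hill-climbing search are all placeholder assumptions supplied by the caller; this is not the authors' algorithm.

```python
import random

def filter_informative(pool, score_fn, keep=50):
    # Stage 1: keep the individually most "informative" candidates,
    # where score_fn is a caller-supplied (hypothetical) per-example estimate.
    return sorted(pool, key=score_fn, reverse=True)[:keep]

def diversity(examples, sim_fn):
    # Average pairwise dissimilarity of the selected examples.
    if len(examples) < 2:
        return 0.0
    pairs = [(a, b) for i, a in enumerate(examples) for b in examples[i + 1:]]
    return sum(1.0 - sim_fn(a, b) for a, b in pairs) / len(pairs)

def search_permutation(candidates, eval_fn, sim_fn, k=4, iters=200, seed=0):
    # Stage 2: iteratively refine a k-example permutation, trading off a
    # validation score (eval_fn, e.g. dev-set quality of the prompted model)
    # against the diversity of the selected examples.
    rng = random.Random(seed)
    best = rng.sample(candidates, k)
    best_score = eval_fn(best) + diversity(best, sim_fn)
    for _ in range(iters):
        cand = list(best)
        cand[rng.randrange(k)] = rng.choice(candidates)  # swap one example
        rng.shuffle(cand)                                 # also explore orderings
        score = eval_fn(cand) + diversity(cand, sim_fn)
        if score > best_score:
            best, best_score = cand, score
    return best
```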