Nearest Neighbor Machine Translation
- URL: http://arxiv.org/abs/2010.00710v2
- Date: Thu, 22 Jul 2021 14:44:51 GMT
- Title: Nearest Neighbor Machine Translation
- Authors: Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
- Abstract summary: We introduce $k$-nearest-neighbor machine translation ($k$NN-MT).
It predicts tokens with a nearest neighbor classifier over a large datastore of cached examples.
It consistently improves performance across many settings.
- Score: 113.96357168879548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce $k$-nearest-neighbor machine translation ($k$NN-MT), which
predicts tokens with a nearest neighbor classifier over a large datastore of
cached examples, using representations from a neural translation model for
similarity search. This approach requires no additional training and scales to
give the decoder direct access to billions of examples at test time, resulting
in a highly expressive model that consistently improves performance across many
settings. Simply adding nearest neighbor search improves a state-of-the-art
German-English translation model by 1.5 BLEU. $k$NN-MT allows a single model to
be adapted to diverse domains by using a domain-specific datastore, improving
results by an average of 9.2 BLEU over zero-shot transfer, and achieving new
state-of-the-art results -- without training on these domains. A massively
multilingual model can also be specialized for particular language pairs, with
improvements of 3 BLEU for translating from English into German and Chinese.
Qualitatively, $k$NN-MT is easily interpretable; it combines source and target
context to retrieve highly relevant examples.
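To make the mechanism concrete, here is a minimal sketch of the prediction rule described above: retrieve the $k$ cached (decoder state, next token) pairs closest to the current decoder state, convert their distances into a distribution over the vocabulary, and interpolate with the base model's distribution. The brute-force scan and the `temperature` and `lam` values are illustrative assumptions; the paper searches an approximate FAISS index over up to billions of entries.

```python
import numpy as np

def knn_distribution(query, keys, values, vocab_size, k=8, temperature=10.0):
    """p_kNN: softmax over negative distances to the k nearest cached
    (decoder state, next token) pairs, aggregated per target token."""
    dists = np.linalg.norm(keys - query, axis=1)   # distance to every cached state
    nearest = np.argsort(dists)[:k]                # indices of the k closest entries
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        p_knn[values[idx]] += w                    # one token may appear in several neighbors
    return p_knn

def knn_mt_predict(p_model, p_knn, lam=0.5):
    """Final next-token distribution: interpolate p_kNN with the NMT model."""
    return lam * p_knn + (1.0 - lam) * p_model
```

Because nothing is retrained, adapting the model amounts to swapping the `keys`/`values` arrays for a domain-specific datastore, which is how the domain-adaptation gains above are obtained.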
Related papers
- Simple and Scalable Nearest Neighbor Machine Translation [11.996135740547897]
$k$NN-MT is a powerful approach for fast domain adaptation.
We propose a simple and scalable nearest neighbor machine translation framework.
Our proposed approach achieves almost 90% of the speed of the NMT model without performance degradation.
arXiv Detail & Related papers (2023-02-23T17:28:29Z)
- Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neural Machine Translation [48.58899349349702]
Nearest Neighbor Machine Translation (kNN-MT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest neighbor retrieval mechanism.
In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT.
arXiv Detail & Related papers (2022-12-17T08:34:20Z)
- Chunk-based Nearest Neighbor Machine Translation [7.747003493657217]
We introduce a chunk-based $k$NN-MT model which retrieves chunks of tokens from the datastore, instead of a single token.
Experiments on machine translation in two settings, static domain adaptation and "on-the-fly" adaptation, show that the chunk-based model leads to a significant speed-up (up to 4 times) with only a small drop in translation quality.
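A hedged sketch of that retrieval change, with illustrative names only: each datastore value is a short chunk of consecutive target tokens rather than a single token, so one search can inform several decoding steps and the number of expensive nearest-neighbor queries drops accordingly.

```python
import numpy as np

def retrieve_chunks(query, keys, value_chunks, k=4):
    """Chunk-based variant: values are short runs of target tokens, not
    single tokens, so one query can guide several decoding steps."""
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return [value_chunks[i] for i in nearest]  # e.g. k chunks of 4 tokens each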
arXiv Detail & Related papers (2022-05-24T17:39:25Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
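One plausible reading of that construction, sketched with a hypothetical `decoder_states` helper that is not from the paper: teacher-force the decoder over in-domain target-language sentences and cache (hidden state, next token) pairs, so the datastore needs no parallel in-domain data.

```python
def build_monolingual_datastore(target_sentences, decoder_states):
    """Hypothetical sketch: `decoder_states(sent)` is assumed to return the
    decoder hidden state preceding each token when teacher-forcing `sent`.
    Keys are those states; values are the tokens they should predict."""
    keys, values = [], []
    for sent in target_sentences:        # in-domain, target language only
        for state, token in zip(decoder_states(sent), sent):
            keys.append(state)
            values.append(token)
    return keys, values
```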
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
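A rough sketch of such a noising objective, under stated simplifications: words are locally shuffled and some are swapped for other real words, so the input still looks like a full sentence rather than a masked one. The random-vocabulary replacement below merely stands in for the context-based substitutes the paper uses.

```python
import random

def noise_sentence(tokens, vocab, reorder_window=3, replace_prob=0.15, rng=None):
    """Produce a full-looking (unmasked) noisy input for the encoder."""
    rng = rng or random.Random(0)
    # local reordering: each token moves at most `reorder_window` positions
    order = sorted(range(len(tokens)), key=lambda i: i + rng.uniform(0, reorder_window))
    shuffled = [tokens[i] for i in order]
    # replacement: a random vocabulary word stands in for a context-based choice
    return [rng.choice(vocab) if rng.random() < replace_prob else t for t in shuffled]
```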
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across a diverse setting, including low-, medium-, and rich-resource languages, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
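A minimal sketch of that idea under stated assumptions: during training, back-translate each example's target side into a randomly chosen other language using the model itself, creating a synthetic pair for a direction that may never occur in the training data. `translate` stands for the model's own inference call and is an assumption here.

```python
import random

def random_online_backtranslation(batch, languages, translate, rng=None):
    """batch: (src, tgt, tgt_lang) triples; translate: the model's own
    inference call. Returns synthetic pairs for otherwise-unseen directions."""
    rng = rng or random.Random(0)
    synthetic = []
    for src, tgt, tgt_lang in batch:
        pivot = rng.choice([l for l in languages if l != tgt_lang])
        pivot_sent = translate(tgt, src_lang=tgt_lang, tgt_lang=pivot)  # model output
        synthetic.append((pivot_sent, tgt, tgt_lang))  # train pivot -> tgt_lang
    return synthetic
```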
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.