Chunk-based Nearest Neighbor Machine Translation
- URL: http://arxiv.org/abs/2205.12230v1
- Date: Tue, 24 May 2022 17:39:25 GMT
- Title: Chunk-based Nearest Neighbor Machine Translation
- Authors: Pedro Henrique Martins and Zita Marinho and Andr\'e F. T. Martins
- Abstract summary: We introduce a \textit{chunk-based} $k$NN-MT model which retrieves chunks of tokens from the datastore, instead of a single token.
Experiments on machine translation in two settings, static domain adaptation and ``on-the-fly'' adaptation, show that the chunk-based model leads to a significant speed-up (up to 4 times) with only a small drop in translation quality.
- Score: 7.747003493657217
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semi-parametric models, which augment generation with retrieval, have led to
impressive results in language modeling and machine translation, due to their
ability to leverage information retrieved from a datastore of examples. One of
the most prominent approaches, $k$NN-MT, has an outstanding performance on
domain adaptation by retrieving tokens from a domain-specific datastore
\citep{khandelwal2020nearest}. However, $k$NN-MT requires retrieval for every
single generated token, leading to a very low decoding speed (around 8 times
slower than a parametric model). In this paper, we introduce a
\textit{chunk-based} $k$NN-MT model which retrieves chunks of tokens from the
datastore, instead of a single token. We propose several strategies for
incorporating the retrieved chunks into the generation process, and for
selecting the steps at which the model needs to search for neighbors in the
datastore. Experiments on machine translation in two settings, static domain
adaptation and ``on-the-fly'' adaptation, show that the chunk-based $k$NN-MT
model leads to a significant speed-up (up to 4 times) with only a small drop in
translation quality.
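
To make the retrieval scheme concrete, here is a minimal sketch of chunk-based retrieval during decoding, using only NumPy and a toy datastore: the datastore is queried only when the previously retrieved chunk has been consumed, and the retrieved token is interpolated with the base model's next-token distribution. All names, shapes, the chunk length, and the interpolation weight are illustrative assumptions, not the paper's implementation.

```python
# Minimal, runnable sketch of the chunk-based retrieval idea: instead of querying
# the datastore at every decoding step, retrieve a chunk of tokens once and reuse
# it for the next few steps, interpolating with the base model's distribution.
# All names (toy_model_probs, retrieve_chunk, LAMBDA, CHUNK_LEN, ...) are
# illustrative assumptions, not the paper's implementation.
import numpy as np

VOCAB = 50          # toy vocabulary size
CHUNK_LEN = 4       # tokens stored per datastore entry (assumption)
LAMBDA = 0.5        # interpolation weight between retrieval and model (assumption)

rng = np.random.default_rng(0)

# Toy datastore: each entry maps a key (decoder-state vector) to a chunk of target tokens.
keys = rng.normal(size=(100, 8))                        # 100 entries, 8-dim keys
chunks = rng.integers(0, VOCAB, size=(100, CHUNK_LEN))  # stored continuation chunks

def toy_model_probs(state):
    """Stand-in for the parametric NMT model's next-token distribution."""
    logits = rng.normal(size=VOCAB)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def retrieve_chunk(state, k=4):
    """Return the chunk of the most strongly weighted of the k nearest keys
    (softmax over negative L2 distances)."""
    d = np.linalg.norm(keys - state, axis=1)
    nn = np.argsort(d)[:k]
    w = np.exp(-d[nn])
    w /= w.sum()
    return list(chunks[nn[np.argmax(w)]])

def decode(max_len=12):
    output, state = [], rng.normal(size=8)
    pending = []                          # tokens left from the last retrieved chunk
    for _ in range(max_len):
        if not pending:                   # search the datastore only when the chunk is used up
            pending = retrieve_chunk(state)
        p_model = toy_model_probs(state)
        p_retr = np.zeros(VOCAB)
        p_retr[pending.pop(0)] = 1.0      # put retrieval mass on the next chunk token
        p = LAMBDA * p_retr + (1 - LAMBDA) * p_model
        output.append(int(np.argmax(p)))
        state = rng.normal(size=8)        # stand-in for the decoder's next hidden state
    return output

print(decode())
```

Because the datastore search happens only once per chunk rather than once per token, this loop illustrates where the reported speed-up over token-level $k$NN-MT comes from.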
Related papers
- Simple and Scalable Nearest Neighbor Machine Translation [11.996135740547897]
$k$NN-MT is a powerful approach for fast domain adaptation.
We propose a simple and scalable nearest neighbor machine translation framework.
Our proposed approach achieves almost 90% of the speed of the NMT model without performance degradation.
arXiv Detail & Related papers (2023-02-23T17:28:29Z)
- N-Gram Nearest Neighbor Machine Translation [101.25243884801183]
We propose a novel $n$-gram nearest neighbor retrieval method that is model agnostic and applicable to both Autoregressive Translation (AT) and Non-Autoregressive Translation (NAT) models.
We demonstrate that the proposed method consistently outperforms the token-level method on both AT and NAT models, on general-domain as well as on domain-adaptation translation tasks.
arXiv Detail & Related papers (2023-01-30T13:19:19Z)
- Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neural Machine Translation [48.58899349349702]
Nearest Neighbor Machine Translation (kNN-MT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest neighbor retrieval mechanism.
In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT.
arXiv Detail & Related papers (2022-12-17T08:34:20Z)
- Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval [129.25914272977542]
RetoMaton is a weighted finite automaton built on top of the datastore.
Traversing this automaton at inference time, in parallel to the LM inference, reduces the language model's perplexity.
arXiv Detail & Related papers (2022-01-28T21:38:56Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Nearest Neighbor Machine Translation [113.96357168879548]
We introduce $k$-nearest-neighbor machine translation ($k$NN-MT).
It predicts tokens with a nearest neighbor classifier over a large datastore of cached examples.
It consistently improves performance across many settings.
arXiv Detail & Related papers (2020-10-01T22:24:46Z)
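
For comparison with the chunk-based variant above, here is a minimal sketch of the token-level $k$NN-MT mechanism described in this last entry: the datastore maps decoder hidden states to the target tokens that followed them, the retrieval distribution is a softmax over negative neighbor distances aggregated per token, and the final distribution is $p(y) = \lambda\, p_{\mathrm{kNN}}(y) + (1-\lambda)\, p_{\mathrm{MT}}(y)$. The hyperparameters and array sizes below are toy values chosen for illustration, not tuned settings.

```python
# Minimal sketch of token-level kNN-MT: a datastore of (decoder state -> target token)
# pairs is searched at every decoding step, and the retrieval distribution is
# interpolated with the parametric model's distribution.
# Shapes, temperature, and lambda are illustrative assumptions.
import numpy as np

VOCAB, DIM, K, TEMP, LAM = 50, 8, 4, 10.0, 0.5
rng = np.random.default_rng(1)

# Datastore built offline by running the model over parallel data:
# key = decoder state before emitting a token, value = that target token.
ds_keys = rng.normal(size=(1000, DIM))
ds_vals = rng.integers(0, VOCAB, size=1000)

def knn_distribution(state):
    """Softmax over negative distances of the K nearest neighbors,
    with mass aggregated per target token."""
    d = np.linalg.norm(ds_keys - state, axis=1)
    nn = np.argsort(d)[:K]
    w = np.exp(-d[nn] / TEMP)
    p = np.zeros(VOCAB)
    np.add.at(p, ds_vals[nn], w)   # sum weights of neighbors sharing a token
    return p / p.sum()

def interpolate(p_model, state, lam=LAM):
    """kNN-MT next-token distribution: lam * p_kNN + (1 - lam) * p_MT."""
    return lam * knn_distribution(state) + (1 - lam) * p_model

# Usage: one decoding step with a toy model distribution and decoder state.
p_model = np.full(VOCAB, 1.0 / VOCAB)
state = rng.normal(size=DIM)
print(int(np.argmax(interpolate(p_model, state))))
```

Since this lookup runs at every generated token, it is the step whose cost the chunk-based model amortizes over several tokens.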
This list is automatically generated from the titles and abstracts of the papers on this site.