Angular Dispersion Accelerates $k$-Nearest Neighbors Machine Translation
- URL: http://arxiv.org/abs/2509.16729v1
- Date: Sat, 20 Sep 2025 15:48:22 GMT
- Title: Angular Dispersion Accelerates $k$-Nearest Neighbors Machine Translation
- Authors: Evgeniia Tokarchuk, Sergey Troshin, Vlad Niculae,
- Abstract summary: Augmenting neural machine translation with external memory at decoding time is a well-established strategy for increasing translation performance.<n>$k$-NN MT retrieves a set of tokens that occurred in the most similar contexts recorded in a prepared data store.<n>Since an exhaustive search is not feasible in large data stores, practitioners commonly use approximate $k$-NN MT lookup.<n>We show that improving dispersion leads to better balance in the retrieval data structures, accelerating retrieval and slightly improving translations.
- Score: 12.155276471887559
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Augmenting neural machine translation with external memory at decoding time, in the form of k-nearest neighbors machine translation ($k$-NN MT), is a well-established strategy for increasing translation performance. $k$-NN MT retrieves a set of tokens that occurred in the most similar contexts recorded in a prepared data store, using hidden state representations of translation contexts as vector lookup keys. One of the main disadvantages of this method is the high computational cost and memory requirements. Since an exhaustive search is not feasible in large data stores, practitioners commonly use approximate $k$-NN MT lookup, yet even such algorithms are a bottleneck. In contrast to research directions seeking to accelerate $k$-NN MT by reducing data store size or the number of lookup calls, we pursue an orthogonal direction based on the performance properties of approximate $k$-NN MT lookup data structures. In particular, we propose to encourage angular dispersion of the neural hidden representations of contexts. We show that improving dispersion leads to better balance in the retrieval data structures, accelerating retrieval and slightly improving translations.
Related papers
- Towards Faster k-Nearest-Neighbor Machine Translation [51.866464707284635]
k-nearest-neighbor machine translation approaches suffer from heavy retrieve overhead on the entire datastore when decoding each token.<n>We propose a simple yet effective multi-layer perceptron (MLP) network to predict whether a token should be translated jointly by the neural machine translation model and probabilities produced by the kNN.<n>Our method significantly reduces the overhead of kNN retrievals by up to 53% at the expense of a slight decline in translation quality.
arXiv Detail & Related papers (2023-12-12T16:41:29Z) - Bridging the Domain Gaps in Context Representations for k-Nearest
Neighbor Neural Machine Translation [57.49095610777317]
$k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.
We propose a novel approach to boost the datastore retrieval of $k$NN-MT by reconstructing the original datastore.
Our method can effectively boost the datastore retrieval and translation quality of $k$NN-MT.
arXiv Detail & Related papers (2023-05-26T03:04:42Z) - Simple and Scalable Nearest Neighbor Machine Translation [11.996135740547897]
$k$NN-MT is a powerful approach for fast domain adaptation.
We propose a simple and scalable nearest neighbor machine translation framework.
Our proposed approach achieves almost 90% speed as the NMT model without performance degradation.
arXiv Detail & Related papers (2023-02-23T17:28:29Z) - Better Datastore, Better Translation: Generating Datastores from
Pre-Trained Models for Nearest Neural Machine Translation [48.58899349349702]
Nearest Neighbor Machine Translation (kNNMT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest neighbor retrieval mechanism.
In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT.
arXiv Detail & Related papers (2022-12-17T08:34:20Z) - Chunk-based Nearest Neighbor Machine Translation [7.747003493657217]
We introduce a textitchunk-based $k$NN-MT model which retrieves chunks of tokens from the datastore, instead of a single token.
Experiments on machine translation in two settings, static domain adaptation and on-the-fly'' adaptation, show that the chunk-based model leads to a significant speed-up (up to 4 times) with only a small drop in translation quality.
arXiv Detail & Related papers (2022-05-24T17:39:25Z) - Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [65.69742565855395]
k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT)
arXiv Detail & Related papers (2022-04-13T05:46:31Z) - Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT)
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z) - Nearest Neighbor Machine Translation [113.96357168879548]
We introduce $k$-nearest-neighbor machine translation ($k$NN-MT)
It predicts tokens with a nearest neighbor classifier over a large datastore of cached examples.
It consistently improves performance across many settings.
arXiv Detail & Related papers (2020-10-01T22:24:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.