Adaptive Nearest Neighbor Machine Translation
- URL: http://arxiv.org/abs/2105.13022v1
- Date: Thu, 27 May 2021 09:27:42 GMT
- Title: Adaptive Nearest Neighbor Machine Translation
- Authors: Xin Zheng, Zhirui Zhang, Junliang Guo, Shujian Huang, Boxing Chen,
Weihua Luo and Jiajun Chen
Abstract summary: kNN-MT combines a pre-trained neural machine translation model with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT to dynamically determine the value of k for each target token.
- Score: 60.97183408140499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: kNN-MT, recently proposed by Khandelwal et al. (2020a), successfully combines
a pre-trained neural machine translation (NMT) model with token-level
k-nearest-neighbor (kNN) retrieval to improve translation accuracy.
However, the traditional kNN algorithm used in kNN-MT simply retrieves the same
number of nearest neighbors for each target token, which may cause prediction
errors when the retrieved neighbors include noise. In this paper, we propose
Adaptive kNN-MT to dynamically determine the value of k for each target token.
We achieve this by introducing a light-weight Meta-k Network, which can be
efficiently trained with only a few training samples. On four benchmark machine
translation datasets, we demonstrate that the proposed method is able to
effectively filter out the noises in retrieval results and significantly
outperforms the vanilla kNN-MT model. Even more noteworthy, the Meta-k
Network learned on one domain can be directly applied to other domains and
obtains consistent improvements, illustrating the generality of our method. Our
implementation is open-sourced at https://github.com/zhengxxn/adaptive-knn-mt.
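The abstract describes the Meta-k Network only at a high level. As a concrete illustration, the PyTorch sketch below shows one plausible realization under stated assumptions: the network reads the distances of the K retrieved neighbors, outputs weights over candidate values of k (with k = 0 meaning "trust the NMT model alone"), and the final distribution mixes the per-k kNN distributions with those weights. The feature choice, candidate k set, layer sizes, and temperature are all illustrative, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaKNetwork(nn.Module):
    """Sketch of a lightweight Meta-k Network: map features of the K
    retrieved neighbors to a distribution over candidate values of k,
    where k = 0 means "ignore retrieval and use the NMT model alone"."""

    def __init__(self, num_neighbors=8, hidden=32):
        super().__init__()
        # Candidate k values: 0, then powers of two up to num_neighbors.
        self.k_options = [0] + [2 ** i for i in range(10)
                                if 2 ** i <= num_neighbors]
        self.net = nn.Sequential(
            nn.Linear(num_neighbors, hidden),
            nn.Tanh(),
            nn.Linear(hidden, len(self.k_options)),
        )

    def forward(self, knn_dists):
        # knn_dists: (batch, K) distances of the retrieved neighbors.
        return F.softmax(self.net(knn_dists), dim=-1)

def adaptive_knn_prob(meta_w, knn_dists, knn_tokens, nmt_probs,
                      k_options, temperature=10.0):
    """Mix per-k kNN distributions with the meta weights; the k = 0
    component falls back to the plain NMT distribution.
    knn_tokens: (batch, K) LongTensor of retrieved target-token ids."""
    batch, vocab = nmt_probs.shape
    mixed = meta_w[:, 0:1] * nmt_probs
    for j, k in enumerate(k_options[1:], start=1):
        # Softmax over negative distances of the k closest neighbors,
        # scattered onto the vocabulary as a sparse distribution.
        w = F.softmax(-knn_dists[:, :k] / temperature, dim=-1)
        p_knn = nmt_probs.new_zeros(batch, vocab)
        p_knn.scatter_add_(1, knn_tokens[:, :k], w)
        mixed = mixed + meta_w[:, j:j + 1] * p_knn
    return mixed
```

Because the gate is just a two-layer MLP over a handful of retrieval features, it adds negligible cost per token, which is consistent with the abstract's claims that it trains from few samples and transfers across domains.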
Related papers
- Simply Trainable Nearest Neighbour Machine Translation with GPU Inference [2.3420045370973828]
This paper proposes trainable nearest-neighbor machine translation with GPU inference.
We first adaptively construct a small datastore for each input sentence.
Second, we train a single-layer network to adapt between the kNN-MT and pre-trained NMT outputs, automatically interpolating across different domains.
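The summary leaves the single-layer network unspecified, so the following is a minimal, hypothetical PyTorch reading: a linear gate maps simple retrieval features to a scalar weight that interpolates the kNN and NMT distributions. The feature set is an assumption.

```python
import torch
import torch.nn as nn

class InterpolationGate(nn.Module):
    """Hypothetical single-layer gate: map retrieval features to a
    scalar weight and mix the kNN and NMT distributions."""

    def __init__(self, num_features=2):
        super().__init__()
        self.gate = nn.Linear(num_features, 1)

    def forward(self, features, p_knn, p_nmt):
        # features: (batch, num_features), e.g. top-1 and mean neighbor
        # distance (an assumed feature set).
        lam = torch.sigmoid(self.gate(features))   # (batch, 1) in (0, 1)
        return lam * p_knn + (1.0 - lam) * p_nmt
```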
arXiv Detail & Related papers (2024-07-29T12:55:40Z)
- Towards Faster k-Nearest-Neighbor Machine Translation [56.66038663128903]
k-nearest-neighbor machine translation approaches suffer from heavy retrieval overhead on the entire datastore when decoding each token.
We propose a simple yet effective multi-layer perceptron (MLP) network to predict whether a token should be translated jointly by the NMT model and the probabilities produced by kNN retrieval, or by the NMT model alone.
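As a rough sketch of this gating idea (not the paper's exact design), a small MLP over the decoder's hidden state can decide per token whether to run retrieval at all; the sizes below are assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical per-token gate: decide from the decoder state whether the
# expensive datastore search is worth running; on "skip", the plain NMT
# distribution is used directly.
retrieval_gate = nn.Sequential(
    nn.Linear(1024, 128),   # 1024 = assumed decoder hidden size
    nn.ReLU(),
    nn.Linear(128, 2),      # logits over {skip retrieval, retrieve}
)

hidden_state = torch.randn(1, 1024)   # current decoder state (dummy)
retrieve = retrieval_gate(hidden_state).argmax(dim=-1).item() == 1
```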
arXiv Detail & Related papers (2023-12-12T16:41:29Z)
- INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation [57.952478914459164]
kNN-MT has provided an effective paradigm to smooth the prediction based on neighbor representations during inference.
We propose INK, an effective training framework that directly smooths the representation space by adjusting the representations of kNN neighbors with a small number of new parameters.
Experiments on four benchmark datasets show that the method achieves average gains of 1.99 COMET and 1.0 BLEU, outperforming the state-of-the-art kNN-MT system while using 0.02x the memory space and delivering a 1.9x inference speedup.
arXiv Detail & Related papers (2023-06-10T08:39:16Z)
- Simple and Scalable Nearest Neighbor Machine Translation [11.996135740547897]
$k$NN-MT is a powerful approach for fast domain adaptation.
We propose a simple and scalable nearest neighbor machine translation framework.
Our proposed approach runs at almost 90% of the speed of the plain NMT model without performance degradation.
arXiv Detail & Related papers (2023-02-23T17:28:29Z)
- Towards Robust k-Nearest-Neighbor Machine Translation [72.9252395037097]
k-Nearest-Neighbor Machine Translation (kNN-MT) has become an important research direction in NMT in recent years.
Its main idea is to retrieve useful key-value pairs from an additional datastore to modify translations without updating the NMT model.
However, noisy retrieved pairs can dramatically degrade model performance.
We propose a confidence-enhanced kNN-MT model with robust training to alleviate the impact of noise.
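To make the key-value datastore mechanism described above concrete, here is a minimal retrieval sketch using FAISS. It shows only the generic kNN-MT lookup step, not this paper's confidence-enhanced robust training; all sizes and the random arrays are placeholders.

```python
import numpy as np
import faiss  # similarity-search library commonly used for kNN-MT datastores

d = 1024                                            # assumed hidden size
# Stand-ins for the real datastore: keys are decoder hidden states from
# a parallel corpus, values are the target tokens that followed them.
keys = np.random.rand(10000, d).astype("float32")
values = np.random.randint(0, 32000, size=10000)    # token ids

index = faiss.IndexFlatL2(d)   # exact L2 search over the keys
index.add(keys)

query = np.random.rand(1, d).astype("float32")      # current decoder state
dists, idx = index.search(query, 8)                 # 8 nearest neighbors
neighbor_tokens = values[idx[0]]   # retrieved values: candidate target tokens
```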
arXiv Detail & Related papers (2022-10-17T07:43:39Z)
- Nearest Neighbor Zero-Shot Inference [68.56747574377215]
kNN-Prompt is a technique that uses k-nearest-neighbor (kNN) retrieval augmentation for zero-shot inference with language models (LMs).
Fuzzy verbalizers leverage the sparse kNN distribution for downstream tasks by automatically associating each classification label with a set of natural language tokens.
Experiments show that kNN-Prompt is effective for domain adaptation with no further training, and that the benefits of retrieval increase with the size of the model used for kNN retrieval.
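A toy sketch of the fuzzy-verbalizer idea as summarized: each label owns a set of surface tokens, and its score sums the kNN-augmented next-token probabilities over that set. The label-to-token map below is invented for illustration; real fuzzy verbalizers build these sets automatically.

```python
# Hypothetical label-to-token map for a sentiment task.
verbalizers = {
    "positive": ["good", "great", "positive"],
    "negative": ["bad", "terrible", "negative"],
}

def score_labels(token_probs, verbalizers):
    """token_probs: token -> probability under the kNN-augmented LM
    (sparse, so missing tokens default to 0)."""
    return {
        label: sum(token_probs.get(tok, 0.0) for tok in tokens)
        for label, tokens in verbalizers.items()
    }

# e.g. score_labels({"good": 0.4, "bad": 0.1}, verbalizers)
# -> {"positive": 0.4, "negative": 0.1}
```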
arXiv Detail & Related papers (2022-05-27T07:00:59Z)
- DNNR: Differential Nearest Neighbors Regression [8.667550264279166]
K-nearest neighbors (KNN) is one of the earliest and most established algorithms in machine learning.
For regression tasks, KNN averages the targets within a neighborhood, which poses a number of challenges.
We propose Differential Nearest Neighbors Regression (DNNR), which addresses these challenges simultaneously.
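The core move suggested by the name is to correct each neighbor's target with a locally estimated gradient (a first-order Taylor term) before averaging. The NumPy sketch below is one simple realization; the neighborhood sizes and fitting recipe are assumptions rather than the paper's exact algorithm.

```python
import numpy as np

def dnnr_predict(X, y, x_query, k=5):
    """Predict y at x_query from training data (X, y).

    For each of the query's k nearest neighbors, estimate a local
    gradient by least squares over that neighbor's own neighborhood,
    then average the first-order-corrected targets.
    """
    # Indices of the query's k nearest neighbors.
    order = np.argsort(np.linalg.norm(X - x_query, axis=1))[:k]
    preds = []
    for i in order:
        # The neighbor's own k nearest points, excluding itself.
        nbrs = np.argsort(np.linalg.norm(X - X[i], axis=1))[1:k + 1]
        A = X[nbrs] - X[i]            # (k, d) offsets around X[i]
        b = y[nbrs] - y[i]            # (k,) target differences
        grad, *_ = np.linalg.lstsq(A, b, rcond=None)
        # First-order Taylor correction toward the query point.
        preds.append(y[i] + grad @ (x_query - X[i]))
    return float(np.mean(preds))

# Example: on a noiseless linear function the correction is exact.
X = np.random.rand(200, 3)
y = X @ np.array([1.0, -2.0, 0.5])
print(dnnr_predict(X, y, np.array([0.3, 0.6, 0.1])))  # ~ -0.85
```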
arXiv Detail & Related papers (2022-05-17T15:22:53Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between the test image and its top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
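A compact sketch of this two-step recipe: features come from a frozen pre-trained backbone (omitted here), and classification is a distance-weighted vote among the k nearest training features. The exponential weighting is an illustrative choice.

```python
import numpy as np

def knn_classify(train_feats, train_labels, test_feat, k=10):
    """Distance-weighted k-NN vote over pre-extracted features.

    train_feats: (n, d) array from a frozen pre-trained backbone.
    train_labels: (n,) integer class labels.
    """
    dists = np.linalg.norm(train_feats - test_feat, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest])   # closer neighbors vote harder
    scores = {}
    for label, w in zip(train_labels[nearest], weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)
```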
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- KNN Classification with One-step Computation [10.381276986079865]
A one-step computation is proposed to replace the lazy part of KNN classification.
The proposed approach is experimentally evaluated, and the results demonstrate that one-step KNN classification is efficient and promising.
arXiv Detail & Related papers (2020-12-09T13:34:42Z)