Towards Faster k-Nearest-Neighbor Machine Translation
- URL: http://arxiv.org/abs/2312.07419v1
- Date: Tue, 12 Dec 2023 16:41:29 GMT
- Title: Towards Faster k-Nearest-Neighbor Machine Translation
- Authors: Xiangyu Shi, Yunlong Liang, Jinan Xu, Yufeng Chen
- Abstract summary: k-nearest-neighbor machine translation approaches suffer from heavy retrieval overhead over the entire datastore when decoding each token.
We propose a simple yet effective multi-layer perceptron (MLP) network to predict whether a token should be translated jointly by the neural machine translation model and the probabilities produced by the kNN retrieval, or by the neural model alone.
- Score: 56.66038663128903
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent works have proven the effectiveness of k-nearest-neighbor machine
translation (a.k.a. kNN-MT) approaches in producing remarkable improvements in
cross-domain translation. However, these models suffer from heavy retrieval
overhead over the entire datastore when decoding each token. We observe that
during the decoding phase, about 67% to 84% of tokens are unvaried after
searching the corpus datastore, which means most tokens cause futile
retrievals and introduce unnecessary computational costs by initiating
k-nearest-neighbor searches. We consider this phenomenon explainable in
linguistic terms and propose a simple yet effective multi-layer perceptron (MLP)
network to predict whether a token should be translated jointly by the neural
machine translation model and the probabilities produced by the kNN retrieval,
or by the neural model alone. The results show that our method succeeds in
reducing redundant retrieval operations and cuts the overhead of kNN retrievals
by up to 53% at the expense of a slight decline in translation quality.
Moreover, our method can work together with all existing kNN-MT systems.
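Below is a minimal sketch (not the authors' released code) of the gating idea described in the abstract: a small MLP reads the decoder's hidden state at each step and decides whether a kNN search over the datastore is worth running; when it is not, the step falls back to the plain NMT distribution. Names such as RetrievalGate and knn_search, the hidden sizes, the threshold, and the fixed interpolation weight are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RetrievalGate(nn.Module):
    """Per-token binary classifier: 1 = run kNN retrieval, 0 = skip it."""

    def __init__(self, hidden_dim: int, inner_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, inner_dim),
            nn.ReLU(),
            nn.Linear(inner_dim, 1),
        )

    def forward(self, decoder_state: torch.Tensor) -> torch.Tensor:
        # decoder_state: (batch, hidden_dim) hidden state for the current step
        return torch.sigmoid(self.mlp(decoder_state)).squeeze(-1)


def step_probabilities(decoder_state, nmt_probs, gate, knn_search,
                       lam=0.5, threshold=0.5):
    """Mix in kNN probabilities only for tokens the gate selects."""
    use_knn = gate(decoder_state) > threshold            # (batch,) bool mask
    if not use_knn.any():
        return nmt_probs                                  # no datastore search at all
    knn_probs = knn_search(decoder_state)                 # hypothetical helper, (batch, vocab)
    mixed = lam * knn_probs + (1.0 - lam) * nmt_probs     # fixed-weight interpolation
    return torch.where(use_knn.unsqueeze(-1), mixed, nmt_probs)
```

In practice only the rows the gate selects would be sent to the datastore; the sketch queries the whole batch for brevity. One natural way to obtain training labels for such a gate is to mark which tokens actually change after retrieval, which is what the 67% to 84% observation above measures.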
Related papers
- Bridging the Domain Gaps in Context Representations for k-Nearest Neighbor Neural Machine Translation [57.49095610777317]
$k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.
We propose a novel approach to boost the datastore retrieval of $k$NN-MT by reconstructing the original datastore.
Our method can effectively boost the datastore retrieval and translation quality of $k$NN-MT.
arXiv Detail & Related papers (2023-05-26T03:04:42Z) - Simple and Scalable Nearest Neighbor Machine Translation [11.996135740547897]
$k$NN-MT is a powerful approach for fast domain adaptation.
We propose a simple and scalable nearest neighbor machine translation framework.
Our proposed approach achieves almost 90% of the speed of the NMT model without performance degradation.
arXiv Detail & Related papers (2023-02-23T17:28:29Z) - Learning Decoupled Retrieval Representation for Nearest Neighbour Neural
Machine Translation [16.558519886325623]
kNN-MT successfully incorporates an external corpus by retrieving word-level representations at test time.
In this work, we highlight that coupling the representations of these two tasks is sub-optimal for fine-grained retrieval.
We leverage supervised contrastive learning to learn the distinctive retrieval representation derived from the original context representation.
arXiv Detail & Related papers (2022-09-19T03:19:38Z) - Nearest Neighbor Zero-Shot Inference [68.56747574377215]
kNN-Prompt is a technique that uses k-nearest-neighbor (kNN) retrieval augmentation for zero-shot inference with language models (LMs).
Fuzzy verbalizers leverage the sparse kNN distribution for downstream tasks by automatically associating each classification label with a set of natural language tokens.
Experiments show that kNN-Prompt is effective for domain adaptation with no further training, and that the benefits of retrieval increase with the size of the model used for kNN retrieval.
arXiv Detail & Related papers (2022-05-27T07:00:59Z) - Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [65.69742565855395]
k-Nearest-Neighbor Machine Translation (kNN-MT) has recently been proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
arXiv Detail & Related papers (2022-04-13T05:46:31Z) - Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z) - Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines a pre-trained neural machine translation model with token-level k-nearest-neighbor retrieval (a minimal sketch of this interpolation follows this list).
The traditional kNN algorithm simply retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT to dynamically determine the value of k for each target token.
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.