Efficient Cluster-Based k-Nearest-Neighbor Machine Translation
- URL: http://arxiv.org/abs/2204.06175v1
- Date: Wed, 13 Apr 2022 05:46:31 GMT
- Title: Efficient Cluster-Based k-Nearest-Neighbor Machine Translation
- Authors: Dexin Wang, Kai Fan, Boxing Chen and Deyi Xiong
- Abstract summary: k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
- Score: 65.69742565855395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as
a non-parametric solution for domain adaptation in neural machine translation
(NMT). It aims to alleviate the performance degradation of advanced MT systems
in translating out-of-domain sentences by coordinating with an additional
token-level feature-based retrieval module constructed from in-domain data.
Previous studies have already demonstrated that non-parametric NMT is even
superior to models fine-tuned on out-of-domain data. Despite this success,
kNN retrieval comes at the cost of high latency, in particular for large
datastores. To make it practical, in this paper we explore a more efficient
kNN-MT and propose using clustering to improve retrieval efficiency.
Concretely, we first propose a cluster-based Compact Network for feature
reduction, trained in a contrastive learning manner, that compresses context
features into vectors with over 90% lower dimensionality. We then suggest a
cluster-based pruning solution that filters out 10%-40% of redundant nodes in
large datastores while retaining translation quality. Our proposed methods
achieve better or comparable performance while reducing inference latency by
up to 57% compared with the advanced non-parametric MT model on several
machine translation benchmarks.
Experimental results indicate that the proposed methods maintain the most
useful information of the original datastore and the Compact Network shows good
generalization on unseen domains.
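For illustration, here is a minimal numpy sketch of the decoding-time interpolation that kNN-MT performs, with a random linear projection standing in for the contrastively trained Compact Network; all names, sizes, the temperature, and the mixing weight are illustrative assumptions, not the paper's configuration.

```python
# Toy kNN-MT step with a compact, low-dimensional datastore.
import numpy as np

rng = np.random.default_rng(0)

VOCAB, D_FULL, D_COMPACT = 100, 1024, 64   # 64/1024 = a >90% dimension reduction
K, LAMBDA, TEMP = 8, 0.5, 10.0             # neighbors, mixing weight, softmax temp

# Datastore: one (key, value) pair per target token in the in-domain data,
# keyed by the decoder's context feature, valued by the next-token id.
n_entries = 5000
full_keys = rng.normal(size=(n_entries, D_FULL)).astype(np.float32)
values = rng.integers(0, VOCAB, size=n_entries)

# Feature reduction: only the projected keys are stored, shrinking memory
# and per-query distance cost alike. (Stand-in for the Compact Network.)
projection = (rng.normal(size=(D_FULL, D_COMPACT)) / np.sqrt(D_FULL)).astype(np.float32)
keys = full_keys @ projection

def knn_mt_step(context_feature, nmt_probs):
    """Mix the NMT distribution with a kNN distribution over retrieved values."""
    query = context_feature.astype(np.float32) @ projection  # compress the query too
    dists = np.sum((keys - query) ** 2, axis=1)              # squared L2 distances
    nn = np.argpartition(dists, K)[:K]                       # indices of K nearest keys
    weights = np.exp(-dists[nn] / TEMP)
    weights /= weights.sum()
    knn_probs = np.zeros(VOCAB, dtype=np.float32)
    np.add.at(knn_probs, values[nn], weights)                # one vote per retrieved token
    return LAMBDA * knn_probs + (1.0 - LAMBDA) * nmt_probs

# Toy usage: a random context feature against a uniform NMT distribution.
mixed = knn_mt_step(rng.normal(size=D_FULL), np.full(VOCAB, 1.0 / VOCAB))
print(mixed.argmax(), round(float(mixed.sum()), 3))          # top token id, ~1.0
```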
Related papers
- Towards Faster k-Nearest-Neighbor Machine Translation [56.66038663128903]
k-nearest-neighbor machine translation approaches suffer from heavy retrieval overhead on the entire datastore when decoding each token.
We propose a simple yet effective multi-layer perceptron (MLP) network to predict whether a token should be translated jointly by the neural machine translation model and the probabilities produced by kNN retrieval.
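A hedged sketch of that gating idea follows: a tiny MLP scores each decoding step, and the costly retrieval is skipped when the score is low. The input features, layer sizes, and threshold are illustrative assumptions, not the paper's trained configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy weights for a 2-layer MLP over 4 step-level features
# (e.g. the NMT distribution's entropy and its max probability).
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)
W2, b2 = rng.normal(size=16), 0.0

def gate(step_features):
    """Estimated probability that kNN retrieval helps at this step."""
    h = np.maximum(step_features @ W1 + b1, 0.0)     # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))      # sigmoid output

def decode_step(nmt_probs, step_features, run_knn, lam=0.5, threshold=0.5):
    if gate(step_features) < threshold:
        return nmt_probs                              # skip the datastore lookup
    return lam * run_knn() + (1.0 - lam) * nmt_probs  # usual kNN-MT mixing

# Toy usage: gate a uniform distribution with random step features.
uniform = np.full(10, 0.1)
print(decode_step(uniform, rng.normal(size=4), run_knn=lambda: uniform))
```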
arXiv Detail & Related papers (2023-12-12T16:41:29Z)
- Bridging the Domain Gaps in Context Representations for k-Nearest Neighbor Neural Machine Translation [57.49095610777317]
$k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.
We propose a novel approach to boost the datastore retrieval of $k$NN-MT by reconstructing the original datastore.
Our method can effectively boost the datastore retrieval and translation quality of $k$NN-MT.
arXiv Detail & Related papers (2023-05-26T03:04:42Z)
- Nearest Neighbor Machine Translation is Meta-Optimizer on Output Projection Layer [44.02848852485475]
Nearest Neighbor Machine Translation ($k$NN-MT) has achieved great success in domain adaptation tasks.
We comprehensively analyze $k$NN-MT through theoretical and empirical studies.
arXiv Detail & Related papers (2023-05-22T13:38:53Z)
- Simple and Scalable Nearest Neighbor Machine Translation [11.996135740547897]
$k$NN-MT is a powerful approach for fast domain adaptation.
We propose a simple and scalable nearest neighbor machine translation framework.
Our proposed approach achieves almost 90% of the vanilla NMT model's decoding speed without performance degradation.
arXiv Detail & Related papers (2023-02-23T17:28:29Z)
- Towards Robust k-Nearest-Neighbor Machine Translation [72.9252395037097]
k-Nearest-Neighbor Machine Translation (kNN-MT) has become an important research direction in NMT in recent years.
Its main idea is to retrieve useful key-value pairs from an additional datastore to modify translations without updating the NMT model.
However, noisy retrieved pairs can dramatically deteriorate model performance.
We propose a confidence-enhanced kNN-MT model with robust training to alleviate the impact of noise.
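As a minimal sketch of the noise-robustness intuition (not the paper's trained confidence model), the kNN weight below decays with the mean neighbor distance, so far-away, likely-noisy retrievals fall back to the NMT distribution.

```python
import numpy as np

def confidence_mix(nmt_probs, knn_probs, nn_dists, temp=10.0, max_lam=0.5):
    """Interpolate NMT and kNN distributions with a distance-based weight."""
    retrieval_conf = float(np.exp(-np.mean(nn_dists) / temp))  # in (0, 1]
    lam = max_lam * retrieval_conf       # distant (noisy) neighbors -> small lam
    return lam * knn_probs + (1.0 - lam) * nmt_probs

# Close neighbors keep the kNN vote; distant ones are nearly ignored.
nmt, knn = np.array([0.8, 0.2]), np.array([0.1, 0.9])
print(confidence_mix(nmt, knn, nn_dists=np.array([1.0, 2.0])))
print(confidence_mix(nmt, knn, nn_dists=np.array([80.0, 90.0])))
```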
arXiv Detail & Related papers (2022-10-17T07:43:39Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly combining the pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
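A hedged sketch of the datastore-from-monolingual-text idea: every prefix of a target-language sentence yields one (context feature, next token) entry. `encode_prefix` below is a hypothetical stand-in; the actual framework derives these features from the pre-trained NMT model.

```python
import numpy as np

D = 64  # feature dimensionality (illustrative)

def encode_prefix(prefix_tokens):
    """Hypothetical context encoder: a deterministic toy feature per prefix."""
    rng = np.random.default_rng(abs(hash(tuple(prefix_tokens))) % (2**32))
    return rng.normal(size=D).astype(np.float32)

def build_datastore(monolingual_sentences):
    keys, values = [], []
    for sentence in monolingual_sentences:            # token-id sequences
        for t in range(len(sentence)):
            keys.append(encode_prefix(sentence[:t]))  # feature of the prefix
            values.append(sentence[t])                # the gold next token
    return np.stack(keys), np.array(values)

keys, values = build_datastore([[5, 9, 2], [7, 2]])
print(keys.shape, values)  # (5, 64) [5 9 2 7 2]
```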
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm simply retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT to dynamically determine the number of neighbors k for each target token.
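The sketch below illustrates the per-token choice of k with a simple distance-ratio heuristic; Adaptive kNN-MT instead learns this decision with a light meta-network over the retrieval results, so treat the rule here as an illustrative assumption.

```python
import numpy as np

def choose_k(sorted_dists, candidates=(1, 2, 4, 8), ratio=2.0):
    """Pick the largest candidate k whose k-th neighbor is still 'close'."""
    k_star = 0                         # 0 means: ignore retrieval for this token
    for k in candidates:
        if sorted_dists[k - 1] <= ratio * sorted_dists[0]:
            k_star = k                 # k-th neighbor is within ratio of the nearest
    return k_star                      # distances are sorted, so larger k only fail later

print(choose_k(np.array([1.0, 1.5, 1.9, 2.1, 5.0, 6.0, 7.0, 9.0])))  # -> 2
```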
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
- Variational Auto Encoder Gradient Clustering [0.0]
Clustering using deep neural network models has been extensively studied in recent years.
This article investigates how probability function gradient ascent can be used to process data in order to achieve better clustering.
We propose a simple yet effective method for investigating a suitable number of clusters for the data, based on the DBSCAN clustering algorithm.
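Since DBSCAN discovers dense groups without a preset cluster count, a quick parameter sweep like the sketch below can suggest a suitable number of clusters; the toy data and eps grid are assumptions, not the article's exact procedure.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)
# Toy data: three Gaussian blobs in 2-D.
data = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 2)) for c in (0, 3, 6)])

for eps in (0.2, 0.4, 0.8):
    labels = DBSCAN(eps=eps, min_samples=5).fit(data).labels_
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # label -1 is noise
    print(f"eps={eps}: {n_clusters} clusters")
```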
arXiv Detail & Related papers (2021-05-11T08:00:36Z)