Towards Robust k-Nearest-Neighbor Machine Translation
- URL: http://arxiv.org/abs/2210.08808v1
- Date: Mon, 17 Oct 2022 07:43:39 GMT
- Title: Towards Robust k-Nearest-Neighbor Machine Translation
- Authors: Hui Jiang, Ziyao Lu, Fandong Meng, Chulun Zhou, Jie Zhou, Degen Huang
and Jinsong Su
- Abstract summary: k-Nearest-Neighbor Machine Translation (kNN-MT) has become an important research direction in NMT in recent years.
Its main idea is to retrieve useful key-value pairs from an additional datastore to modify translations without updating the NMT model.
However, noisy pairs among the retrieved results can dramatically deteriorate model performance.
We propose a confidence-enhanced kNN-MT model with robust training to alleviate the impact of noise.
- Score: 72.9252395037097
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: k-Nearest-Neighbor Machine Translation (kNN-MT) has become an important
research direction in NMT in recent years. Its main idea is to retrieve useful key-value
pairs from an additional datastore to modify translations without updating the
NMT model. However, noisy pairs among the retrieved results can dramatically
deteriorate model performance. In this paper, we conduct a preliminary
study and find that this problem results from not fully exploiting the
prediction of the NMT model. To alleviate the impact of noise, we propose a
confidence-enhanced kNN-MT model with robust training. Concretely, we introduce
the NMT confidence to refine the modeling of two important components of
kNN-MT: the kNN distribution and the interpolation weight. Meanwhile, we inject two
types of perturbations into the retrieved pairs for robust training.
Experimental results on four benchmark datasets demonstrate that our model not
only achieves significant improvements over current kNN-MT models, but also
exhibits better robustness. Our code is available at
https://github.com/DeepLearnXMU/Robust-knn-mt.
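
As background, below is a minimal NumPy sketch of the vanilla kNN-MT interpolation step that this paper builds on. The function name, the fixed interpolation weight lam, and the softmax temperature are illustrative assumptions, not the paper's confidence-enhanced formulation.

import numpy as np

def knn_mt_interpolate(nmt_probs, neighbor_values, neighbor_dists,
                       vocab_size, temperature=10.0, lam=0.5):
    # Minimal sketch of the vanilla kNN-MT combination at one decoding step.
    # nmt_probs:       (vocab_size,) output distribution of the NMT model
    # neighbor_values: (k,) target-token ids of the retrieved key-value pairs
    # neighbor_dists:  (k,) distances between the decoder query and the keys
    # Turn retrieval distances into a distribution over the retrieved tokens.
    logits = -neighbor_dists / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Scatter neighbor weights onto the full vocabulary (duplicate ids accumulate).
    knn_probs = np.zeros(vocab_size)
    np.add.at(knn_probs, neighbor_values, weights)
    # Interpolate the kNN and NMT distributions with a fixed weight lam.
    return lam * knn_probs + (1.0 - lam) * nmt_probs

In the confidence-enhanced model described in the abstract, both the kNN distribution and the interpolation weight would additionally depend on the NMT confidence (roughly, nmt_probs[neighbor_values]); the exact formulation is given in the paper.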
Related papers
- Better Datastore, Better Translation: Generating Datastores from
Pre-Trained Models for Nearest Neural Machine Translation [48.58899349349702]
Nearest Neighbor Machine Translation (kNN-MT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest-neighbor retrieval mechanism.
In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT.
arXiv Detail & Related papers (2022-12-17T08:34:20Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution
Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset and the FashionMNIST vs MNIST dataset.
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [65.69742565855395]
k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
arXiv Detail & Related papers (2022-04-13T05:46:31Z)
- Characterizing and Understanding the Behavior of Quantized Models for
Reliable Deployment [32.01355605506855]
Quantization-aware training can produce more stable models than standard, adversarial, and Mixup training.
Disagreements often have closer top-1 and top-2 output probabilities, and $Margin$ is a better indicator than other uncertainty metrics for distinguishing disagreements.
We open-source our code and models as a new benchmark for further study of quantized models.
arXiv Detail & Related papers (2022-04-08T11:19:16Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm simply retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT to dynamically determine the number of retrieved neighbors k for each target token (see the sketch below).
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
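
The adaptive-k idea in the last entry can be illustrated with a short sketch, under the assumptions that neighbors are pre-sorted by distance and that meta_weights is supplied externally as a hypothetical stand-in for the paper's meta-k network.

import numpy as np

def adaptive_knn_probs(nmt_probs, neighbor_values, neighbor_dists, vocab_size,
                       candidate_ks=(0, 1, 2, 4, 8), meta_weights=None,
                       temperature=10.0):
    # Blend kNN distributions computed with several candidate k values instead
    # of committing to one fixed k. In Adaptive kNN-MT the weights over the
    # candidate k values come from a small meta-k network conditioned on
    # retrieval features; here they are passed in (hypothetical placeholder).
    # Assumes len(neighbor_values) >= max(candidate_ks) and neighbors sorted
    # by increasing distance.
    if meta_weights is None:
        meta_weights = np.full(len(candidate_ks), 1.0 / len(candidate_ks))
    mixed = np.zeros(vocab_size)
    for w, k in zip(meta_weights, candidate_ks):
        if k == 0:
            mixed += w * nmt_probs  # k = 0 falls back to the NMT distribution
            continue
        logits = -neighbor_dists[:k] / temperature
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        knn_probs = np.zeros(vocab_size)
        np.add.at(knn_probs, neighbor_values[:k], probs)
        mixed += w * knn_probs
    return mixed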