Bridging the Domain Gaps in Context Representations for k-Nearest
Neighbor Neural Machine Translation
- URL: http://arxiv.org/abs/2305.16599v1
- Date: Fri, 26 May 2023 03:04:42 GMT
- Title: Bridging the Domain Gaps in Context Representations for k-Nearest
Neighbor Neural Machine Translation
- Authors: Zhiwei Cao, Baosong Yang, Huan Lin, Suhang Wu, Xiangpeng Wei, Dayiheng
Liu, Jun Xie, Min Zhang and Jinsong Su
- Abstract summary: $k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.
We propose a novel approach to boost the datastore retrieval of $k$NN-MT by reconstructing the original datastore.
Our method can effectively boost the datastore retrieval and translation quality of $k$NN-MT.
- Score: 57.49095610777317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: $k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing
attention due to its ability to non-parametrically adapt to new translation
domains. By using an upstream NMT model to traverse the downstream training
corpus, it is equipped with a datastore containing vectorized key-value pairs,
which are retrieved during inference to benefit translation. However, there
often exists a significant gap between upstream and downstream domains, which
hurts the retrieval accuracy and the final translation quality. To deal with
this issue, we propose a novel approach to boost the datastore retrieval of
$k$NN-MT by reconstructing the original datastore. Concretely, we design a
reviser that revises the key representations, making them better suited to the
downstream domain. The reviser is trained on collected semantically related
key-query pairs and optimized with two proposed losses: a key-query semantic
distance that ensures each revised key representation stays semantically
related to its corresponding queries, and an L2-norm loss that encourages each
revised key representation to retain the knowledge learned by the upstream NMT
model. Extensive experiments
on domain adaptation tasks demonstrate that our method can effectively boost
the datastore retrieval and translation quality of $k$NN-MT.\footnote{Our code
is available at \url{https://github.com/DeepLearnXMU/RevisedKey-knn-mt}.}
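To make the mechanism above concrete, here is a minimal PyTorch sketch of standard kNN-MT datastore construction and retrieval, together with the two reviser losses described in the abstract. It is not the authors' implementation (their code is at the repository linked above); names such as build_datastore, knn_interpolate, Reviser, decode_states and the weights temp, lam, and alpha are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

HIDDEN = 512  # assumed decoder hidden size


def build_datastore(nmt_model, corpus):
    """Traverse the downstream corpus with the upstream NMT model and store
    (key, value) pairs: key = decoder hidden state, value = target token id."""
    keys, values = [], []
    for src, tgt in corpus:
        hidden_states = nmt_model.decode_states(src, tgt)  # assumed helper: [T, HIDDEN]
        keys.append(hidden_states)
        values.extend(tgt)  # tgt assumed to be a list of token ids
    return torch.cat(keys, dim=0), torch.tensor(values)


def knn_interpolate(p_nmt, query, keys, values, vocab_size, k=8, temp=10.0, lam=0.5):
    """Retrieve the k nearest keys for the current decoding query and
    interpolate the retrieval distribution with the NMT distribution."""
    dists = torch.cdist(query.unsqueeze(0), keys).squeeze(0)  # [N]
    knn_dist, knn_idx = dists.topk(k, largest=False)
    weights = F.softmax(-knn_dist / temp, dim=-1)             # closer keys weigh more
    p_knn = torch.zeros(vocab_size)
    p_knn.scatter_add_(0, values[knn_idx], weights)
    return lam * p_knn + (1.0 - lam) * p_nmt


class Reviser(nn.Module):
    """Small network mapping an original key to a revised key that better fits
    the downstream domain (the architecture here is an assumption)."""
    def __init__(self, dim=HIDDEN):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, key):
        return self.net(key)


def reviser_loss(reviser, key, related_queries, alpha=1.0):
    """The two losses from the abstract: (1) a key-query semantic distance
    pulling the revised key toward its semantically related queries, and
    (2) an L2-norm term keeping the revised key close to the original key so
    upstream knowledge is retained. alpha is an assumed trade-off weight."""
    revised = reviser(key)
    semantic = ((revised.unsqueeze(0) - related_queries) ** 2).sum(-1).mean()
    retention = torch.norm(revised - key, p=2)
    return semantic + alpha * retention

In this sketch the revised keys would replace the original keys in the datastore before retrieval, while the values (target tokens) stay unchanged.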
Related papers
- Towards Faster k-Nearest-Neighbor Machine Translation [56.66038663128903]
k-nearest-neighbor machine translation approaches suffer from heavy retrieval overhead on the entire datastore when decoding each token.
We propose a simple yet effective multi-layer perceptron (MLP) network that predicts whether a token should be translated jointly by the neural machine translation model and the probabilities produced by kNN retrieval.
arXiv Detail & Related papers (2023-12-12T16:41:29Z)
- KNN-LM Does Not Improve Open-ended Text Generation [34.86733697757264]
We study the generation quality of retrieval-augmented language models (LMs).
We find that interpolating with a retrieval distribution actually increases perplexity compared to a baseline Transformer LM.
We discover that the entropy of the retrieval distribution increases faster than that of the base LM as the generated sequence becomes longer.
arXiv Detail & Related papers (2023-05-24T01:48:33Z)
- Learning Decoupled Retrieval Representation for Nearest Neighbour Neural Machine Translation [16.558519886325623]
kNN-MT successfully incorporates an external corpus by retrieving word-level representations at test time.
In this work, we highlight that coupling the representations of these two tasks is sub-optimal for fine-grained retrieval.
We leverage supervised contrastive learning to learn the distinctive retrieval representation derived from the original context representation.
arXiv Detail & Related papers (2022-09-19T03:19:38Z)
- Non-Parametric Domain Adaptation for End-to-End Speech Translation [72.37869362559212]
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
arXiv Detail & Related papers (2022-05-23T11:41:02Z)
- Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [65.69742565855395]
k-Nearest-Neighbor Machine Translation (kNN-MT) has recently been proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
arXiv Detail & Related papers (2022-04-13T05:46:31Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown a promising capability to directly combine a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Iterative Domain-Repaired Back-Translation [50.32925322697343]
In this paper, we focus on domain-specific translation with low resources, where in-domain parallel corpora are scarce or nonexistent.
We propose a novel iterative domain-repaired back-translation framework, which introduces the Domain-Repair model to refine translations in synthetic bilingual data.
Experiments on adapting NMT models between specific domains and from the general domain to specific domains demonstrate the effectiveness of our proposed approach.
arXiv Detail & Related papers (2020-10-06T04:38:09Z)