Bridging the Domain Gaps in Context Representations for k-Nearest
Neighbor Neural Machine Translation
- URL: http://arxiv.org/abs/2305.16599v1
- Date: Fri, 26 May 2023 03:04:42 GMT
- Title: Bridging the Domain Gaps in Context Representations for k-Nearest
Neighbor Neural Machine Translation
- Authors: Zhiwei Cao, Baosong Yang, Huan Lin, Suhang Wu, Xiangpeng Wei, Dayiheng
Liu, Jun Xie, Min Zhang and Jinsong Su
- Abstract summary: $k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.
We propose a novel approach to boost the datastore retrieval of $k$NN-MT by reconstructing the original datastore.
Our method can effectively boost the datastore retrieval and translation quality of $k$NN-MT.
- Score: 57.49095610777317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: $k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing
attention due to its ability to non-parametrically adapt to new translation
domains. By using an upstream NMT model to traverse the downstream training
corpus, it is equipped with a datastore containing vectorized key-value pairs,
which are retrieved during inference to benefit translation. However, there
often exists a significant gap between upstream and downstream domains, which
hurts the retrieval accuracy and the final translation quality. To deal with
this issue, we propose a novel approach to boost the datastore retrieval of
$k$NN-MT by reconstructing the original datastore. Concretely, we design a
reviser that revises the key representations, making them better suited to the
downstream domain. The reviser is trained on collected semantically related
key-query pairs and optimized with two proposed losses: a key-query semantic
distance that ensures each revised key representation stays semantically
related to its corresponding queries, and an L2-norm loss that encourages each
revised key representation to retain the knowledge learned by the upstream NMT
model. Extensive experiments
on domain adaptation tasks demonstrate that our method can effectively boost
the datastore retrieval and translation quality of $k$NN-MT.\footnote{Our code
is available at \url{https://github.com/DeepLearnXMU/RevisedKey-knn-mt}.}
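To make the mechanism above concrete, here is a minimal PyTorch sketch of standard kNN-MT datastore construction and retrieval, together with the two reviser losses described in the abstract. It is not the authors' implementation (their code is at the repository linked above); names such as build_datastore, knn_interpolate, Reviser, decode_states and the weights temp, lam, and alpha are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

HIDDEN = 512  # assumed decoder hidden size


def build_datastore(nmt_model, corpus):
    """Traverse the downstream corpus with the upstream NMT model and store
    (key, value) pairs: key = decoder hidden state, value = target token id."""
    keys, values = [], []
    for src, tgt in corpus:
        hidden_states = nmt_model.decode_states(src, tgt)  # assumed helper: [T, HIDDEN]
        keys.append(hidden_states)
        values.extend(tgt)  # tgt assumed to be a list of token ids
    return torch.cat(keys, dim=0), torch.tensor(values)


def knn_interpolate(p_nmt, query, keys, values, vocab_size, k=8, temp=10.0, lam=0.5):
    """Retrieve the k nearest keys for the current decoding query and
    interpolate the retrieval distribution with the NMT distribution."""
    dists = torch.cdist(query.unsqueeze(0), keys).squeeze(0)  # [N]
    knn_dist, knn_idx = dists.topk(k, largest=False)
    weights = F.softmax(-knn_dist / temp, dim=-1)             # closer keys weigh more
    p_knn = torch.zeros(vocab_size)
    p_knn.scatter_add_(0, values[knn_idx], weights)
    return lam * p_knn + (1.0 - lam) * p_nmt


class Reviser(nn.Module):
    """Small network mapping an original key to a revised key that better fits
    the downstream domain (the architecture here is an assumption)."""
    def __init__(self, dim=HIDDEN):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, key):
        return self.net(key)


def reviser_loss(reviser, key, related_queries, alpha=1.0):
    """The two losses from the abstract: (1) a key-query semantic distance
    pulling the revised key toward its semantically related queries, and
    (2) an L2-norm term keeping the revised key close to the original key so
    upstream knowledge is retained. alpha is an assumed trade-off weight."""
    revised = reviser(key)
    semantic = ((revised.unsqueeze(0) - related_queries) ** 2).sum(-1).mean()
    retention = torch.norm(revised - key, p=2)
    return semantic + alpha * retention

In this sketch the revised keys would replace the original keys in the datastore before retrieval, while the values (target tokens) stay unchanged.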
Related papers
- Towards Faster k-Nearest-Neighbor Machine Translation [56.66038663128903]
k-nearest-neighbor machine translation approaches suffer from heavy retrieval overhead on the entire datastore when decoding each token.
We propose a simple yet effective multi-layer perceptron (MLP) network that predicts whether a token should be translated jointly by the neural machine translation model and the probabilities produced by kNN retrieval.
arXiv Detail & Related papers (2023-12-12T16:41:29Z)
- KNN-LM Does Not Improve Open-ended Text Generation [34.86733697757264]
We study the generation quality of retrieval-augmented language models (LMs).
We find that interpolating with a retrieval distribution actually increases perplexity compared to a baseline Transformer LM.
We discover that the entropy of the retrieval distribution increases faster than that of the base LM as the generated sequence becomes longer.
arXiv Detail & Related papers (2023-05-24T01:48:33Z)
- Learning Decoupled Retrieval Representation for Nearest Neighbour Neural Machine Translation [16.558519886325623]
kNN-MT successfully incorporates an external corpus by retrieving word-level representations at test time.
In this work, we highlight that coupling the representations of these two tasks is sub-optimal for fine-grained retrieval.
We leverage supervised contrastive learning to learn the distinctive retrieval representation derived from the original context representation.
arXiv Detail & Related papers (2022-09-19T03:19:38Z)
- Non-Parametric Domain Adaptation for End-to-End Speech Translation [72.37869362559212]
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
arXiv Detail & Related papers (2022-05-23T11:41:02Z)
- Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [65.69742565855395]
k-Nearest-Neighbor Machine Translation (kNN-MT) has recently been proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
arXiv Detail & Related papers (2022-04-13T05:46:31Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown a promising capability to directly combine a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Iterative Domain-Repaired Back-Translation [50.32925322697343]
In this paper, we focus on domain-specific translation with low resources, where in-domain parallel corpora are scarce or nonexistent.
We propose a novel iterative domain-repaired back-translation framework, which introduces the Domain-Repair model to refine translations in synthetic bilingual data.
Experiments on adapting NMT models between specific domains and from the general domain to specific domains demonstrate the effectiveness of our proposed approach.
arXiv Detail & Related papers (2020-10-06T04:38:09Z)