Learning Decoupled Retrieval Representation for Nearest Neighbour Neural
Machine Translation
- URL: http://arxiv.org/abs/2209.08738v3
- Date: Tue, 19 Sep 2023 04:25:32 GMT
- Title: Learning Decoupled Retrieval Representation for Nearest Neighbour Neural
Machine Translation
- Authors: Qiang Wang, Rongxiang Weng, Ming Chen
- Abstract summary: kNN-MT successfully incorporates an external corpus by retrieving word-level representations at test time.
In this work, we highlight that coupling the representations of the translation and retrieval tasks is sub-optimal for fine-grained retrieval.
We leverage supervised contrastive learning to learn a distinctive retrieval representation derived from the original context representation.
- Score: 16.558519886325623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: K-Nearest Neighbor Neural Machine Translation (kNN-MT) successfully
incorporates an external corpus by retrieving word-level representations at test
time. Generally, kNN-MT borrows the off-the-shelf context representation of the
translation task, e.g., the output of the last decoder layer, as the query
vector of the retrieval task. In this work, we highlight that coupling the
representations of these two tasks is sub-optimal for fine-grained retrieval.
To alleviate this, we leverage supervised contrastive learning to learn a
distinctive retrieval representation derived from the original context
representation. We also propose a fast and effective approach to constructing
hard negative samples. Experimental results on five domains show that our
approach improves retrieval accuracy and BLEU score compared to vanilla
kNN-MT.
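For concreteness, below is a minimal PyTorch sketch of the two ideas in the abstract: standard kNN-MT interpolation of the NMT distribution with a retrieved-neighbor distribution, and a separate retrieval head trained with a supervised contrastive loss on top of the decoder's context vector. All class and function names here are illustrative assumptions, not the authors' released code, and the exact hard-negative construction is left abstract since the abstract does not specify it.

```python
# Minimal sketch (PyTorch assumed) of kNN-MT with a decoupled retrieval
# representation; all names are illustrative, not the authors' code.
import torch
import torch.nn.functional as F

class RetrievalHead(torch.nn.Module):
    """Maps the decoder's context vector h into a distinct retrieval query,
    decoupling the retrieval representation from the translation one."""
    def __init__(self, d_model: int, d_retrieval: int):
        super().__init__()
        self.proj = torch.nn.Sequential(
            torch.nn.Linear(d_model, d_model),
            torch.nn.ReLU(),
            torch.nn.Linear(d_model, d_retrieval),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.proj(h), dim=-1)  # unit-norm queries

def supervised_contrastive_loss(queries, keys, q_labels, k_labels, tau=0.1):
    """Pull each query toward datastore keys that share its gold target token
    (positives) and away from all other keys, including hard negatives."""
    logits = queries @ keys.T / tau                                  # (B, N)
    pos = q_labels.unsqueeze(1).eq(k_labels.unsqueeze(0)).float()    # (B, N)
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    return -((pos * log_prob).sum(1) / pos.sum(1).clamp(min=1)).mean()

def knn_mt_prob(p_nmt, knn_dists, knn_tokens, vocab_size, lam=0.5, T=10.0):
    """Standard kNN-MT interpolation: turn neighbor distances into a
    distribution over the vocabulary and mix it with the NMT distribution."""
    w = F.softmax(-knn_dists / T, dim=-1)                 # (B, k) weights
    p_knn = torch.zeros(p_nmt.size(0), vocab_size, device=w.device)
    p_knn.scatter_add_(1, knn_tokens, w)                  # knn_tokens: int64
    return lam * p_knn + (1 - lam) * p_nmt
```

Hard negatives could be, for example, retrieved keys whose target tokens differ from the gold token; the loss above treats them the same as other negatives.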
Related papers
- Towards Faster k-Nearest-Neighbor Machine Translation [56.66038663128903]
k-nearest-neighbor machine translation approaches suffer from heavy retrieval overhead on the entire datastore when decoding each token.
We propose a simple yet effective multi-layer perceptron (MLP) network to predict whether a token should be translated by the neural machine translation model alone or jointly with the probabilities produced by kNN retrieval.
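A minimal sketch of such a gate, assuming PyTorch; names and the 0.5 threshold are illustrative assumptions, not the paper's exact design:

```python
# Hedged sketch: a token-level gate that skips the datastore search when the
# NMT model alone is predicted to suffice.
import torch

class RetrievalGate(torch.nn.Module):
    def __init__(self, d_model: int, hidden: int = 256):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(d_model, hidden),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Probability that kNN retrieval is needed at this decoding step.
        return torch.sigmoid(self.mlp(h)).squeeze(-1)

# During decoding (illustrative): if gate(h) < 0.5, emit p_nmt directly
# and skip the expensive datastore lookup for this token.
```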
arXiv Detail & Related papers (2023-12-12T16:41:29Z)
- Bridging the Domain Gaps in Context Representations for k-Nearest Neighbor Neural Machine Translation [57.49095610777317]
$k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.
We propose a novel approach to boost the datastore retrieval of $k$NN-MT by reconstructing the original datastore.
Our method can effectively boost the datastore retrieval and translation quality of $k$NN-MT.
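The summary leaves the reconstruction itself unspecified. Purely as an illustration of the general pattern, one could re-encode every stored key with a small learned adapter and rebuild the index (faiss assumed); this is not necessarily the paper's exact method:

```python
# Generic illustration only: pass stored keys through a learned adapter and
# rebuild the retrieval index from the revised keys.
import numpy as np
import faiss

def rebuild_datastore(keys: np.ndarray, adapter) -> faiss.IndexFlatL2:
    # keys: (N, d) float32 context vectors; adapter: callable (N, d) -> (N, d)
    revised = np.ascontiguousarray(adapter(keys), dtype="float32")
    index = faiss.IndexFlatL2(revised.shape[1])  # exact L2 index
    index.add(revised)
    return index
```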
arXiv Detail & Related papers (2023-05-26T03:04:42Z)
- Learning Homographic Disambiguation Representation for Neural Machine Translation [20.242134720005467]
Homographs, words with the same spelling but different meanings, remain challenging in Neural Machine Translation (NMT).
We propose a novel approach to tackle these issues of NMT in the latent space.
We first train an encoder (aka "homographic-encoder") to learn universal sentence representations on a natural language inference (NLI) task.
We further fine-tune the encoder using homograph-based synsets from WordNet, enabling it to learn word-set representations from sentences.
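As a small grounding example, WordNet's sense inventories for a homograph can be pulled with NLTK; how the paper turns these synsets into training signal is not detailed in the summary, so only the lookup is shown:

```python
# Collect the distinct WordNet senses of a homograph via NLTK.
# Requires a one-time: import nltk; nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def sense_glosses(word: str):
    """Return the definitions WordNet lists for each sense of a word."""
    return [s.definition() for s in wn.synsets(word)]

print(sense_glosses("bank")[:3])  # e.g. river bank vs. financial institution
```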
arXiv Detail & Related papers (2023-04-12T13:42:59Z)
- Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning [25.230786853723203]
We propose a noise-robust cross-lingual cross-modal retrieval method for low-resource languages.
We use Machine Translation to construct pseudo-parallel sentence pairs for low-resource languages.
We introduce a multi-view self-distillation method to learn noise-robust target-language representations.
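The summary does not spell out the distillation objective; a generic sketch of multi-view self-distillation, with a momentum teacher providing soft targets for the student over a support set, is given below. All details here are assumptions:

```python
# Generic sketch of multi-view self-distillation (details are assumptions):
# a momentum teacher encodes one view of a pseudo-parallel pair and the
# student matches the teacher's similarity distribution from another view.
import torch
import torch.nn.functional as F

def self_distill_loss(student_z, teacher_z, queue, tau_s=0.1, tau_t=0.05):
    # student_z, teacher_z: (B, d) L2-normalized embeddings of two views
    # queue: (K, d) L2-normalized support embeddings
    p_t = F.softmax(teacher_z @ queue.T / tau_t, dim=-1).detach()
    log_p_s = F.log_softmax(student_z @ queue.T / tau_s, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean")
```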
arXiv Detail & Related papers (2022-08-26T09:32:24Z)
- Neural Implicit Dictionary via Mixture-of-Expert Training [111.08941206369508]
We present a generic implicit neural representation (INR) framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID).
Our NID assembles a group of coordinate-based subnetworks which are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes 2 orders of magnitude faster, with up to 98% less input data.
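A toy sketch of the mixture-of-experts idea for an image INR: a gate mixes a dictionary of small coordinate-based subnetworks. Layer sizes and names are illustrative assumptions:

```python
# Hedged sketch: a gate mixes coordinate-based expert subnetworks so that
# pixel coordinates map to RGB via a learned dictionary of basis functions.
import torch

class NeuralImplicitDictionary(torch.nn.Module):
    def __init__(self, n_experts=16, d_coord=2, d_hidden=64, d_out=3):
        super().__init__()
        self.experts = torch.nn.ModuleList([
            torch.nn.Sequential(
                torch.nn.Linear(d_coord, d_hidden), torch.nn.ReLU(),
                torch.nn.Linear(d_hidden, d_out),
            ) for _ in range(n_experts)
        ])
        self.gate = torch.nn.Linear(d_coord, n_experts)

    def forward(self, xy: torch.Tensor) -> torch.Tensor:
        # xy: (B, 2) pixel coordinates -> (B, 3) predicted RGB
        w = torch.softmax(self.gate(xy), dim=-1)                   # (B, E)
        outs = torch.stack([e(xy) for e in self.experts], dim=1)   # (B, E, 3)
        return (w.unsqueeze(-1) * outs).sum(dim=1)
```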
arXiv Detail & Related papers (2022-07-08T05:07:19Z)
- Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [65.69742565855395]
k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
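The summary gives no method details; as a generic illustration of why clustering speeds up datastore search (not necessarily this paper's exact technique), a faiss IVF index restricts each query to a few coarse clusters:

```python
# Generic cluster-then-search illustration with faiss: queries visit only
# a handful of coarse clusters instead of scanning the whole datastore.
import numpy as np
import faiss

d, n = 512, 100_000
keys = np.random.rand(n, d).astype("float32")   # stand-in datastore keys
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, 1024)  # 1024 coarse clusters
index.train(keys)
index.add(keys)
index.nprobe = 8                                # visit 8 clusters per query
dists, ids = index.search(keys[:4], 16)         # 16 neighbors per query
```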
arXiv Detail & Related papers (2022-04-13T05:46:31Z)
- Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning [15.729812221628382]
We introduce a simple neural encoder architecture that can be trained using an unsupervised contrastive learning objective.
We show that when built on top of recent self-supervised audio representations, this method can be applied iteratively and yield competitive speech sequence embeddings (SSE).
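A hedged NNCLR-style sketch of nearest-neighbor contrastive learning, where the positive for one view is its nearest neighbor in a support queue; the exact objective in the paper may differ:

```python
# Hedged sketch: nearest-neighbor contrastive loss over two augmented views
# of the same speech segments, with positives mined from a support queue.
import torch
import torch.nn.functional as F

def nn_contrastive_loss(z1, z2, queue, tau=0.1):
    # z1, z2: (B, d) L2-normalized embeddings of two views of each segment
    # queue: (K, d) L2-normalized support set of past embeddings
    nn_idx = (z1 @ queue.T).argmax(dim=1)       # nearest neighbor of view 1
    nn = queue[nn_idx]                          # (B, d) mined positives
    logits = nn @ z2.T / tau                    # compare positives to view 2
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```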
arXiv Detail & Related papers (2022-04-11T14:28:01Z)
- Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm simply retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT, which dynamically determines the number of neighbors k for each target token.
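A minimal sketch of one way to realize this: a light network reads the sorted retrieval distances and weights a few candidate neighbor counts (0 meaning "ignore retrieval"). The candidate set and layer shape are assumptions:

```python
# Hedged sketch: a meta-network maps retrieval distances to a distribution
# over candidate neighbor counts, approximating a per-token adaptive k.
import torch
import torch.nn.functional as F

class AdaptiveK(torch.nn.Module):
    def __init__(self, k_max: int = 8):
        super().__init__()
        self.candidates = [0, 1, 2, 4, 8]   # 0 = fall back to pure NMT
        self.net = torch.nn.Linear(k_max, len(self.candidates))

    def forward(self, dists: torch.Tensor) -> torch.Tensor:
        # dists: (B, k_max) sorted neighbor distances
        # returns: (B, |candidates|) weights over candidate k values
        return F.softmax(self.net(dists), dim=-1)
```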
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features, trained on two datasets.
We find that the biases induced by the architecture and by the inclusion of linguistic features are clearly expressed in probing-task performance.
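The standard recipe behind such probing is to freeze the encoder and fit a simple classifier on its representations; a minimal sketch, with hyperparameters as assumptions:

```python
# Hedged sketch of a probing task: fit a linear classifier on frozen
# representations; its accuracy indicates how exposed the property is.
import torch

def train_probe(reps: torch.Tensor, labels: torch.Tensor, n_classes: int):
    # reps: (N, d) frozen encoder representations; labels: (N,) property ids
    probe = torch.nn.Linear(reps.size(1), n_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    for _ in range(100):
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(probe(reps), labels)
        loss.backward()
        opt.step()
    return probe
```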
arXiv Detail & Related papers (2020-04-17T09:17:40Z)