N-Gram Nearest Neighbor Machine Translation
- URL: http://arxiv.org/abs/2301.12866v1
- Date: Mon, 30 Jan 2023 13:19:19 GMT
- Title: N-Gram Nearest Neighbor Machine Translation
- Authors: Rui Lv, Junliang Guo, Rui Wang, Xu Tan, Qi Liu, Tao Qin
- Abstract summary: We propose a novel $n$-gram nearest neighbor retrieval method that is model agnostic and applicable to both Autoregressive Translation (AT) and Non-Autoregressive Translation (NAT) models.
We demonstrate that the proposed method consistently outperforms the token-level method on both AT and NAT models, on general as well as domain adaptation translation tasks.
- Score: 101.25243884801183
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Nearest neighbor machine translation augments Autoregressive
Translation (AT) with $k$-nearest-neighbor retrieval, by comparing the
similarity between the token-level context representations of the target tokens
in the query and the datastore. However, the token-level representation may
introduce noise when translating ambiguous words, or fail to provide accurate
retrieval results when the representation generated by the model contains
indistinguishable context information, e.g., Non-Autoregressive
Translation (NAT) models. In this paper, we propose a novel $n$-gram nearest
neighbor retrieval method that is model agnostic and applicable to both AT and
NAT models. Specifically, we concatenate the adjacent $n$-gram hidden
representations as the key, while the tuple of corresponding target tokens is
the value. In inference, we propose tailored decoding algorithms for AT and NAT
models respectively. We demonstrate that the proposed method consistently
outperforms the token-level method on both AT and NAT models as well on general
as on domain adaptation translation tasks. On domain adaptation, the proposed
method brings $1.03$ and $2.76$ improvements regarding the average BLEU score
on AT and NAT models respectively.
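To make the datastore design above concrete, here is a minimal, hypothetical sketch of $n$-gram key/value construction and retrieval. It is not the authors' implementation: the function names, the use of a FAISS flat L2 index, the choice of $n=2$, and the toy data are assumptions for illustration only.

```python
# Hypothetical sketch of n-gram datastore construction and retrieval (not the paper's code).
# Assumes one hidden-state matrix of shape (seq_len, d_model) per target sentence.
import numpy as np
import faiss  # assumption: faiss-cpu is available for the nearest-neighbor index


def build_ngram_datastore(hidden_states, target_tokens, n=2):
    """Keys: concatenation of n adjacent hidden vectors; values: the corresponding token n-grams."""
    keys, values = [], []
    for h, y in zip(hidden_states, target_tokens):
        for i in range(len(y) - n + 1):
            keys.append(np.concatenate(h[i:i + n]))  # key = concat of n adjacent representations
            values.append(tuple(y[i:i + n]))         # value = tuple of the n target tokens
    keys = np.stack(keys).astype("float32")
    index = faiss.IndexFlatL2(keys.shape[1])         # exact L2 search, for simplicity
    index.add(keys)
    return index, values


def query_ngrams(index, values, query_states, n=2, k=8):
    """Retrieve the k nearest n-gram entries for the last n decoder states of the query."""
    q = np.concatenate(query_states[-n:]).astype("float32")[None, :]
    distances, ids = index.search(q, k)
    return [(values[j], float(d)) for j, d in zip(ids[0], distances[0])]


# Toy usage with random 4-dimensional "hidden states" and integer token ids.
rng = np.random.default_rng(0)
hidden = [rng.normal(size=(5, 4)) for _ in range(3)]
tokens = [[1, 2, 3, 4, 5], [2, 3, 6, 7, 8], [1, 2, 9, 3, 4]]
index, values = build_ngram_datastore(hidden, tokens, n=2)
print(query_ngrams(index, values, hidden[0][:2], n=2, k=3))
```

At decoding time, the retrieved token tuples would be converted into a retrieval distribution and combined with the model's own prediction; the paper's tailored AT/NAT decoding algorithms are not reproduced here.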
Related papers
- Chunk-based Nearest Neighbor Machine Translation [7.747003493657217]
We introduce a chunk-based $k$NN-MT model which retrieves chunks of tokens from the datastore, instead of a single token.
Experiments on machine translation in two settings, static domain adaptation and "on-the-fly" adaptation, show that the chunk-based model leads to a significant speed-up (up to 4 times) with only a small drop in translation quality.
arXiv Detail & Related papers (2022-05-24T17:39:25Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly incorporating the pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- MvSR-NAT: Multi-view Subset Regularization for Non-Autoregressive Machine Translation [0.5586191108738562]
Conditional masked language models (CMLM) have shown impressive progress in non-autoregressive machine translation (NAT).
We introduce Multi-view Subset Regularization (MvSR), a novel regularization method to improve the performance of the NAT model.
We achieve remarkable performance on three public benchmarks with 0.36-1.14 BLEU gains over previous NAT models.
arXiv Detail & Related papers (2021-08-19T02:30:38Z)
- Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm simply retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT to dynamically determine the value of k for each target token.
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
- Nearest Neighbor Machine Translation [113.96357168879548]
We introduce $k$-nearest-neighbor machine translation ($k$NN-MT).
It predicts tokens with a nearest neighbor classifier over a large datastore of cached examples (see the interpolation formula after this list).
It consistently improves performance across many settings.
arXiv Detail & Related papers (2020-10-01T22:24:46Z)
- Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation [188.3605563567253]
Non-autoregressive translation (NAT) achieves faster inference speed but at the cost of worse accuracy compared with autoregressive translation (AT).
We introduce semi-autoregressive translation (SAT) as intermediate tasks. SAT covers AT and NAT as its special cases.
We design curriculum schedules to gradually shift k from 1 to N, with different pacing functions and numbers of tasks trained at the same time.
Experiments on IWSLT14 De-En, IWSLT16 En-De, WMT14 En-De and De-En datasets show that TCL-NAT achieves significant accuracy improvements over previous NAT baselines.
arXiv Detail & Related papers (2020-07-17T06:06:54Z)
- LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention [54.18121922040521]
Non-autoregressive translation (NAT) models generate multiple tokens in one forward pass.
These NAT models often suffer from the multimodality problem, generating duplicated tokens or missing tokens.
We propose two novel methods to address this issue, the Look-Around (LA) strategy and the Vocabulary Attention (VA) mechanism.
arXiv Detail & Related papers (2020-02-08T04:11:03Z)
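For reference, the token-level retrieval that the Nearest Neighbor Machine Translation entry above describes (and that the $n$-gram method generalizes) interpolates a retrieval distribution with the base model's prediction. In the original $k$NN-MT formulation, with $f(x, y_{<t})$ the decoder context representation, $d(\cdot, \cdot)$ the distance, $T$ a temperature, $\lambda$ the interpolation weight, and $\mathcal{N}$ the retrieved key-value pairs:

$$ p(y_t \mid x, y_{<t}) = \lambda\, p_{\mathrm{kNN}}(y_t \mid x, y_{<t}) + (1 - \lambda)\, p_{\mathrm{MT}}(y_t \mid x, y_{<t}) $$
$$ p_{\mathrm{kNN}}(y_t \mid x, y_{<t}) \propto \sum_{(k_j, v_j) \in \mathcal{N}} \mathbb{1}[y_t = v_j] \exp\!\big(-d(k_j, f(x, y_{<t})) / T\big) $$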