Rethinking Translation Memory Augmented Neural Machine Translation
- URL: http://arxiv.org/abs/2306.06948v1
- Date: Mon, 12 Jun 2023 08:32:04 GMT
- Title: Rethinking Translation Memory Augmented Neural Machine Translation
- Authors: Hongkun Hao, Guoping Huang, Lemao Liu, Zhirui Zhang, Shuming Shi, Rui Wang
- Abstract summary: We show that TM-augmented NMT is good at fitting the training data (i.e., lower bias) but is more sensitive to fluctuations in that data (i.e., higher variance).
We propose a simple yet effective TM-augmented NMT model to reduce the variance and address the contradictory phenomenon.
Extensive experiments show that the proposed TM-augmented NMT achieves consistent gains over both conventional NMT and existing TM-augmented NMT.
- Score: 45.21285869556621
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper rethinks translation memory augmented neural machine translation
(TM-augmented NMT) from two perspectives, i.e., a probabilistic view of
retrieval and the variance-bias decomposition principle. The finding
demonstrates that TM-augmented NMT is good at fitting the data (i.e., lower
bias) but is more sensitive to fluctuations in the training data (i.e., higher
variance), which provides an explanation for a recently
reported contradictory phenomenon on the same translation task: TM-augmented
NMT substantially advances vanilla NMT under the high-resource scenario whereas
it fails under the low-resource scenario. Then we propose a simple yet
effective TM-augmented NMT model to reduce the variance and address the
contradictory phenomenon. Extensive experiments show that the proposed
TM-augmented NMT achieves consistent gains over both conventional NMT and
existing TM-augmented NMT under two variance-preferable (low-resource and
plug-and-play) scenarios as well as the high-resource scenario.
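For reference, the variance-bias decomposition invoked above is the standard one from statistical learning theory. A minimal statement in its textbook squared-error form (the paper applies the principle to NMT generalization rather than to this exact regression setting):

```latex
% Standard bias-variance decomposition of expected squared error.
% f(x): true function, \hat{f}_D(x): predictor trained on dataset D,
% \sigma^2: irreducible noise. TM-augmented NMT is argued to lower the
% bias term while enlarging the variance term.
\mathbb{E}_{D,\varepsilon}\!\left[\big(y - \hat{f}_D(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\big(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```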
Related papers
- Towards Reliable Neural Machine Translation with Consistency-Aware Meta-Learning [24.64700139151659]
Current neural machine translation (NMT) systems suffer from a lack of reliability.
We present a consistency-aware meta-learning (CAML) framework, derived from the model-agnostic meta-learning (MAML) algorithm, to address this issue.
We conduct experiments on the NIST Chinese to English task, three WMT translation tasks, and the TED M2O task.
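The summary names MAML as the starting point; as a rough orientation, a minimal MAML-style inner/outer update is sketched below in PyTorch. This illustrates only the generic bi-level scheme that CAML builds on, not the paper's consistency-aware objective.

```python
# Minimal MAML-style inner/outer update (illustrative only; CAML adds a
# consistency-aware objective on top of this kind of bi-level scheme,
# which is not reproduced here).
import torch
from torch.func import functional_call

def maml_outer_loss(model, loss_fn, support, query, inner_lr=1e-3):
    """support/query are (inputs, targets) pairs; returns the meta-loss."""
    params = dict(model.named_parameters())
    # Inner step: adapt the parameters on the support batch.
    s_x, s_y = support
    inner_loss = loss_fn(functional_call(model, params, (s_x,)), s_y)
    grads = torch.autograd.grad(inner_loss, tuple(params.values()), create_graph=True)
    adapted = {name: p - inner_lr * g for (name, p), g in zip(params.items(), grads)}
    # Outer step: evaluate the adapted parameters on the query batch;
    # minimizing this loss backpropagates through the inner update.
    q_x, q_y = query
    return loss_fn(functional_call(model, adapted, (q_x,)), q_y)
```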
arXiv Detail & Related papers (2023-03-20T09:41:28Z)
- Prompting Neural Machine Translation with Translation Memories [32.5633128085849]
We present a simple but effective method to introduce TMs into neural machine translation (NMT) systems.
Specifically, we treat TMs as prompts to the NMT model at test time, but leave the training process unchanged.
The result is a slight update of an existing NMT system, which can be implemented in a few hours by anyone who is familiar with NMT.
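A minimal sketch of that idea, with `retrieve` and `translate` as hypothetical callables (a fuzzy-match retriever and an off-the-shelf NMT system); exactly how the TM is fused with the input at decoding time is the paper's contribution and is not reproduced here.

```python
# Sketch: use a retrieved translation memory (TM) as a test-time prompt,
# leaving the trained NMT model untouched. `retrieve` and `translate` are
# hypothetical callables standing in for a fuzzy-match retriever and an
# existing NMT system.
def translate_with_tm_prompt(source: str, retrieve, translate, sep: str = " </s> ") -> str:
    tm_source, tm_target = retrieve(source)   # best fuzzy match from the TM
    # Prepend the TM target as a prompt; only the test-time input changes,
    # the model and its training remain unchanged.
    return translate(tm_target + sep + source)
```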
arXiv Detail & Related papers (2023-01-13T03:33:26Z)
- Neural Machine Translation with Contrastive Translation Memories [71.86990102704311]
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT model that exploits contrastively retrieved translation memories which are holistically similar to the source sentence.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
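The objective itself is not specified in the summary; the sketch below shows a generic InfoNCE-style multi-TM contrastive loss over sentence embeddings, written as an assumed form rather than the paper's exact Multi-TM objective.

```python
# Generic InfoNCE-style contrastive loss over several retrieved TMs: pull the
# most relevant TM's representation toward the target-sentence representation
# and push the others away. This is an assumed form, not the paper's exact
# Multi-TM objective.
import torch
import torch.nn.functional as F

def multi_tm_contrastive_loss(target_vec, tm_vecs, positive_idx=0, temperature=0.1):
    """target_vec: (d,) target-sentence embedding; tm_vecs: (k, d) TM embeddings."""
    target_vec = F.normalize(target_vec, dim=-1)
    tm_vecs = F.normalize(tm_vecs, dim=-1)
    sims = tm_vecs @ target_vec / temperature          # (k,) scaled cosine similarities
    # Cross-entropy with the designated positive TM as the correct class.
    return F.cross_entropy(sims.unsqueeze(0), torch.tensor([positive_idx]))
```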
arXiv Detail & Related papers (2022-12-06T17:10:17Z)
- Improving Robustness of Retrieval Augmented Translation via Shuffling of Suggestions [15.845071122977158]
We show that for existing retrieval augmented translation methods, using a TM with a domain mismatch to the test set can result in substantially worse performance compared to not using a TM at all.
We propose a simple method to expose fuzzy-match NMT systems to domain-mismatched TMs during training, and show that it results in a system that is much more tolerant (regaining up to 5.8 BLEU) at inference with domain-mismatched TMs.
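A minimal sketch of such training-time exposure, under the assumption that it amounts to swapping in an unrelated (shuffled) suggestion for a fraction of training examples; the mixing probability below is a placeholder, not the paper's setting.

```python
# Sketch: occasionally replace the retrieved fuzzy-match suggestion with a
# randomly drawn (shuffled) one during training, so the TM-augmented model
# learns to ignore unhelpful, domain-mismatched suggestions. The 0.5
# probability is an assumed placeholder.
import random

def corrupt_suggestions(examples, p_shuffle=0.5, rng=random):
    """examples: list of dicts with 'source', 'target', and 'suggestion' keys."""
    pool = [ex["suggestion"] for ex in examples]
    out = []
    for ex in examples:
        ex = dict(ex)
        if rng.random() < p_shuffle:
            ex["suggestion"] = rng.choice(pool)   # likely unrelated suggestion
        out.append(ex)
    return out
```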
arXiv Detail & Related papers (2022-10-11T00:09:51Z)
- Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation [64.16077929617119]
We propose a new criterion for NMT adversarial examples based on Doubly Round-Trip Translation (DRTT).
To enhance the robustness of the NMT model, we introduce masked language models to construct bilingual adversarial pairs.
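The criterion is not spelled out in the summary; the sketch below shows only the round-trip ingredient (translate a sentence to the other language and back, then score how much meaning survives), with `forward`, `backward`, and `similarity` as hypothetical callables. How DRTT combines such scores on both sides of an adversarial pair is not reproduced here.

```python
# Sketch of the round-trip ingredient behind a DRTT-style criterion.
# `forward` and `backward` are hypothetical MT callables (source->target and
# target->source), and `similarity` is an assumed sentence-similarity metric
# (e.g. BLEU between the original and its round-trip reconstruction).
def round_trip_score(sentence, forward, backward, similarity):
    reconstruction = backward(forward(sentence))
    return similarity(sentence, reconstruction)
```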
arXiv Detail & Related papers (2022-04-19T06:15:27Z)
- Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation [14.645468999921961]
We analyze the impact of different types of fine-grained semantic divergences on Transformer models.
We introduce a divergent-aware NMT framework that uses factors to help NMT recover from the degradation caused by naturally occurring divergences.
arXiv Detail & Related papers (2021-05-31T16:15:35Z)
- Prevent the Language Model from being Overconfident in Neural Machine Translation [21.203435303812043]
We propose a Margin-based Token-level Objective (MTO) and a Margin-based Sentence-level Objective (MSO) to maximize the margin and prevent the LM from being overconfident.
Experiments on WMT14 English-to-German, WMT19 Chinese-to-English, and WMT14 English-to-French translation tasks demonstrate the effectiveness of our approach.
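As a rough illustration, an assumed token-level form of the margin between the translation model and the language model is given below; the exact MTO/MSO weighting in the paper may differ.

```latex
% Assumed token-level margin (illustrative, not the paper's exact MTO/MSO):
% encourage the full translation model to assign higher probability than the
% LM to each gold target token, so the LM cannot dominate the prediction.
\Delta_t \;=\; P_{\text{NMT}}(y_t \mid x, y_{<t}) \;-\; P_{\text{LM}}(y_t \mid y_{<t}),
\qquad
\mathcal{L}_{\text{MTO}} \;\approx\; \mathcal{L}_{\text{NLL}} \;-\; \lambda \sum_{t=1}^{|y|} \Delta_t
```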
arXiv Detail & Related papers (2021-05-24T05:34:09Z)
- On the Inference Calibration of Neural Machine Translation [54.48932804996506]
We study the correlation between calibration and translation performance, as well as the linguistic properties of miscalibration.
We propose a new graduated label smoothing method that can improve both inference calibration and translation performance.
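A minimal sketch of graduated label smoothing, under the assumption that the smoothing weight is scheduled per token by a baseline model's confidence; the thresholds and smoothing values here are placeholders, not the paper's reported settings.

```python
# Sketch of graduated label smoothing: instead of one global smoothing value,
# apply a larger value to tokens on which a baseline model is already highly
# confident (where over-confidence tends to occur) and a smaller one elsewhere.
# Thresholds and smoothing values are assumed placeholders.
import torch
import torch.nn.functional as F

def graduated_label_smoothing(logits, targets, baseline_confidence,
                              eps_high=0.3, eps_mid=0.1, eps_low=0.0,
                              thresholds=(0.3, 0.7)):
    """logits: (n, vocab); targets: (n,) long; baseline_confidence: (n,) in [0, 1]."""
    # Per-token smoothing value, graduated by the baseline model's confidence.
    eps = torch.full_like(baseline_confidence, eps_mid)
    eps = torch.where(baseline_confidence > thresholds[1], torch.full_like(eps, eps_high), eps)
    eps = torch.where(baseline_confidence < thresholds[0], torch.full_like(eps, eps_low), eps)
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)   # (n,) gold-token NLL
    uniform = -log_probs.mean(dim=-1)                             # (n,) CE against uniform
    return ((1.0 - eps) * nll + eps * uniform).mean()
```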
arXiv Detail & Related papers (2020-05-03T02:03:56Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model such reordering information for Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)
- Understanding Learning Dynamics for Neural Machine Translation [53.23463279153577]
We propose to understand the learning dynamics of NMT by using Loss Change Allocation (LCA) (Lan et al., 2019).
Since LCA requires calculating the gradient over the entire dataset for each update, we instead present an approximation to make it practical in the NMT scenario.
Our simulated experiments show that this approximate calculation is efficient and empirically delivers consistent results.
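For reference, LCA allocates the per-step change in training loss across individual parameters via a first-order (path-integral style) approximation; the costly part, which motivates the approximation mentioned above, is that the gradient is measured on the entire training set at every update.

```latex
% First-order Loss Change Allocation: the loss change at step t is split into
% one additive contribution per parameter i, with the gradient of the loss
% taken over the full training set.
\mathcal{L}(\theta^{(t+1)}) - \mathcal{L}(\theta^{(t)})
  \;\approx\; \sum_i \underbrace{\frac{\partial \mathcal{L}}{\partial \theta_i}\Big|_{\theta^{(t)}}
  \big(\theta_i^{(t+1)} - \theta_i^{(t)}\big)}_{\text{allocation to parameter } i}
```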
arXiv Detail & Related papers (2020-04-05T13:32:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.