Neural Machine Translation with Contrastive Translation Memories
- URL: http://arxiv.org/abs/2212.03140v1
- Date: Tue, 6 Dec 2022 17:10:17 GMT
- Title: Neural Machine Translation with Contrastive Translation Memories
- Authors: Xin Cheng, Shen Gao, Lemao Liu, Dongyan Zhao, Rui Yan
- Abstract summary: Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT model that uses contrastively retrieved translation memories, which are holistically similar to the source sentence but individually contrastive to each other.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
- Score: 71.86990102704311
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-augmented Neural Machine Translation models have been successful in
many translation scenarios. Different from previous works that make use of
mutually similar but redundant translation memories (TMs), we propose a new
retrieval-augmented NMT model that works with contrastively retrieved translation
memories: TMs that are holistically similar to the source sentence while
individually contrastive to each other, providing maximal information gain across
three phases. First, in the TM retrieval phase, we adopt a contrastive retrieval
algorithm to avoid the redundancy and uninformativeness of similar translation
pieces. Second, in the memory encoding stage, given a set of TMs we propose a novel
Hierarchical Group Attention module to gather both the local context of each TM and
the global context of the whole TM set. Finally, in the training phase, a Multi-TM
contrastive learning objective is introduced to learn the salient features of each
TM with respect to the target sentence. Experimental results show that our framework
obtains improvements over strong baselines on benchmark datasets.
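
To make the retrieval phase more concrete, below is a minimal sketch of a greedy, MMR-style selection that trades similarity to the source against redundancy among already-selected TMs. The function names, the token-overlap scorer, and the weighting are illustrative assumptions, not the paper's actual retrieval algorithm.

```python
# Minimal sketch (not the authors' exact algorithm) of contrastive TM retrieval:
# greedily pick memories that are similar to the source but dissimilar to the
# memories already selected, to avoid redundant, uninformative TMs.

def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over whitespace tokens (a stand-in scorer)."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / max(len(ta | tb), 1)

def contrastive_retrieve(source: str, candidates: list[str],
                         k: int = 3, alpha: float = 0.7) -> list[str]:
    """Select k TMs trading off source similarity against redundancy.

    alpha weights similarity to the source; (1 - alpha) penalizes similarity
    to already-selected TMs (an MMR-style criterion).
    """
    selected: list[str] = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(c: str) -> float:
            redundancy = max((token_overlap(c, s) for s in selected), default=0.0)
            return alpha * token_overlap(c, source) - (1 - alpha) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

# Example: the second pick skips the near-duplicate of the first.
tms = ["the cat sat on the mat", "the cat sat on a mat", "a dog lay on the rug"]
print(contrastive_retrieve("the cat sat on the rug", tms, k=2))
```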
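For the training phase, one common way to instantiate a contrastive objective over multiple TMs is an InfoNCE-style loss that pulls the most relevant TM representation toward the target-sentence representation and pushes the others away. The pooling, the positive/negative assignment, and the temperature below are assumptions for illustration; this is not necessarily the Multi-TM loss used in the paper.

```python
import torch
import torch.nn.functional as F

def multi_tm_contrastive_loss(tm_reprs: torch.Tensor,
                              target_repr: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style sketch of a Multi-TM contrastive objective.

    tm_reprs:    (num_tms, dim) pooled encodings of the retrieved TMs
    target_repr: (dim,) pooled encoding of the reference target sentence
    The TM at index 0 is assumed to be the positive example.
    """
    sims = F.cosine_similarity(tm_reprs, target_repr.unsqueeze(0), dim=-1)
    logits = sims / temperature                     # (num_tms,)
    labels = torch.zeros(1, dtype=torch.long)       # positive TM is index 0
    return F.cross_entropy(logits.unsqueeze(0), labels)

# Toy usage with stand-in encodings: 4 retrieved TMs, 256-dim representations.
tms = torch.randn(4, 256, requires_grad=True)
tgt = torch.randn(256)
print(multi_tm_contrastive_loss(tms, tgt))
```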
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., the Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT yields substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation [53.342921374639346]
Multimodal machine translation aims to improve translation quality by incorporating information from other modalities, such as vision.
Previous MMT systems mainly focus on better access and use of visual information and tend to validate their methods on image-related datasets.
This paper establishes new methods and new datasets for MMT.
arXiv Detail & Related papers (2022-12-20T15:02:38Z)
- Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation [72.6667341525552]
We present a new MMT approach based on a strong text-only MT model, which uses neural adapters and a novel guided self-attention mechanism.
We also introduce CoMMuTE, a Contrastive Multimodal Translation Evaluation set of ambiguous sentences and their possible translations.
Our approach obtains competitive results compared to strong text-only models on standard English-to-French, English-to-German and English-to-Czech benchmarks.
arXiv Detail & Related papers (2022-12-20T10:18:18Z)
- Bilingual Synchronization: Restoring Translational Relationships with Editing Operations [2.0411082897313984]
We consider a more general setting which assumes an initial target sequence that must be transformed into a valid translation of the source.
Our results suggest that one single generic edit-based system, once fine-tuned, can compare with, or even outperform, dedicated systems specifically trained for these tasks.
arXiv Detail & Related papers (2022-10-24T12:25:44Z)
- Improving Robustness of Retrieval Augmented Translation via Shuffling of Suggestions [15.845071122977158]
We show that for existing retrieval augmented translation methods, using a TM with a domain mismatch to the test set can result in substantially worse performance compared to not using a TM at all.
We propose a simple method to expose fuzzy-match NMT systems to such mismatches during training, and show that it results in a system that is much more tolerant (regaining up to 5.8 BLEU) to inference with domain-mismatched TMs.
arXiv Detail & Related papers (2022-10-11T00:09:51Z)
- STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation [37.51435498386953]
We propose the Speech-TExt Manifold Mixup (STEMM) method to calibrate the cross-modal representation discrepancy between speech and text.
Experiments on MuST-C speech translation benchmark and further analysis show that our method effectively alleviates the cross-modal representation discrepancy.
arXiv Detail & Related papers (2022-03-20T01:49:53Z)
- Neural Machine Translation with Monolingual Translation Memory [58.98657907678992]
We propose a new framework that uses monolingual memory and performs learnable memory retrieval in a cross-lingual manner.
Experiments show that the proposed method obtains substantial improvements.
arXiv Detail & Related papers (2021-05-24T13:35:19Z)