Context Consistency between Training and Testing in Simultaneous Machine
Translation
- URL: http://arxiv.org/abs/2311.07066v1
- Date: Mon, 13 Nov 2023 04:11:32 GMT
- Title: Context Consistency between Training and Testing in Simultaneous Machine
Translation
- Authors: Meizhi Zhong, Lemao Liu, Kehai Chen, Mingming Yang, Min Zhang
- Abstract summary: Simultaneous Machine Translation (SiMT) aims to yield a real-time partial translation with a monotonically growing source-side context.
There is a counterintuitive phenomenon about context usage between training and testing.
Accordingly, we propose an effective training approach called context consistency training.
- Score: 46.38890241793453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Simultaneous Machine Translation (SiMT) aims to yield a real-time partial
translation with a monotonically growing source-side context. However, there is
a counterintuitive phenomenon about context usage between training and testing:
e.g., a model tested with wait-k achieves much worse translation quality when it
is consistently trained with wait-k than when it is inconsistently trained with
wait-k' (k' is not equal to k). To this end, we first investigate the
underlying reasons behind this phenomenon and uncover the following two
factors: 1) the limited correlation between translation quality and training
(cross-entropy) loss; 2) exposure bias between training and testing. Based on
both reasons, we then propose an effective training approach called context
consistency training, which makes the context usage consistent between training
and testing by optimizing translation quality and latency as bi-objectives and
by exposing the model's own predictions to it during training.
Experiments on three language pairs confirm our intuition: with the help of our
context consistency training approach, our system encouraging context
consistency outperforms existing systems with context inconsistency for the
first time.
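As a rough illustration of the ideas in the abstract (not the authors' released code), the sketch below builds the same wait-k source context at training time as at test time, occasionally feeds the model its own predictions to reduce exposure bias, and adds a crude latency term so that quality and latency are optimized as bi-objectives. ToySiMTModel, the latency proxy, and all hyperparameters are assumptions made for illustration.

```python
# Minimal, illustrative sketch of wait-k context construction plus a
# context-consistency style training step (assumed interfaces, not the
# authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


def wait_k_visible(src_len: int, i: int, k: int) -> int:
    """Source tokens visible when generating the i-th target token (1-indexed)
    under wait-k: g(i) = min(|x|, i + k - 1)."""
    return min(src_len, i + k - 1)


class ToySiMTModel(nn.Module):
    """Tiny stand-in for a prefix-to-prefix SiMT decoder."""

    def __init__(self, vocab_size: int, dim: int = 64):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, dim)
        self.tgt_emb = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def step(self, src_prefix, tgt_prefix):
        # Mean-pool both prefixes; a real system would use a masked Transformer.
        h = self.src_emb(src_prefix).mean(0) + self.tgt_emb(tgt_prefix).mean(0)
        return self.out(h)  # logits over the next target token


def context_consistency_loss(model, src, tgt, k, latency_weight=0.1, expose_prob=0.5):
    """src, tgt: 1-D LongTensors; tgt[0] is assumed to be BOS."""
    prefix = [int(tgt[0])]
    nll = 0.0
    for t in range(len(tgt) - 1):
        visible = wait_k_visible(len(src), t + 1, k)   # same k as at test time
        logits = model.step(src[:visible], torch.tensor(prefix))
        nll = nll + F.cross_entropy(logits.unsqueeze(0), tgt[t + 1].unsqueeze(0))
        # Expose predictions: with some probability continue from the model's
        # own greedy output instead of the gold token.
        use_own = torch.rand(1).item() < expose_prob
        prefix.append(int(logits.argmax()) if use_own else int(tgt[t + 1]))
    latency = k / max(len(src), 1)  # crude latency proxy for the bi-objective
    return nll / (len(tgt) - 1) + latency_weight * latency
```

For example, if decoding uses wait-3, the loss above also builds every training context with k=3, rather than the mismatched wait-k' contexts described in the abstract.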
Related papers
- LEAPT: Learning Adaptive Prefix-to-prefix Translation For Simultaneous
Machine Translation [6.411228564798412]
Simultaneous machine translation is useful in many live scenarios but very challenging due to the trade-off between accuracy and latency.
We propose a novel adaptive training policy called LEAPT, which allows our machine translation model to learn how to translate source prefixes and make use of the future context.
arXiv Detail & Related papers (2023-03-21T11:17:37Z)
- Understanding and Mitigating the Uncertainty in Zero-Shot Translation [92.25357943169601]
We aim to understand and alleviate the off-target issues from the perspective of uncertainty in zero-shot translation.
We propose two lightweight and complementary approaches to denoise the training data for model training.
Our approaches significantly improve the performance of zero-shot translation over strong MNMT baselines.
arXiv Detail & Related papers (2022-05-20T10:29:46Z)
- Data-Driven Adaptive Simultaneous Machine Translation [51.01779863078624]
We propose a novel and efficient training scheme for adaptive SimulMT.
Our method outperforms all strong baselines in terms of translation quality and latency.
arXiv Detail & Related papers (2022-04-27T02:40:21Z)
- Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation [48.50842995206353]
We study the impact of the jointly pretrained decoder, which is the main difference between Seq2Seq pretraining and previous encoder-based pretraining approaches for NMT.
We propose simple and effective strategies, named in-domain pretraining and input adaptation, to remedy the domain and objective discrepancies.
arXiv Detail & Related papers (2022-03-16T07:36:28Z)
- Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation [49.916963624249355]
A UNMT model is trained on pseudo parallel data with a translated source, but translates natural source sentences at inference.
The source discrepancy between training and inference hinders the translation performance of UNMT models.
We propose an online self-training approach, which simultaneously uses the pseudo parallel data (natural source, translated target) to mimic the inference scenario.
arXiv Detail & Related papers (2022-03-16T04:50:27Z)
- An Explanation of In-context Learning as Implicit Bayesian Inference [117.19809377740188]
We study the role of the pretraining distribution on the emergence of in-context learning.
We prove that in-context learning occurs implicitly via Bayesian inference of the latent concept.
We empirically find that scaling model size improves in-context accuracy even when the pretraining loss is the same.
arXiv Detail & Related papers (2021-11-03T09:12:33Z)
- Neural Simultaneous Speech Translation Using Alignment-Based Chunking [4.224809458327515]
In simultaneous machine translation, the objective is to determine when to produce a partial translation given a continuous stream of source words.
We propose a neural machine translation (NMT) model that makes dynamic decisions about when to keep consuming input and when to generate output words.
Our results on the IWSLT 2020 English-to-German task outperform a wait-k baseline by 2.6 to 3.7% BLEU absolute.
arXiv Detail & Related papers (2020-05-29T10:20:48Z)
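Several entries above describe adaptive policies that decide, step by step, whether to read more source words or to write a translation word. The sketch below is a generic interleaved READ/WRITE loop in that spirit, not any specific paper's method; `policy` and `translator` are hypothetical callables standing in for a learned decision model and an incremental NMT decoder.

```python
# Hedged sketch of a generic adaptive READ/WRITE loop for simultaneous
# translation; the callables are assumed interfaces, not a real library API.
from typing import Callable, List

def simultaneous_decode(
    source_stream: List[str],
    policy: Callable[[List[str], List[str]], str],      # returns "READ" or "WRITE"
    translator: Callable[[List[str], List[str]], str],  # returns the next target word
    eos: str = "</s>",
    max_len: int = 100,
) -> List[str]:
    """Interleave READ (consume one source word) and WRITE (emit one target
    word) actions until the translator emits EOS or max_len is reached."""
    read: List[str] = []
    written: List[str] = []
    pos = 0
    while len(written) < max_len:
        # Once the whole source has been read, only WRITE remains.
        action = "WRITE" if pos >= len(source_stream) else policy(read, written)
        if action == "READ":
            read.append(source_stream[pos])
            pos += 1
        else:
            word = translator(read, written)
            if word == eos:
                break
            written.append(word)
    return written


# The fixed wait-3 policy expressed in the same interface, for comparison
# with the adaptive policies summarized above.
def wait3_policy(read: List[str], written: List[str]) -> str:
    return "READ" if len(read) < len(written) + 3 else "WRITE"
```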