Adaptive Policy with Wait-$k$ Model for Simultaneous Translation
- URL: http://arxiv.org/abs/2310.14853v1
- Date: Mon, 23 Oct 2023 12:16:32 GMT
- Title: Adaptive Policy with Wait-$k$ Model for Simultaneous Translation
- Authors: Libo Zhao, Kai Fan, Wei Luo, Jing Wu, Shushu Wang, Ziqian Zeng,
Zhongqiang Huang
- Abstract summary: Simultaneous machine translation (SiMT) requires a robust read/write policy in conjunction with a high-quality translation model.
Traditional methods rely on either a fixed wait-$k$ policy coupled with a standalone wait-$k$ translation model, or an adaptive policy jointly trained with the translation model.
We propose a more flexible approach by decoupling the adaptive policy model from the translation model.
- Score: 20.45004823667775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Simultaneous machine translation (SiMT) requires a robust read/write policy
in conjunction with a high-quality translation model. Traditional methods rely
on either a fixed wait-$k$ policy coupled with a standalone wait-$k$
translation model, or an adaptive policy jointly trained with the translation
model. In this study, we propose a more flexible approach by decoupling the
adaptive policy model from the translation model. Our motivation stems from the
observation that a standalone multi-path wait-$k$ model performs competitively
with adaptive policies utilized in state-of-the-art SiMT approaches.
Specifically, we introduce DaP, a divergence-based adaptive policy that makes
read/write decisions for any translation model based on the potential
divergence in translation distributions resulting from future information. DaP
extends a frozen wait-$k$ model with lightweight parameters and is both
memory- and computation-efficient. Experimental results across various benchmarks
demonstrate that our approach offers an improved trade-off between translation
accuracy and latency, outperforming strong baselines.
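To make the divergence-based read/write rule concrete, the following is a minimal PyTorch sketch, not the paper's implementation: the model interface `model(src_prefix, tgt_prefix)` returning next-token logits, the threshold value, and the direct use of the next source token are all illustrative assumptions. In DaP itself, the effect of future information is estimated by lightweight parameters added to a frozen wait-$k$ model rather than by peeking at unread source tokens.

```python
import torch
import torch.nn.functional as F

def read_write_decision(model, src_prefix, tgt_prefix, next_src_token=None,
                        threshold=0.05):
    """Illustrative divergence-based policy: WRITE if extra source context
    would barely change the next-token distribution, otherwise READ.

    Assumes `model(src, tgt)` returns logits over the target vocabulary for
    the next target token (a prefix-to-prefix / wait-k style model).
    """
    if next_src_token is None:
        return "READ"  # no further source context available to compare against
    with torch.no_grad():
        # Next-token distribution given the source read so far.
        log_p_now = F.log_softmax(model(src_prefix, tgt_prefix), dim=-1)
        # Next-token distribution if one more source token were available.
        # (DaP *estimates* this divergence with a lightweight module;
        # peeking at the real token here is purely for illustration.)
        extended = torch.cat([src_prefix, next_src_token], dim=-1)
        log_p_future = F.log_softmax(model(extended, tgt_prefix), dim=-1)
        # KL divergence between the two next-token distributions.
        kl = F.kl_div(log_p_now, log_p_future, log_target=True,
                      reduction="sum")
    return "WRITE" if kl.item() < threshold else "READ"
```

A simultaneous decoder would invoke such a rule at every step, emitting a target token on WRITE and consuming one more source token on READ; once the source sentence is exhausted, it must always WRITE.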
Related papers
- PsFuture: A Pseudo-Future-based Zero-Shot Adaptive Policy for Simultaneous Machine Translation [8.1299957975257]
Simultaneous Machine Translation (SiMT) requires target tokens to be generated in real-time as streaming source tokens are consumed.
We propose PsFuture, the first zero-shot adaptive read/write policy for SiMT.
We introduce a novel training strategy, Prefix-to-Full (P2F), specifically tailored to adjust offline translation models for SiMT applications.
arXiv Detail & Related papers (2024-10-05T08:06:33Z)
- Fixed and Adaptive Simultaneous Machine Translation Strategies Using Adapters [5.312303275762104]
Simultaneous machine translation aims at solving the task of real-time translation by starting to translate before consuming the full input.
The wait-$k$ policy offers a solution by starting to translate only after consuming $k$ source words (the schedule is sketched just after this entry).
In this paper, we address the challenge of building one model that can fulfil multiple latency levels.
arXiv Detail & Related papers (2024-07-18T12:42:45Z)
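For reference, the fixed wait-$k$ policy discussed in the entry above is commonly formalized by a monotonic read schedule (standard background, not part of the entry's abstract):

$g(t) = \min(k + t - 1, |\mathbf{x}|)$

where $g(t)$ is the number of source tokens read before emitting the $t$-th target token: the model first reads $k$ tokens, then alternates one write with one read until the source $\mathbf{x}$ is fully consumed, after which it writes the remaining target tokens.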
- Learning Optimal Policy for Simultaneous Machine Translation via Binary Search [17.802607889752736]
Simultaneous machine translation (SiMT) starts to output translation while reading the source sentence.
The policy determines the number of source tokens read during the translation of each target token.
We present a new method for constructing the optimal policy online via binary search.
arXiv Detail & Related papers (2023-05-22T07:03:06Z)
- LEAPT: Learning Adaptive Prefix-to-prefix Translation For Simultaneous Machine Translation [6.411228564798412]
Simultaneous machine translation is useful in many live scenarios but very challenging due to the trade-off between accuracy and latency.
We propose a novel adaptive training policy called LEAPT, which allows our machine translation model to learn how to translate source prefixes and make use of the future context.
arXiv Detail & Related papers (2023-03-21T11:17:37Z)
- Continual Knowledge Distillation for Neural Machine Translation [74.03622486218597]
Parallel corpora are often not publicly accessible due to data copyright, data privacy, and competitive differentiation concerns.
We propose a method called continual knowledge distillation to take advantage of existing translation models to improve one model of interest.
arXiv Detail & Related papers (2022-12-18T14:41:13Z)
- Principled Paraphrase Generation with Parallel Corpora [52.78059089341062]
We formalize the implicit similarity function induced by round-trip Machine Translation.
We show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation.
We design an alternative similarity metric that mitigates this issue.
arXiv Detail & Related papers (2022-05-24T17:22:42Z)
- Data-Driven Adaptive Simultaneous Machine Translation [51.01779863078624]
We propose a novel and efficient training scheme for adaptive SimulMT.
Our method outperforms all strong baselines in terms of translation quality and latency.
arXiv Detail & Related papers (2022-04-27T02:40:21Z)
- Anticipation-free Training for Simultaneous Translation [70.85761141178597]
Simultaneous translation (SimulMT) speeds up the translation process by starting to translate before the source sentence is completely available.
Existing methods increase latency or introduce adaptive read-write policies for SimulMT models to handle local reordering and improve translation quality.
We propose a new framework that decomposes the translation process into the monotonic translation step and the reordering step.
arXiv Detail & Related papers (2022-01-30T16:29:37Z)
- Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy [6.487736084189248]
Simultaneous machine translation (SiMT) generates translation before reading the entire source sentence.
Previous methods usually need to train multiple SiMT models for different latency levels, resulting in large computational costs.
We propose a universal SiMT model with Mixture-of-Experts Wait-k Policy to achieve the best translation quality under arbitrary latency.
arXiv Detail & Related papers (2021-09-11T09:43:15Z)
- Meta Back-translation [111.87397401837286]
We propose a novel method to generate pseudo-parallel data from a pre-trained back-translation model.
Our method is a meta-learning algorithm which adapts a pre-trained back-translation model so that the pseudo-parallel data it generates would train a forward-translation model to do well on a validation set.
arXiv Detail & Related papers (2021-02-15T20:58:32Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this automatically generated content (including all information) and is not responsible for any consequences of its use.