Learning Optimal Policy for Simultaneous Machine Translation via Binary Search
- URL: http://arxiv.org/abs/2305.12774v3
- Date: Sat, 27 May 2023 15:27:40 GMT
- Title: Learning Optimal Policy for Simultaneous Machine Translation via Binary Search
- Authors: Shoutao Guo, Shaolei Zhang, Yang Feng
- Abstract summary: Simultaneous machine translation (SiMT) starts to output the translation while reading the source sentence.
The policy determines the number of source tokens read during the translation of each target token.
We present a new method for constructing the optimal policy online via binary search.
- Score: 17.802607889752736
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Simultaneous machine translation (SiMT) starts to output the
translation while reading the source sentence and needs a precise policy to
decide when to output each generated token. The policy therefore determines
the number of source tokens read during the translation of each target token.
However, it is
difficult to learn a precise translation policy to achieve good latency-quality
trade-offs, because there is no golden policy corresponding to parallel
sentences as explicit supervision. In this paper, we present a new method for
constructing the optimal policy online via binary search. By employing explicit
supervision, our approach enables the SiMT model to learn the optimal policy,
which can guide the model in completing the translation during inference.
Experiments on four translation tasks show that our method can exceed strong
baselines across all latency scenarios.
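
As a rough illustration of the idea (not the paper's exact procedure), the sketch below assumes a hypothetical scoring function `score(j, t)` that rates how well a source prefix of length j supports generating target token t, and that is non-decreasing in j; that monotonicity is what makes binary search applicable.

```python
from typing import Callable

def optimal_reads(score: Callable[[int, int], float],
                  src_len: int,
                  tgt_len: int,
                  threshold: float) -> list[int]:
    """For each target step t, binary-search the smallest number of source
    tokens whose prefix already supports generating the reference token,
    i.e. the smallest j with score(j, t) >= threshold. Assumes score(j, t)
    is non-decreasing in j (reading more source never hurts), which is the
    property that makes binary search applicable."""
    schedule = []
    lo = 1  # carried across steps so the schedule is non-decreasing
    for t in range(1, tgt_len + 1):
        hi = src_len
        while lo < hi:
            mid = (lo + hi) // 2
            if score(mid, t) >= threshold:
                hi = mid       # a prefix of length mid is already enough
            else:
                lo = mid + 1   # need to read more source tokens
        schedule.append(lo)
    return schedule
```

Each target step then costs O(log |x|) scorer calls instead of O(|x|), and the resulting monotone read schedule is the kind of explicit supervision the abstract refers to.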
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Agent-SiMT: Agent-assisted Simultaneous Machine Translation with Large Language Models [38.49925017512848]
Simultaneous Machine Translation (SiMT) generates target translations while reading the source sentence.
Existing SiMT methods generally adopt the traditional Transformer architecture, which concurrently determines the policy and generates translations.
We introduce Agent-SiMT, a framework combining the strengths of Large Language Models (LLMs) and traditional SiMT methods.
arXiv Detail & Related papers (2024-06-11T03:09:20Z)
- SiLLM: Large Language Models for Simultaneous Machine Translation [41.303764786790616]
Simultaneous Machine Translation (SiMT) generates translations while reading the source sentence.
Existing SiMT methods employ a single model to concurrently determine the policy and generate the translations.
We propose SiLLM, which delegates the two sub-tasks to separate agents.
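
A minimal sketch of that two-agent decomposition (all interfaces below are hypothetical): a policy agent decides READ/WRITE while a separate translation agent, e.g. an LLM, produces target tokens.

```python
def simultaneous_decode(policy_agent, translation_agent, source_stream):
    """Two-agent SiMT loop: `policy_agent.decide` and
    `translation_agent.next_token` are hypothetical interfaces standing in
    for the separate policy and translation models."""
    src, tgt = [], []
    stream = iter(source_stream)
    exhausted = False
    while True:
        # Once the source is exhausted, only writing remains.
        action = "WRITE" if exhausted else policy_agent.decide(src, tgt)
        if action == "READ":
            token = next(stream, None)
            if token is None:
                exhausted = True
                continue
            src.append(token)
        else:  # WRITE
            word = translation_agent.next_token(src, tgt)
            if word == "<eos>":
                break
            tgt.append(word)
    return tgt
```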
arXiv Detail & Related papers (2024-02-20T14:23:34Z)
- Adaptive Policy with Wait-$k$ Model for Simultaneous Translation [20.45004823667775]
Simultaneous machine translation (SiMT) requires a robust read/write policy in conjunction with a high-quality translation model.
Traditional methods rely on either a fixed wait-$k$ policy coupled with a standalone wait-$k$ translation model, or an adaptive policy jointly trained with the translation model.
We propose a more flexible approach by decoupling the adaptive policy model from the translation model.
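
For context, the fixed wait-$k$ policy mentioned above follows a deterministic schedule: read k source tokens up front, then alternate writes and reads. A minimal sketch:

```python
def wait_k_reads(k: int, t: int, src_len: int) -> int:
    """Fixed wait-k policy: before emitting the t-th target token
    (1-indexed), the model has read g(t) = min(k + t - 1, src_len)
    source tokens, i.e. wait k tokens up front, then alternate
    read/write until the source runs out."""
    return min(k + t - 1, src_len)
```

For example, `[wait_k_reads(3, t, 8) for t in range(1, 9)]` gives `[3, 4, 5, 6, 7, 8, 8, 8]`.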
arXiv Detail & Related papers (2023-10-23T12:16:32Z)
- LEAPT: Learning Adaptive Prefix-to-prefix Translation For Simultaneous Machine Translation [6.411228564798412]
Simultaneous machine translation is useful in many live scenarios but very challenging due to the trade-off between accuracy and latency.
We propose a novel adaptive training policy called LEAPT, which allows our machine translation model to learn how to translate source prefixes and make use of the future context.
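
A hedged sketch of what prefix-to-prefix training data might look like (the `reads` schedule is a hypothetical input, e.g. derived from word alignments; LEAPT's actual construction may differ):

```python
def prefix_pairs(src: list[str], tgt: list[str], reads: list[int]):
    """Hypothetical prefix-to-prefix sample construction: reads[t] is the
    number of source tokens assumed available when target token t is
    produced. Each sample trains the model to extend a target prefix from
    a source *prefix* rather than from the full sentence, matching what a
    SiMT model actually sees at inference."""
    return [(src[:reads[t]], tgt[:t + 1]) for t in range(len(tgt))]
```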
arXiv Detail & Related papers (2023-03-21T11:17:37Z)
- Principled Paraphrase Generation with Parallel Corpora [52.78059089341062]
We formalize the implicit similarity function induced by round-trip Machine Translation.
We show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation.
We design an alternative similarity metric that mitigates this issue.
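
One way to write the similarity function that round-trip MT implicitly induces (a sketch; the paper's exact formalization may differ) is to marginalize over pivot translations e:

```latex
% Round-trip similarity of x and x', marginalizing over the pivot
% translation e (translate x forward into e, then back):
\[
  s(x, x') \;=\; \sum_{e} P(e \mid x)\, P(x' \mid e)
\]
```

A single ambiguous pivot e shared by two non-paraphrases can dominate this sum, which is exactly the failure mode noted above.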
arXiv Detail & Related papers (2022-05-24T17:22:42Z)
- Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation [75.86581380817464]
A SimulST system generally includes two components: the pre-decision that aggregates the speech information and the policy that decides to read or write.
This paper proposes to model the adaptive policy by adapting the Continuous Integrate-and-Fire (CIF) mechanism.
Compared with monotonic multihead attention (MMA), our method has the advantage of simpler computation, superior quality at low latency, and better generalization to long utterances.
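
A minimal sketch of the CIF firing rule (full CIF also emits a weighted average of the integrated frames; only the firing decision, which is what acts as the read/write policy here, is kept):

```python
def cif_fire_points(alphas: list[float], beta: float = 1.0) -> list[int]:
    """Continuous Integrate-and-Fire: each encoder frame t carries a
    predicted weight alpha_t in (0, 1); weights accumulate, and the model
    'fires' (decides to WRITE a target token) whenever the running sum
    crosses the threshold beta, carrying the remainder forward."""
    fires, acc = [], 0.0
    for t, a in enumerate(alphas):
        acc += a
        while acc >= beta:  # a heavy frame may trigger several fires
            fires.append(t)
            acc -= beta
    return fires
```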
arXiv Detail & Related papers (2022-03-22T23:33:18Z)
- Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy [6.487736084189248]
Simultaneous machine translation (SiMT) generates translation before reading the entire source sentence.
Previous methods usually need to train multiple SiMT models for different latency levels, resulting in large computational costs.
We propose a universal SiMT model with Mixture-of-Experts Wait-k Policy to achieve the best translation quality under arbitrary latency.
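
A rough sketch of the mixture idea under assumed shapes (each "expert" is a hidden state computed under a different wait-k lag; the gating here is illustrative, not the paper's exact design):

```python
import torch

def moe_wait_k_output(expert_outputs: torch.Tensor,
                      gate_logits: torch.Tensor) -> torch.Tensor:
    """Combine per-expert wait-k outputs with learned gate weights.
    expert_outputs: (n_experts, d_model), one hidden state per expert,
    each computed under a different wait-k lag; gate_logits: (n_experts,).
    A single model trained this way can serve arbitrary latency levels
    instead of requiring one model per k."""
    weights = torch.softmax(gate_logits, dim=-1)            # (n_experts,)
    return (weights.unsqueeze(-1) * expert_outputs).sum(dim=0)
```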
arXiv Detail & Related papers (2021-09-11T09:43:15Z)
- Meta Back-translation [111.87397401837286]
We propose a novel method to generate pseudo-parallel data from a pre-trained back-translation model.
Our method is a meta-learning algorithm which adapts a pre-trained back-translation model so that the pseudo-parallel data it generates would train a forward-translation model to do well on a validation set.
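
A heavily hedged sketch of that bilevel structure (every interface below is hypothetical): the back-translation model's update signal is the forward model's validation loss after an inner training step on the generated pseudo-parallel data.

```python
def meta_bt_step(back_model, fwd_model, mono_tgt, valid_batch,
                 opt_fwd, opt_bt):
    """Sketch only; all objects here are hypothetical stand-ins. One meta
    step: the back-translation model is scored by how much the
    pseudo-parallel data it generates improves the forward model on a
    held-out validation set."""
    # 1. Back-translate monolingual target text into pseudo source text.
    pseudo_src = back_model.translate(mono_tgt)
    # 2. Inner step: train the forward model on the pseudo-parallel pairs.
    opt_fwd.step(fwd_model.loss(pseudo_src, mono_tgt))
    # 3. Outer step: the forward model's validation loss is the meta
    #    signal that flows back (e.g. by differentiating through the
    #    inner update) to adapt the back-translation model.
    opt_bt.step(fwd_model.loss(*valid_batch))
```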
arXiv Detail & Related papers (2021-02-15T20:58:32Z)
- Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings [51.47607125262885]
We describe an unsupervised method to create pseudo-parallel corpora for machine translation (MT) from unaligned text.
We use multilingual BERT to create source and target sentence embeddings for nearest-neighbor search and adapt the model via self-training.
We validate our technique by extracting parallel sentence pairs on the BUCC 2017 bitext mining task and observe up to a 24.5 point increase (absolute) in F1 scores over previous unsupervised methods.
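
A minimal sketch of the nearest-neighbor mining step, using plain cosine similarity over L2-normalized sentence embeddings and a mutual-nearest-neighbor filter (the self-training loop that adapts the encoder is omitted):

```python
import numpy as np

def mine_bitext(src_emb: np.ndarray, tgt_emb: np.ndarray,
                threshold: float):
    """Mine pseudo-parallel pairs by nearest-neighbor search over sentence
    embeddings. Rows are sentences; embeddings are assumed L2-normalized,
    so the dot product is cosine similarity."""
    sim = src_emb @ tgt_emb.T          # (n_src, n_tgt) similarity matrix
    best_tgt = sim.argmax(axis=1)      # best target for each source
    best_src = sim.argmax(axis=0)      # best source for each target
    pairs = []
    for i, j in enumerate(best_tgt):
        # Keep mutual nearest neighbors above the similarity threshold.
        if best_src[j] == i and sim[i, j] >= threshold:
            pairs.append((i, int(j), float(sim[i, j])))
    return pairs
```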
arXiv Detail & Related papers (2020-10-15T14:04:03Z)
- Learning Coupled Policies for Simultaneous Machine Translation using Imitation Learning [85.70547744787]
We present an approach to efficiently learn a simultaneous translation model with coupled programmer-interpreter policies.
Experiments on six language pairs show that our method outperforms strong baselines in terms of translation quality.
arXiv Detail & Related papers (2020-02-11T10:56:42Z)