Anticipating Future with Large Language Model for Simultaneous Machine Translation
- URL: http://arxiv.org/abs/2410.22499v1
- Date: Tue, 29 Oct 2024 19:42:30 GMT
- Title: Anticipating Future with Large Language Model for Simultaneous Machine Translation
- Authors: Siqi Ouyang, Oleksii Hrinchuk, Zhehuai Chen, Vitaly Lavrukhin, Jagadeesh Balam, Lei Li, Boris Ginsburg
- Abstract summary: Simultaneous machine translation (SMT) takes streaming input utterances and incrementally produces target text.
We propose $\textbf{T}$ranslation by $\textbf{A}$nticipating $\textbf{F}$uture (TAF).
Its core idea is to use a large language model (LLM) to predict future source words and opportunistically translate without introducing too much risk.
- Score: 27.613918824789877
- Abstract: Simultaneous machine translation (SMT) takes streaming input utterances and incrementally produces target text. Existing SMT methods only use the partial utterance that has already arrived at the input and the generated hypothesis. Motivated by human interpreters' technique to forecast future words before hearing them, we propose $\textbf{T}$ranslation by $\textbf{A}$nticipating $\textbf{F}$uture (TAF), a method to improve translation quality while retaining low latency. Its core idea is to use a large language model (LLM) to predict future source words and opportunistically translate without introducing too much risk. We evaluate our TAF and multiple baselines of SMT on four language directions. Experiments show that TAF achieves the best translation quality-latency trade-off and outperforms the baselines by up to 5 BLEU points at the same latency (three words).
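The abstract states the mechanism only at a high level, so the following is a minimal sketch of one plausible realization: an LLM samples several continuations of the partial source, each extended input is translated, and only target words on which all continuations agree are committed. The helpers `llm_continue` and `translate`, the sample count, and the agreement rule are assumptions for illustration, not the paper's exact algorithm.

```python
def taf_step(source_prefix, committed_target, llm_continue, translate,
             num_samples=3, horizon=5):
    """One read/write step of an anticipation-based SMT policy (sketch).

    Assumptions: `llm_continue(prefix, horizon)` samples a plausible
    continuation of the source (a list of words) from an LLM, and
    `translate(source, target_prefix)` returns a full target hypothesis
    constrained to start with `target_prefix`. Both are hypothetical
    stand-ins, not interfaces from the paper.
    """
    continuations = []
    for _ in range(num_samples):
        future = llm_continue(source_prefix, horizon)              # anticipate unseen source words
        hyp = translate(source_prefix + future, committed_target)  # translate the extended input
        continuations.append(hyp[len(committed_target):])          # keep only the newly produced words

    # Risk control (assumed heuristic): commit only the longest prefix on
    # which every sampled future agrees, so a wrong anticipation cannot
    # leak into the committed output.
    safe = []
    for tokens in zip(*continuations):
        if all(t == tokens[0] for t in tokens):
            safe.append(tokens[0])
        else:
            break
    return committed_target + safe
```

The caller appends each newly arrived source word to `source_prefix` and invokes the step again; when nothing survives the agreement check, the policy simply reads more input, which is how latency is traded against risk.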
Related papers
- Language Model is a Branch Predictor for Simultaneous Machine Translation [73.82754138171587]
We propose incorporating branch prediction techniques in SiMT tasks to reduce translation latency.
We utilize a language model as a branch predictor to predict potential branch directions.
When the actual source word deviates from the predicted source word, we use the real source word to decode the output again, replacing the predicted output.
arXiv Detail & Related papers (2023-12-22T07:32:47Z)
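As a rough illustration of the predict-then-verify loop described in this entry, here is a hedged sketch in which a language model guesses the next source word so decoding can start early, and the output is recomputed only on a misprediction. `predict_next_source` and `decode` are hypothetical stand-ins, not the paper's actual interfaces.

```python
def speculative_simt_step(source_so_far, target_so_far,
                          predict_next_source, decode, real_next_word):
    """Predict-then-verify step in the style of branch prediction (sketch).

    Assumptions: `predict_next_source(prefix)` is an LM call returning the
    most likely next source word, and `decode(source, target_prefix)` returns
    the target continuation for that source. Both are hypothetical helpers.
    """
    predicted = predict_next_source(source_so_far)
    speculative_output = decode(source_so_far + [predicted], target_so_far)

    if predicted == real_next_word:
        # Correct prediction: keep the precomputed output, so decoding
        # latency is hidden behind the wait for the next source word.
        return source_so_far + [predicted], speculative_output

    # Misprediction: decode again with the real source word, replacing
    # the output produced from the predicted branch.
    corrected_output = decode(source_so_far + [real_next_word], target_so_far)
    return source_so_far + [real_next_word], corrected_output
```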
- CBSiMT: Mitigating Hallucination in Simultaneous Machine Translation with Weighted Prefix-to-Prefix Training [13.462260072313894]
Simultaneous machine translation (SiMT) is a challenging task that requires starting translation before the full source sentence is available.
The prefix-to-prefix framework is often applied to SiMT, learning to predict target tokens using only a partial source prefix.
We propose a Confidence-Based Simultaneous Machine Translation framework, which uses model confidence to identify hallucinated tokens.
arXiv Detail & Related papers (2023-11-07T02:44:45Z)
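A minimal sketch of what confidence-weighted prefix-to-prefix training could look like, assuming the model's own per-token confidence is used to down-weight likely hallucinations in the loss; the clamping scheme and hyperparameter are illustrative, not CBSiMT's exact formulation.

```python
import torch

def confidence_weighted_prefix_loss(token_logprobs, token_confidence, min_weight=0.1):
    """Confidence-weighted token loss for prefix-to-prefix training (sketch).

    Assumption: target tokens that the model assigns low confidence under a
    partial source prefix are more likely to be hallucinated, so their
    training weight is reduced. `token_logprobs` and `token_confidence` are
    1-D tensors over the target tokens of one prefix pair.
    """
    weights = token_confidence.clamp(min=min_weight).detach()  # no gradient through the weights
    return -(weights * token_logprobs).sum() / weights.sum()

# Toy example: the low-confidence second token contributes little to the loss.
logp = torch.log(torch.tensor([0.60, 0.05, 0.40]))
conf = torch.tensor([0.90, 0.05, 0.70])
loss = confidence_weighted_prefix_loss(logp, conf)
```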
- Improving speech translation by fusing speech and text [24.31233927318388]
We harness the complementary strengths of speech and text, which are disparate modalities.
We propose $\textbf{F}$use-$\textbf{S}$peech-$\textbf{T}$ext ($\textbf{FST}$), a cross-modal model which supports three distinct input modalities for translation.
arXiv Detail & Related papers (2023-05-23T13:13:48Z)
- Anticipation-free Training for Simultaneous Translation [70.85761141178597]
Simultaneous translation (SimulMT) speeds up the translation process by starting to translate before the source sentence is completely available.
Existing methods increase latency or introduce adaptive read-write policies for SimulMT models to handle local reordering and improve translation quality.
We propose a new framework that decomposes the translation process into the monotonic translation step and the reordering step.
arXiv Detail & Related papers (2022-01-30T16:29:37Z)
- Emergent Communication Pretraining for Few-Shot Machine Translation [66.48990742411033]
We pretrain neural networks via emergent communication from referential games.
Our key assumption is that grounding communication on images (as a crude approximation of real-world environments) inductively biases the model towards learning natural languages.
arXiv Detail & Related papers (2020-11-02T10:57:53Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
- Language Model Prior for Low-Resource Neural Machine Translation [85.55729693003829]
We propose a novel approach to incorporate an LM as a prior in a neural translation model (TM).
We add a regularization term, which pushes the output distributions of the TM to be probable under the LM prior.
Results on two low-resource machine translation datasets show clear improvements even with limited monolingual data.
arXiv Detail & Related papers (2020-04-30T16:29:56Z)
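A small sketch of a loss with an LM prior regularizer in the spirit of the description above: the usual translation cross-entropy plus a KL term that penalizes probability mass the translation model places on tokens the target-side LM finds unlikely. The KL direction, temperature, and weight are assumptions, not necessarily the paper's exact objective.

```python
import torch.nn.functional as F

def lm_prior_loss(tm_logits, lm_logits, targets, pad_id=0, lam=0.5, tau=2.0):
    """Translation loss with an LM-prior regularizer (sketch).

    `tm_logits` are the translation model's logits (batch, seq, vocab),
    `lm_logits` come from a frozen target-side LM with the same shape, and
    `targets` are gold token ids (batch, seq).
    """
    # Standard token-level cross-entropy on the translation model.
    nll = F.cross_entropy(tm_logits.transpose(1, 2), targets, ignore_index=pad_id)

    # KL(TM || LM) on temperature-softened distributions: discourages the TM
    # from placing probability mass where the LM prior assigns little.
    tm_logp = F.log_softmax(tm_logits / tau, dim=-1)
    lm_logp = F.log_softmax(lm_logits / tau, dim=-1).detach()  # LM is a fixed prior
    kl = F.kl_div(lm_logp, tm_logp, log_target=True, reduction="none").sum(-1)

    mask = (targets != pad_id).float()
    return nll + lam * (kl * mask).sum() / mask.sum().clamp(min=1.0)
```

In practice the LM would be trained on target-side monolingual data and kept frozen while the translation model is trained, which matches the abstract's low-resource, limited-monolingual-data setting.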
- Modeling Future Cost for Neural Machine Translation [62.427034890537676]
We propose a simple and effective method to model the future cost of each target word for NMT systems.
The proposed approach achieves significant improvements over a strong Transformer-based NMT baseline.
arXiv Detail & Related papers (2020-02-28T05:37:06Z)