PsFuture: A Pseudo-Future-based Zero-Shot Adaptive Policy for Simultaneous Machine Translation
- URL: http://arxiv.org/abs/2410.04075v1
- Date: Sat, 5 Oct 2024 08:06:33 GMT
- Title: PsFuture: A Pseudo-Future-based Zero-Shot Adaptive Policy for Simultaneous Machine Translation
- Authors: Libo Zhao, Jing Li, Ziqian Zeng
- Abstract summary: Simultaneous Machine Translation (SiMT) requires target tokens to be generated in real-time as streaming source tokens are consumed.
We propose PsFuture, the first zero-shot adaptive read/write policy for SiMT.
We introduce a novel training strategy, Prefix-to-Full (P2F), specifically tailored to adjust offline translation models for SiMT applications.
- Score: 8.1299957975257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Simultaneous Machine Translation (SiMT) requires target tokens to be generated in real-time as streaming source tokens are consumed. Traditional approaches to SiMT typically require sophisticated architectures and extensive parameter configurations for training adaptive read/write policies, which in turn demand considerable computational power and memory. We propose PsFuture, the first zero-shot adaptive read/write policy for SiMT, enabling the translation model to independently determine read/write actions without the necessity for additional training. Furthermore, we introduce a novel training strategy, Prefix-to-Full (P2F), specifically tailored to adjust offline translation models for SiMT applications, exploiting the advantages of the bidirectional attention mechanism inherent in offline models. Experiments across multiple benchmarks demonstrate that our zero-shot policy attains performance on par with strong baselines and the P2F method can further enhance performance, achieving an outstanding trade-off between translation quality and latency.
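As a rough illustration of what an adaptive read/write policy has to do at inference time, the sketch below shows a generic simultaneous decoding loop in which a policy function chooses between reading the next source token and writing the next target token. The `model.predict_next` helper and the `should_write` criterion are hypothetical placeholders; they are not the PsFuture decision rule or the P2F training procedure described in the paper.

```python
# Minimal sketch of a simultaneous-translation read/write loop.
# `model.predict_next` and `should_write` are hypothetical placeholders,
# not the PsFuture decision rule or the P2F training recipe from the paper.

def simultaneous_decode(model, source_stream, should_write, max_target_len=128):
    """Interleave READ (consume a source token) and WRITE (emit a target token)."""
    src_prefix, tgt_prefix = [], []
    source_iter = iter(source_stream)
    source_done = False

    while len(tgt_prefix) < max_target_len:
        # WRITE when the policy is confident enough given the current source
        # prefix, or when the entire source has already been read.
        if source_done or should_write(model, src_prefix, tgt_prefix):
            token = model.predict_next(src_prefix, tgt_prefix)
            if token == "</s>":          # end-of-sentence marker: stop writing
                break
            tgt_prefix.append(token)
        else:
            # READ: consume one more token from the streaming source.
            try:
                src_prefix.append(next(source_iter))
            except StopIteration:
                source_done = True

    return tgt_prefix
```

Latency metrics such as Average Lagging are then computed from how many source tokens had been read before each target token was written, which is the latency side of the quality/latency trade-off discussed in the abstract.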
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z) - Self-Modifying State Modeling for Simultaneous Machine Translation [25.11963998838586]
Simultaneous Machine Translation (SiMT) generates target outputs while receiving streaming source inputs.
Existing SiMT methods, which learn the policy by exploring various decision paths in training, face inherent limitations.
We propose Self-Modifying State Modeling (SM$^2$), a novel training paradigm for the SiMT task.
arXiv Detail & Related papers (2024-06-04T11:57:58Z) - Unleashing the Power of Pre-trained Language Models for Offline
Reinforcement Learning [54.682106515794864]
Offline reinforcement learning (RL) aims to find a near-optimal policy using pre-collected datasets.
This paper introduces Language Models for Motion Control (LaMo), a general framework based on Decision Transformers to use pre-trained Language Models (LMs) for offline RL.
Empirical results indicate LaMo achieves state-of-the-art performance in sparse-reward tasks.
arXiv Detail & Related papers (2023-10-31T16:24:17Z) - Adaptive Policy with Wait-$k$ Model for Simultaneous Translation [20.45004823667775]
Simultaneous machine translation (SiMT) requires a robust read/write policy in conjunction with a high-quality translation model.
Traditional methods rely on either a fixed wait-$k$ policy coupled with a standalone wait-$k$ translation model, or an adaptive policy jointly trained with the translation model.
We propose a more flexible approach by decoupling the adaptive policy model from the translation model.
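For contrast, the fixed wait-$k$ schedule mentioned above is simple enough to state directly: read $k$ source tokens first, then alternate one write with one read. The function below is a minimal, illustrative sketch of that schedule, not code from the cited paper.

```python
# Illustrative fixed wait-k schedule: read k source tokens up front, then
# alternate WRITE/READ; once the source is exhausted, keep writing.
# Not taken from the cited paper.

def wait_k_action(num_read, num_written, k, source_finished):
    """Return 'READ' or 'WRITE' for the classic wait-k policy."""
    if source_finished:
        return "WRITE"                     # nothing left to read
    if num_read < num_written + k:
        return "READ"                      # stay k tokens ahead of the output
    return "WRITE"
```

An adaptive policy replaces this fixed schedule with a decision that depends on the model's current state, which is what the decoupled policy model in this paper learns.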
arXiv Detail & Related papers (2023-10-23T12:16:32Z) - Simultaneous Machine Translation with Tailored Reference [35.46823126036308]
Simultaneous machine translation (SiMT) generates the translation while still reading the source sentence.
Existing SiMT models are typically trained with the same reference, disregarding the varying amount of source information available at different latencies.
We propose a novel method that provides a tailored reference for SiMT models trained at different latencies by rephrasing the ground truth.
arXiv Detail & Related papers (2023-10-20T15:32:26Z) - Glancing Future for Simultaneous Machine Translation [35.46823126036308]
We propose a novel method to bridge the gap between prefix2prefix training and seq2seq training.
We gradually reduce the available source information from the whole sentence to the prefix corresponding to the target latency.
Our method is applicable to a wide range of SiMT methods and experiments demonstrate that our method outperforms strong baselines.
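One way to picture the curriculum described above is a schedule that starts from the full source sentence and gradually shrinks the visible context toward the prefix a wait-$k$ decoder would see. The linear schedule and the function name below are illustrative assumptions, not the paper's exact annealing scheme.

```python
# Illustrative curriculum: interpolate the visible source length between the
# full sentence and the wait-k prefix as training progresses.
# The linear schedule is an assumption, not the cited paper's exact method.

def visible_source_length(src_len, target_step, k, progress):
    """progress in [0, 1]: 0 = see the whole sentence, 1 = see only the wait-k prefix."""
    prefix = min(src_len, target_step + k)            # tokens a wait-k decoder sees
    visible = round(src_len - progress * (src_len - prefix))
    return max(prefix, min(src_len, visible))
```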
arXiv Detail & Related papers (2023-09-12T12:46:20Z) - LEAPT: Learning Adaptive Prefix-to-prefix Translation For Simultaneous
Machine Translation [6.411228564798412]
Simultaneous machine translation is useful in many live scenarios but very challenging due to the trade-off between accuracy and latency.
We propose a novel adaptive training policy called LEAPT, which allows our machine translation model to learn how to translate source prefixes and make use of the future context.
arXiv Detail & Related papers (2023-03-21T11:17:37Z) - Data-Driven Adaptive Simultaneous Machine Translation [51.01779863078624]
We propose a novel and efficient training scheme for adaptive SimulMT.
Our method outperforms all strong baselines in terms of translation quality and latency.
arXiv Detail & Related papers (2022-04-27T02:40:21Z) - Improving Neural Machine Translation by Denoising Training [95.96569884410137]
We present a simple and effective pretraining strategy, Denoising Training (DoT), for neural machine translation.
We update the model parameters with source- and target-side denoising tasks at the early stage and then tune the model normally.
Experiments show DoT consistently improves the neural machine translation performance across 12 bilingual and 16 multilingual directions.
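The two-stage recipe above can be pictured as corrupting tokens on the source or target side early in training and later switching to standard training on clean parallel data. The masking function below is a generic illustration of such a denoising objective, not the exact corruption used by DoT in the cited paper.

```python
import random

# Generic token-masking noise for a denoising objective; an illustration,
# not the exact corruption scheme used by DoT in the cited paper.

def mask_tokens(tokens, mask_ratio=0.15, mask_token="<mask>", seed=None):
    """Randomly replace a fraction of tokens with a mask symbol."""
    rng = random.Random(seed)
    return [mask_token if rng.random() < mask_ratio else tok for tok in tokens]
```

Training would first mix such denoising batches (on the source side, the target side, or both) with the usual translation loss, and then continue with standard fine-tuning.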
arXiv Detail & Related papers (2022-01-19T00:11:38Z) - Source and Target Bidirectional Knowledge Distillation for End-to-end
Speech Translation [88.78138830698173]
We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
arXiv Detail & Related papers (2021-04-13T19:00:51Z) - On Learning Text Style Transfer with Direct Rewards [101.97136885111037]
Lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task.
We leverage semantic similarity metrics originally used for fine-tuning neural machine translation models.
Our model provides significant gains in both automatic and human evaluation over strong baselines.
arXiv Detail & Related papers (2020-10-24T04:30:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.