Modeling Dual Read/Write Paths for Simultaneous Machine Translation
- URL: http://arxiv.org/abs/2203.09163v1
- Date: Thu, 17 Mar 2022 08:35:36 GMT
- Title: Modeling Dual Read/Write Paths for Simultaneous Machine Translation
- Authors: Shaolei Zhang, Yang Feng
- Abstract summary: We propose a method of Dual Path SiMT which introduces duality constraints to guide the read/write path.
Experiments on En-Vi and De-En SiMT tasks show that our method can outperform strong baselines under all latency levels.
- Score: 21.03142288187605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Simultaneous machine translation (SiMT) outputs the translation while reading
the source sentence and hence requires a policy to determine whether to wait
for the next source word (READ) or generate a target word (WRITE), the actions
of which form a read/write path. Although the read/write path is essential to
SiMT performance, there is no direct supervision given to the path in the
existing methods. In this paper, we propose a method of Dual Path SiMT which
introduces duality constraints to guide the read/write path. According to
duality constraints, the read/write paths in source-to-target and
target-to-source SiMT models can be mapped to each other. Therefore, the SiMT
models in two directions are jointly optimized by forcing their read/write
paths to satisfy the mapping relation. Experiments on En-Vi and De-En SiMT
tasks show that our method can outperform strong baselines under all latency levels.
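To make the duality constraint concrete, below is a minimal sketch, not the authors' implementation, of one common way to represent a read/write path and its dual. It assumes the usual view of a SiMT policy as a monotone lattice path, where READ advances along the source axis and WRITE along the target axis; under that view, the target-to-source path is the transpose of the source-to-target path, i.e., the same action sequence with READ and WRITE swapped. The helper names (`dual_path`, `path_to_g`) are illustrative only.

```python
# Sketch of the read/write path duality (assumption: a SiMT policy
# viewed as a monotone lattice path; READ moves along the source axis,
# WRITE along the target axis).

READ, WRITE = "R", "W"

def dual_path(path):
    """Map a source-to-target read/write path to its target-to-source
    counterpart: transposing the lattice swaps READ and WRITE."""
    swap = {READ: WRITE, WRITE: READ}
    return [swap[a] for a in path]

def path_to_g(path):
    """g[j] = number of source tokens read before the (j+1)-th target
    token is written (the standard monotonic view of a SiMT policy)."""
    g, num_read = [], 0
    for action in path:
        if action == READ:
            num_read += 1
        else:
            g.append(num_read)
    return g

# Example: a wait-2-style path over 4 source and 3 target tokens.
p_s2t = [READ, READ, WRITE, READ, WRITE, READ, WRITE]
p_t2s = dual_path(p_s2t)

print(path_to_g(p_s2t))  # [2, 3, 4]
print(path_to_g(p_t2s))  # [0, 0, 1, 2] -- the dual policy over swapped axes
```

In this reading, joint optimization would penalize disagreement between the path taken by the target-to-source model and the dual of the source-to-target model's path; the exact form of the duality loss is not specified in the abstract.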
Related papers
- PsFuture: A Pseudo-Future-based Zero-Shot Adaptive Policy for Simultaneous Machine Translation [8.1299957975257]
Simultaneous Machine Translation (SiMT) requires target tokens to be generated in real time as streaming source tokens are consumed.
We propose PsFuture, the first zero-shot adaptive read/write policy for SiMT.
We introduce a novel training strategy, Prefix-to-Full (P2F), specifically tailored to adjust offline translation models for SiMT applications.
arXiv Detail & Related papers (2024-10-05T08:06:33Z)
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Self-Modifying State Modeling for Simultaneous Machine Translation [25.11963998838586]
Simultaneous Machine Translation (SiMT) generates target outputs while receiving streaming source inputs.
Existing SiMT methods, which learn the policy by exploring various decision paths in training, face inherent limitations.
We propose Self-Modifying State Modeling (SM$^2$), a novel training paradigm for the SiMT task.
arXiv Detail & Related papers (2024-06-04T11:57:58Z)
- SiLLM: Large Language Models for Simultaneous Machine Translation [41.303764786790616]
Simultaneous Machine Translation (SiMT) generates translations while reading the source sentence.
Existing SiMT methods employ a single model to concurrently determine the policy and generate the translations.
We propose SiLLM, which delegates the two sub-tasks to separate agents.
arXiv Detail & Related papers (2024-02-20T14:23:34Z)
- Text2MDT: Extracting Medical Decision Trees from Medical Texts [33.58610255918941]
We propose a novel task, Text2MDT, to explore the automatic extraction of medical decision trees (MDTs) from medical texts.
We normalize the form of the MDT and create an annotated Text-to-MDT dataset in Chinese with the participation of medical experts.
arXiv Detail & Related papers (2024-01-04T02:33:38Z)
- Simultaneous Machine Translation with Tailored Reference [35.46823126036308]
Simultaneous machine translation (SiMT) generates the translation while reading the source sentence.
Existing SiMT models are typically trained with the same reference, disregarding the varying amounts of source information available at different latency levels.
We propose a novel method that provides a tailored reference for SiMT models trained at different latency levels by rephrasing the ground truth.
arXiv Detail & Related papers (2023-10-20T15:32:26Z)
- Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing [72.56219471145232]
We propose a ST/MT multi-tasking framework with hard parameter sharing.
Our method reduces the speech-text modality gap via a pre-processing stage.
We show that our framework improves attentional encoder-decoder, Connectionist Temporal Classification (CTC), transducer, and joint CTC/attention models by an average of +0.5 BLEU.
arXiv Detail & Related papers (2023-09-27T17:48:14Z)
- Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation [53.342921374639346]
Multimodal machine translation aims to improve translation quality by incorporating information from other modalities, such as vision.
Previous MMT systems mainly focus on better access and use of visual information and tend to validate their methods on image-related datasets.
This paper establishes new methods and new datasets for MMT.
arXiv Detail & Related papers (2022-12-20T15:02:38Z)
- Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation [88.78138830698173]
We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
arXiv Detail & Related papers (2021-04-13T19:00:51Z)
- Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings [51.47607125262885]
We describe an unsupervised method to create pseudo-parallel corpora for machine translation (MT) from unaligned text.
We use multilingual BERT to create source and target sentence embeddings for nearest-neighbor search and adapt the model via self-training.
We validate our technique by extracting parallel sentence pairs on the BUCC 2017 bitext mining task and observe up to a 24.5 point increase (absolute) in F1 scores over previous unsupervised methods.
arXiv Detail & Related papers (2020-10-15T14:04:03Z)
- Neural Machine Translation: Challenges, Progress and Future [62.75523637241876]
Machine translation (MT) is a technique that leverages computers to translate human languages automatically.
Neural machine translation (NMT) models the direct mapping between source and target languages with deep neural networks.
This article reviews the NMT framework, discusses the challenges in NMT, and introduces some exciting recent progress.
arXiv Detail & Related papers (2020-04-13T07:53:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.