An Empirical Study of End-to-end Simultaneous Speech Translation
Decoding Strategies
- URL: http://arxiv.org/abs/2103.03233v1
- Date: Thu, 4 Mar 2021 18:55:40 GMT
- Title: An Empirical Study of End-to-end Simultaneous Speech Translation
Decoding Strategies
- Authors: Ha Nguyen, Yannick Est\`eve, Laurent Besacier
- Abstract summary: This paper proposes a decoding strategy for end-to-end simultaneous speech translation.
We leverage end-to-end models trained in offline mode and conduct an empirical study for two language pairs.
- Score: 17.78024523121448
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a decoding strategy for end-to-end simultaneous speech
translation. We leverage end-to-end models trained in offline mode and conduct
an empirical study for two language pairs (English-to-German and
English-to-Portuguese). We also investigate different output token
granularities including characters and Byte Pair Encoding (BPE) units. The
results show that the proposed decoding approach allows to control BLEU/Average
Lagging trade-off along different latency regimes. Our best decoding settings
achieve comparable results with a strong cascade model evaluated on the
simultaneous translation track of IWSLT 2020 shared task.
Related papers
- Multilingual Contrastive Decoding via Language-Agnostic Layers Skipping [60.458273797431836]
Decoding by contrasting layers (DoLa) is designed to improve the generation quality of large language models.
We find that this approach does not work well on non-English tasks.
Inspired by previous interpretability work on language transition during the model's forward pass, we propose an improved contrastive decoding algorithm.
arXiv Detail & Related papers (2024-07-15T15:14:01Z) - Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding [46.485363806259265]
Speculative Decoding has emerged as a novel decoding paradigm for Large Language Models (LLMs) inference.
In each decoding step, this method first drafts several future tokens efficiently and then verifies them in parallel.
This paper presents a comprehensive overview and analysis of this promising decoding paradigm.
arXiv Detail & Related papers (2024-01-15T17:26:50Z) - Dual-Alignment Pre-training for Cross-lingual Sentence Embedding [79.98111074307657]
We propose a dual-alignment pre-training (DAP) framework for cross-lingual sentence embedding.
We introduce a novel representation translation learning (RTL) task, where the model learns to use one-side contextualized token representation to reconstruct its translation counterpart.
Our approach can significantly improve sentence embedding.
arXiv Detail & Related papers (2023-05-16T03:53:30Z) - VECO 2.0: Cross-lingual Language Model Pre-training with
Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, the sequence-to-sequence alignment is induced to maximize the similarity of the parallel pairs and minimize the non-parallel pairs.
token-to-token alignment is integrated to bridge the gap between synonymous tokens excavated via the thesaurus dictionary from the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z) - Impact of Encoding and Segmentation Strategies on End-to-End
Simultaneous Speech Translation [17.78024523121448]
This paper investigates two key aspects of end-to-end simultaneous speech translation: how to encode efficiently the continuous speech flow and how to segment the speech flow.
We extend our previously proposed end-to-end online decoding strategy and show that while replacing BLSTM by ULSTM encoding degrades performance in offline mode, it actually improves both efficiency and performance in online mode.
arXiv Detail & Related papers (2021-04-29T16:33:08Z) - Bridging the Modality Gap for Speech-to-Text Translation [57.47099674461832]
End-to-end speech translation aims to translate speech in one language into text in another language via an end-to-end way.
Most existing methods employ an encoder-decoder structure with a single encoder to learn acoustic representation and semantic information simultaneously.
We propose a Speech-to-Text Adaptation for Speech Translation model which aims to improve the end-to-end model performance by bridging the modality gap between speech and text.
arXiv Detail & Related papers (2020-10-28T12:33:04Z) - SimulEval: An Evaluation Toolkit for Simultaneous Translation [59.02724214432792]
Simultaneous translation on both text and speech focuses on a real-time and low-latency scenario.
SimulEval is an easy-to-use and general evaluation toolkit for both simultaneous text and speech translation.
arXiv Detail & Related papers (2020-07-31T17:44:41Z) - Learning Coupled Policies for Simultaneous Machine Translation using
Imitation Learning [85.70547744787]
We present an approach to efficiently learn a simultaneous translation model with coupled programmer-interpreter policies.
Experiments on six language-pairs show our method outperforms strong baselines in terms of translation quality.
arXiv Detail & Related papers (2020-02-11T10:56:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.