Lost in Interpreting: Speech Translation from Source or Interpreter?
- URL: http://arxiv.org/abs/2106.09343v1
- Date: Thu, 17 Jun 2021 09:32:49 GMT
- Title: Lost in Interpreting: Speech Translation from Source or Interpreter?
- Authors: Dominik Macháček, Matúš Žilinec, Ondřej Bojar
- Abstract summary: We release 10 hours of recordings and transcripts of European Parliament speeches in English, with simultaneous interpreting into Czech and German.
We evaluate quality and latency of speaker-based and interpreter-based spoken translation systems from English to Czech.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Interpreters facilitate multi-lingual meetings but the affordable set of
languages is often smaller than what is needed. Automatic simultaneous speech
translation can extend the set of provided languages. We investigate whether such
an automatic system should follow the original speaker, or instead an
interpreter, to achieve better translation quality at the cost of increased delay.
To answer the question, we release Europarl Simultaneous Interpreting Corpus
(ESIC), 10 hours of recordings and transcripts of European Parliament speeches
in English, with simultaneous interpreting into Czech and German. We evaluate
quality and latency of speaker-based and interpreter-based spoken translation
systems from English to Czech. We study the differences in implicit
simplification and summarization between the human interpreter and a machine
translation system trained to moderately shorten its output. Finally, we
perform human evaluation to measure information loss of each of these
approaches.
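The latency comparison above hinges on measuring how far the system output lags behind the original speech. A minimal sketch of one way to quantify that delay follows; the 1:1 positional alignment and all timestamps are invented for illustration and do not reproduce the paper's actual metric.

```python
from statistics import mean

def average_delay(source_word_times, output_times):
    """Mean lag (seconds) between the moment a source word is spoken
    and the moment the system emits the output aligned to it.

    Assumes a simple 1:1 positional alignment between source words
    and output emission events -- a simplification for illustration.
    """
    return mean(out_t - src_t
                for src_t, out_t in zip(source_word_times, output_times))

# Hypothetical timestamps: the interpreter-based pipeline listens to the
# interpreter, who already lags behind the original speaker.
speaker_times     = [0.4, 1.1, 1.9, 2.6]   # when source words were spoken
speaker_based_out = [2.1, 2.8, 3.5, 4.0]   # MT following the speaker
interp_based_out  = [4.9, 5.6, 6.4, 7.1]   # MT following the interpreter

print(f"speaker-based delay:     {average_delay(speaker_times, speaker_based_out):.2f} s")
print(f"interpreter-based delay: {average_delay(speaker_times, interp_based_out):.2f} s")
```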
Related papers
- Towards a Deep Understanding of Multilingual End-to-End Speech Translation [52.26739715012842]
We analyze representations learnt in a multilingual end-to-end speech translation model trained over 22 languages.
We derive three major findings from our analysis.
arXiv Detail & Related papers (2023-10-31T13:50:55Z)
- SeamlessM4T: Massively Multilingual & Multimodal Machine Translation [90.71078166159295]
We introduce SeamlessM4T, a single model that supports speech-to-speech translation, speech-to-text translation, text-to-text translation, and automatic speech recognition for up to 100 languages.
We develop the first multilingual system capable of translating from and into English for both speech and text.
On FLEURS, SeamlessM4T sets a new standard for translations into multiple target languages, achieving an improvement of 20% BLEU over the previous SOTA in direct speech-to-text translation.
arXiv Detail & Related papers (2023-08-22T17:44:18Z)
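The SeamlessM4T entry above reports gains in corpus-level BLEU. As a hedged sketch of how such scores are typically computed with the sacrebleu package (an assumed dependency; the toy Czech sentences are invented, not from FLEURS):

```python
# pip install sacrebleu  -- assumed dependency for this sketch
import sacrebleu

# Hypothetical En->Cs system outputs and references (invented sentences).
hypotheses = ["toto je test", "mluvime o tlumoceni"]
references = [["toto je test", "mluvime o simultannim tlumoceni"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.CHRF(word_order=2).corpus_score(hypotheses, references)  # chrF++
print(f"BLEU   = {bleu.score:.1f}")
print(f"chrF++ = {chrf.score:.1f}")
```

The chrF++ line also covers the metric reported by the DecoMT paper further down this list.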
- Robustness of Multi-Source MT to Transcription Errors [9.045660146260467]
In a multilingual scenario, the same content may be available in various languages via simultaneous interpreting, dubbing or subtitling.
We show that on a 10-hour ESIC corpus, the ASR errors in the original English speech and its simultaneous interpreting into German and Czech are mutually independent.
Our results show that multi-source neural machine translation has the potential to be useful in a real-time simultaneous translation setting.
arXiv Detail & Related papers (2023-05-26T12:54:16Z)
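The independence finding above can be pictured as follows: per-segment ASR error rates on the English original and on the German interpreting should be nearly uncorrelated. Below is a toy probe in that spirit, assuming the jiwer package and wholly invented segments; this is not the paper's actual analysis.

```python
# pip install jiwer  -- assumed dependency for this sketch
from statistics import correlation  # Python 3.10+
import jiwer

# Hypothetical per-segment (reference, ASR hypothesis) pairs for two
# parallel streams: the original English speech and its simultaneous
# interpreting into German.  All sentences are invented.
english = [
    ("the meeting is open", "the meeting is open"),
    ("we vote on the amendment", "we vote on amendment"),
    ("the floor goes to the rapporteur", "the four goes to the rapporteur"),
]
german = [
    ("die sitzung ist eroeffnet", "die sitzung ist eroeffnet"),
    ("wir stimmen ueber den antrag ab", "wir stimmen ueber den antrag"),
    ("das wort hat der berichterstatter", "das wort hat der berichterstatter"),
]

wer_en = [jiwer.wer(ref, hyp) for ref, hyp in english]
wer_de = [jiwer.wer(ref, hyp) for ref, hyp in german]

# A correlation near zero would be consistent with mutually independent errors.
print("per-segment WER (en):", wer_en)
print("per-segment WER (de):", wer_de)
print("Pearson r:", correlation(wer_en, wer_de))
```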
- Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models [55.35106713257871]
We introduce DecoMT, a novel few-shot prompting approach that decomposes the translation process into a sequence of word-chunk translations.
We show that DecoMT outperforms the strong few-shot prompting BLOOM baseline with an average improvement of 8 chrF++ points across the examined languages.
arXiv Detail & Related papers (2023-05-22T14:52:47Z)
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
- Comprehension of Subtitles from Re-Translating Simultaneous Speech Translation [0.0]
In simultaneous speech translation, one can vary the size of the output window, system latency and sometimes the allowed level of rewriting.
The effect of these properties on readability and comprehensibility has not been tested with modern neural translation systems.
We present a pilot study with 14 users on 2 hours of German documentaries or speeches with online translations into Czech.
arXiv Detail & Related papers (2022-03-04T17:41:39Z)
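A re-translation system like the one studied above keeps revising already displayed subtitles, so the amount of rewriting the viewer sees is itself a measurable property. The sketch below counts word-level "flicker" between successive hypotheses; it is a toy illustration of the concept, not the study's methodology, and the Czech updates are invented.

```python
def flicker(updates):
    """Count words that change between consecutive re-translation updates.

    Each update is the full subtitle hypothesis so far; a word "flickers"
    when position i already existed in the previous update but its
    content differs.  Toy metric for illustration only.
    """
    changed = 0
    for prev, curr in zip(updates, updates[1:]):
        prev_w, curr_w = prev.split(), curr.split()
        overlap = min(len(prev_w), len(curr_w))
        changed += sum(p != c for p, c in zip(prev_w[:overlap], curr_w[:overlap]))
    return changed

# Hypothetical successive hypotheses of a re-translation system (En->Cs).
updates = [
    "mluvime o",
    "mluvime o prekladu",
    "hovorime o simultannim prekladu",   # rewrites two earlier words
]
print("flickering words:", flicker(updates))  # -> 2
```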
- BSTC: A Large-Scale Chinese-English Speech Translation Dataset [26.633433687767553]
BSTC (Baidu Speech Translation Corpus) is a large-scale Chinese-English speech translation dataset.
This dataset is constructed based on a collection of licensed videos of talks or lectures, including about 68 hours of Mandarin data.
We asked three experienced interpreters to simultaneously interpret the test talks in a mock conference setting.
arXiv Detail & Related papers (2021-04-08T07:38:51Z)
- Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training [40.71155396456831]
Simultaneous speech-to-speech translation is widely useful but extremely challenging.
It needs to generate target-language speech concurrently with the source-language speech, with only a few seconds delay.
Current approaches accumulate latencies progressively when the speaker talks faster, and introduce unnatural pauses when the speaker talks slower.
We propose Self-Adaptive Translation (SAT) which flexibly adjusts the length of translations to accommodate different source speech rates.
arXiv Detail & Related papers (2020-10-20T06:02:15Z)
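The SAT entry above turns on one idea: bound latency by adapting translation length to the source speaking rate. A loose toy illustration of that length-budget arithmetic follows; it is not the actual SAT algorithm, and the TTS speaking rate is an invented parameter.

```python
def target_length_budget(source_duration_s: float,
                         tts_rate_wps: float = 2.5) -> int:
    """How many target words fit if the translated speech must end at
    roughly the same time as the source segment.

    tts_rate_wps: assumed speaking rate of the target TTS voice in
    words per second.  Toy rate-adaptive length control, not SAT.
    """
    # A faster speaker leaves less time per segment, so the translation
    # must be shorter to avoid progressively accumulating latency.
    return max(1, round(source_duration_s * tts_rate_wps))

# The same content in a 4 s segment allows ~10 target words, but only
# ~5 when a fast speaker compresses it into 2 s.
print(target_length_budget(4.0))  # -> 10
print(target_length_budget(2.0))  # -> 5
```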
- "Listen, Understand and Translate": Triple Supervision Decouples End-to-end Speech-to-text Translation [49.610188741500274]
An end-to-end speech-to-text translation (ST) system takes audio in a source language and outputs text in a target language.
Existing methods are limited by the amount of parallel data available.
We build a system to fully utilize signals in a parallel ST corpus.
arXiv Detail & Related papers (2020-09-21T09:19:07Z)
- Self-Supervised Representations Improve End-to-End Speech Translation [57.641761472372814]
We show that self-supervised pre-trained features can consistently improve the translation performance.
Cross-lingual transfer allows the approach to extend to a variety of languages with little or no tuning.
arXiv Detail & Related papers (2020-06-22T10:28:38Z)