From Simultaneous to Streaming Machine Translation by Leveraging
Streaming History
- URL: http://arxiv.org/abs/2203.02459v1
- Date: Fri, 4 Mar 2022 17:41:45 GMT
- Title: From Simultaneous to Streaming Machine Translation by Leveraging
Streaming History
- Authors: Javier Iranzo-Sánchez, Jorge Civera and Alfons Juan
- Abstract summary: Simultaneous Machine Translation is the task of incrementally translating an input sentence before it is fully available.
Streaming MT can be understood as an extension of Simultaneous MT to the incremental translation of a continuous input text stream.
In this work, a state-of-the-art simultaneous sentence-level MT system is extended to the streaming setup by leveraging the streaming history.
- Score: 4.831134508326648
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Simultaneous Machine Translation is the task of incrementally translating an
input sentence before it is fully available. Currently, simultaneous
translation is carried out by translating each sentence independently of the
previously translated text. More generally, Streaming MT can be understood as
an extension of Simultaneous MT to the incremental translation of a continuous
input text stream. In this work, a state-of-the-art simultaneous sentence-level
MT system is extended to the streaming setup by leveraging the streaming
history. Extensive empirical results are reported on IWSLT Translation Tasks,
showing that leveraging the streaming history leads to significant quality
gains. In particular, the proposed system compares favorably with the
best-performing systems.
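
As a rough illustration of the idea in the abstract, the sketch below runs a generic wait-k simultaneous decoder over a sentence stream while carrying the previous source and target sentences along as conditioning context. The interfaces (`model.translate_step`, `model.finished`) and constants are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch: sentence-level simultaneous (wait-k) decoding extended
# to a stream by prepending the streaming history as context.
from collections import deque

HISTORY_SENTENCES = 3  # illustrative: how many past sentences to keep
WAIT_K = 4             # wait-k policy: read k source tokens before writing

def translate_stream(source_sentences, model):
    src_hist = deque(maxlen=HISTORY_SENTENCES)
    tgt_hist = deque(maxlen=HISTORY_SENTENCES)
    for src_tokens in source_sentences:  # input arrives incrementally
        # Flatten the history into context prefixes with separators.
        ctx_src = [t for s in src_hist for t in s + ["<sep>"]]
        ctx_tgt = [t for s in tgt_hist for t in s + ["<sep>"]]
        read, out = [], []
        for i, tok in enumerate(src_tokens):
            read.append(tok)             # READ action
            if i + 1 >= WAIT_K:          # after k reads, alternate WRITEs
                out.append(model.translate_step(ctx_src + read, ctx_tgt + out))
        # Flush: finish the target once the source sentence is complete.
        while not model.finished(ctx_src + read, ctx_tgt + out):
            out.append(model.translate_step(ctx_src + read, ctx_tgt + out))
        src_hist.append(read)
        tgt_hist.append(out)
        yield out
```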
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods; a generic version of this self-reflection loop is sketched after this entry.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
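
A generic translate-then-self-reflect loop in the spirit of TasTe might look like the sketch below; the three-stage structure (draft, critique, refine), the prompts, and the `llm` callable are illustrative assumptions, not the paper's exact recipe.

```python
# Rough sketch of a draft -> critique -> refine translation loop.
def self_reflective_translate(llm, source, src_lang="German", tgt_lang="English"):
    draft = llm(f"Translate the following {src_lang} text into {tgt_lang}:\n{source}")
    critique = llm(
        f"Source ({src_lang}): {source}\n"
        f"Draft translation ({tgt_lang}): {draft}\n"
        "Evaluate the draft and point out any errors or awkward wording."
    )
    final = llm(
        f"Source ({src_lang}): {source}\n"
        f"Draft: {draft}\nCritique: {critique}\n"
        f"Produce an improved {tgt_lang} translation."
    )
    return final
```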
- Speech Translation with Large Language Models: An Industrial Practice [64.5419534101104]
We introduce LLM-ST, a novel and effective speech translation model constructed upon a pre-trained large language model (LLM).
By integrating the LLM with a speech encoder and employing multi-task instruction tuning, LLM-ST can produce accurate, timestamped transcriptions and translations.
Through rigorous experimentation on English and Chinese datasets, we showcase the exceptional performance of LLM-ST.
arXiv Detail & Related papers (2023-12-21T05:32:49Z)
- Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation [51.399695200838586]
We propose a streaming Transformer-Transducer (T-T) model able to jointly produce many-to-one and one-to-many transcription and translation using a single decoder.
Experiments on {it, es, de}→en demonstrate the effectiveness of our approach, enabling the generation of one-to-many joint outputs with a single decoder for the first time.
arXiv Detail & Related papers (2023-10-23T11:00:27Z)
- DiariST: Streaming Speech Translation with Speaker Diarization [53.595990270899414]
We propose DiariST, the first solution for joint streaming speech translation (ST) and speaker diarization (SD).
It is built upon a neural transducer-based streaming ST system and integrates token-level serialized output training and t-vector.
Our system achieves a strong ST and SD capability compared to offline systems based on Whisper, while performing streaming inference for overlapping speech.
arXiv Detail & Related papers (2023-09-14T19:33:27Z)
- STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation [37.51435498386953]
We propose the Speech-TExt Manifold Mixup (STEMM) method to calibrate the representation discrepancy between the speech and text modalities.
Experiments on the MuST-C speech translation benchmark and further analysis show that our method effectively alleviates the cross-modal representation discrepancy; a rough sketch of the mixup idea follows this entry.
arXiv Detail & Related papers (2022-03-20T01:49:53Z)
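
Under the assumption of word-aligned speech and text embedding sequences, cross-modal manifold mixup in the spirit of STEMM could be sketched as below; the per-position sampling scheme is illustrative, and the paper's exact mixing strategy may differ.

```python
import numpy as np

def cross_modal_mixup(speech_emb, text_emb, p_text=0.5, rng=None):
    """Mix word-aligned speech and text embeddings into one sequence.

    speech_emb, text_emb: (num_words, dim) arrays, aligned word by word.
    Each position takes the text embedding with probability p_text,
    otherwise the speech embedding, yielding a mixed-modality sequence.
    """
    rng = rng or np.random.default_rng()
    take_text = rng.random(speech_emb.shape[0]) < p_text
    return np.where(take_text[:, None], text_emb, speech_emb)
```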
- The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at IWSLT 2021 [36.95800637790494]
This paper describes USTC-NELSLIP's submissions to the IWSLT 2021 Simultaneous Speech Translation task.
We propose a novel simultaneous translation model, Cross Attention Augmented Transducer (CAAT), which extends the conventional RNN-T to sequence-to-sequence tasks.
Experiments on speech-to-text (S2T) and text-to-text (T2T) simultaneous translation tasks show that CAAT achieves better quality-latency trade-offs.
arXiv Detail & Related papers (2021-07-01T08:09:00Z)
- Stream-level Latency Evaluation for Simultaneous Machine Translation [5.50178437495268]
Simultaneous machine translation has recently gained traction thanks to significant quality improvements and the advent of streaming applications.
This work proposes a stream-level adaptation of the current latency measures based on a re-segmentation approach applied to the output translation; the sentence-level measure being adapted is sketched after this entry.
arXiv Detail & Related papers (2021-04-18T11:16:17Z)
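
For reference, the sketch below computes sentence-level Average Lagging (AL), a standard simultaneous-MT latency measure of the kind this paper adapts to the stream level; the stream-level variant additionally re-segments the output before scoring. Variable names are illustrative.

```python
def average_lagging(delays, src_len, tgt_len):
    """delays[t]: number of source tokens read before emitting target
    token t (0-indexed list of length tgt_len)."""
    gamma = tgt_len / src_len  # target-to-source length ratio
    # tau: first target position whose delay already covers the full source
    tau = next((t + 1 for t, d in enumerate(delays) if d >= src_len), tgt_len)
    return sum(delays[t] - t / gamma for t in range(tau)) / tau

# Example: a wait-3 policy on a 6-token source with a 6-token target
# reads 3 tokens before the first write, then one more token per write.
print(average_lagging([3, 4, 5, 6, 6, 6], src_len=6, tgt_len=6))  # AL = 3.0
```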
- Streaming Simultaneous Speech Translation with Augmented Memory Transformer [29.248366441276662]
Transformer-based models have achieved state-of-the-art performance on speech translation tasks.
We propose an end-to-end transformer-based sequence-to-sequence model, equipped with an augmented memory transformer encoder.
arXiv Detail & Related papers (2020-10-30T18:28:42Z)
- SimulEval: An Evaluation Toolkit for Simultaneous Translation [59.02724214432792]
Simultaneous translation on both text and speech focuses on a real-time and low-latency scenario.
SimulEval is an easy-to-use and general evaluation toolkit for both simultaneous text and speech translation.
arXiv Detail & Related papers (2020-07-31T17:44:41Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)
- Re-translation versus Streaming for Simultaneous Translation [14.800214853561823]
We study a problem in which revisions to the hypothesis beyond strictly appending words are permitted.
In this setting, we compare custom streaming approaches to re-translation.
We find re-translation to be as good as or better than state-of-the-art streaming systems; a minimal sketch of the re-translation loop follows this entry.
arXiv Detail & Related papers (2020-04-07T18:27:32Z)
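
The re-translation baseline from the entry above can be sketched in a few lines: whenever new input arrives, the full source prefix is translated from scratch, so earlier output may be revised rather than only appended to. `model.translate` is a hypothetical full-sentence MT interface.

```python
def retranslate_prefixes(model, source_tokens):
    # Re-translate each growing prefix; the hypothesis for prefix i
    # may rewrite words that were already shown to the user.
    for i in range(1, len(source_tokens) + 1):
        yield model.translate(source_tokens[:i])
```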