The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at
IWSLT 2021
- URL: http://arxiv.org/abs/2107.00279v1
- Date: Thu, 1 Jul 2021 08:09:00 GMT
- Title: The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at
IWSLT 2021
- Authors: Dan Liu, Mengge Du, Xiaoxi Li, Yuchen Hu, Lirong Dai
- Abstract summary: This paper describes USTC-NELSLIP's submissions to the IWSLT 2021 Simultaneous Speech Translation task.
We propose a novel simultaneous translation model, Cross Attention Augmented Transducer (CAAT), which extends the conventional RNN-T to sequence-to-sequence tasks.
Experiments on speech-to-text (S2T) and text-to-text (T2T) simultaneous translation tasks show that CAAT achieves better quality-latency trade-offs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes USTC-NELSLIP's submissions to the IWSLT 2021
Simultaneous Speech Translation task. We propose a novel simultaneous
translation model, Cross Attention Augmented Transducer (CAAT), which extends
the conventional RNN-T to sequence-to-sequence tasks without monotonic
constraints, e.g., simultaneous translation. Experiments on speech-to-text
(S2T) and text-to-text (T2T) simultaneous translation tasks show that CAAT
achieves better quality-latency trade-offs compared to wait-k, one of the
previous state-of-the-art approaches. Based on the CAAT architecture and data
augmentation, we build S2T and T2T simultaneous translation systems in this
evaluation campaign. Compared to last year's optimal systems, our S2T
simultaneous translation system improves by an average of 11.3 BLEU across all
latency regimes, and our T2T simultaneous translation system improves by an
average of 4.6 BLEU.
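The wait-k baseline that CAAT is compared against follows a simple fixed policy: read k source tokens, then alternate one write per read until the source is exhausted. As a minimal illustration (not the authors' implementation; the function name and action encoding here are our own), the policy can be sketched as:

```python
def wait_k_actions(k, src_len, tgt_len):
    """Generate the READ/WRITE action sequence of a fixed wait-k policy.

    The agent first READs k source tokens, then alternates WRITE and READ.
    Once all src_len source tokens are read, it WRITEs the remaining
    target tokens.
    """
    actions = []
    num_read, num_written = 0, 0
    while num_written < tgt_len:
        # Read until k tokens of lookahead are available (or source ends).
        if num_read < min(k + num_written, src_len):
            actions.append("READ")
            num_read += 1
        else:
            actions.append("WRITE")
            num_written += 1
    return actions
```

For example, with k=2, a 4-token source, and a 3-token target, the policy reads two tokens before emitting its first translation token. Because the policy is fixed, its latency is controlled directly by k, whereas CAAT learns when to read and write, which is what yields the improved quality-latency trade-off reported above.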