UPC's Speech Translation System for IWSLT 2021
- URL: http://arxiv.org/abs/2105.04512v1
- Date: Mon, 10 May 2021 17:04:11 GMT
- Title: UPC's Speech Translation System for IWSLT 2021
- Authors: Gerard I. Gállego, Ioannis Tsiamas, Carlos Escolano, José A. R.
Fonollosa, Marta R. Costa-jussà
- Abstract summary: This paper describes the submission to the IWSLT 2021 offline speech translation task by the UPC Machine Translation group.
The task consists of building a system capable of translating English audio recordings extracted from TED talks into German text.
Our submission is an end-to-end speech translation system, which combines pre-trained models with coupling modules between the encoder and decoder.
- Score: 2.099922236065961
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes the submission to the IWSLT 2021 offline speech
translation task by the UPC Machine Translation group. The task consists of
building a system capable of translating English audio recordings extracted
from TED talks into German text. Submitted systems can be either cascade or
end-to-end and use a custom or given segmentation. Our submission is an
end-to-end speech translation system, which combines pre-trained models
(Wav2Vec 2.0 and mBART) with coupling modules between the encoder and decoder,
and uses an efficient fine-tuning technique, which trains only 20% of its total
parameters. We show that adding an Adapter to the system and pre-training it
can increase the convergence speed and improve the final result, with which we
achieve a BLEU score of 27.3 on the MuST-C test set. Our final model is an
ensemble that obtains a 28.22 BLEU score on the same set. Our submission also uses a
custom segmentation algorithm that employs pre-trained Wav2Vec 2.0 for
identifying periods of untranscribable text and can bring improvements of 2.5
to 3 BLEU points on the IWSLT 2019 test set, as compared to the result with the
given segmentation.
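
To make the described architecture concrete, below is a minimal PyTorch sketch of the overall structure: a frozen pre-trained speech encoder (Wav2Vec 2.0) and a frozen pre-trained text decoder (mBART) joined by small trainable coupling modules (an adapter and a length adaptor), with a helper that reports the fraction of trainable parameters. The class names, dimensions, and decoder interface are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Small bottleneck adapter applied to the encoder output (coupling module)."""

    def __init__(self, dim: int, bottleneck: int = 256):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(self.norm(x))))


class LengthAdaptor(nn.Module):
    """Strided 1D convolutions that shorten the speech sequence for the text decoder."""

    def __init__(self, dim: int, n_layers: int = 3):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(dim, dim, kernel_size=3, stride=2, padding=1) for _ in range(n_layers)]
        )

    def forward(self, x):  # x: (batch, time, dim)
        x = x.transpose(1, 2)
        for conv in self.convs:
            x = torch.relu(conv(x))
        return x.transpose(1, 2)


class SpeechTranslationModel(nn.Module):
    def __init__(self, speech_encoder: nn.Module, text_decoder: nn.Module, dim: int = 1024):
        super().__init__()
        self.encoder = speech_encoder      # pre-trained Wav2Vec 2.0, kept frozen in this sketch
        self.adapter = Adapter(dim)        # trainable coupling module
        self.len_adaptor = LengthAdaptor(dim)
        self.decoder = text_decoder        # pre-trained mBART decoder, kept frozen in this sketch

        # Freeze the pre-trained components and train only the coupling modules.
        # The submission additionally fine-tunes selected encoder/decoder sub-modules,
        # which is how it reaches roughly 20% trainable parameters.
        for p in self.parameters():
            p.requires_grad = False
        for module in (self.adapter, self.len_adaptor):
            for p in module.parameters():
                p.requires_grad = True

    def trainable_fraction(self) -> float:
        trainable = sum(p.numel() for p in self.parameters() if p.requires_grad)
        total = sum(p.numel() for p in self.parameters())
        return trainable / total

    def forward(self, speech_features, decoder_input_ids):
        enc = self.encoder(speech_features)            # (batch, time, dim)
        enc = self.len_adaptor(self.adapter(enc))
        # Assumed interface: the decoder cross-attends to the adapted speech states.
        return self.decoder(decoder_input_ids, encoder_hidden_states=enc)
```

In this setup only the adapter and length adaptor (plus any deliberately unfrozen sub-modules) receive gradients, which is what keeps the trainable parameter count low relative to the full Wav2Vec 2.0 + mBART stack.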
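The custom segmentation can be illustrated with a similar hedged sketch: a CTC fine-tuned Wav2Vec 2.0 model labels output frames, and the audio is split in the middle of long runs that the model leaves blank, i.e. finds untranscribable. The checkpoint name, thresholds, and helper functions below are assumptions for illustration, not the exact algorithm from the paper.

```python
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Any CTC-finetuned Wav2Vec 2.0 checkpoint works for this illustration.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h").eval()


def untranscribable_frames(waveform: np.ndarray, sample_rate: int = 16_000) -> torch.Tensor:
    """Boolean mask over CTC output frames where the model predicts only the blank token."""
    inputs = processor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits   # (1, frames, vocab)
    predictions = logits.argmax(dim=-1).squeeze(0)
    blank_id = model.config.pad_token_id             # the pad token doubles as the CTC blank
    return predictions == blank_id


def split_points(mask: torch.Tensor, min_gap_frames: int = 50) -> list:
    """Propose a cut in the middle of every blank run longer than min_gap_frames."""
    cuts, run_start = [], None
    for i, is_blank in enumerate(mask.tolist()):
        if is_blank and run_start is None:
            run_start = i
        elif not is_blank and run_start is not None:
            if i - run_start >= min_gap_frames:
                cuts.append((run_start + i) // 2)
            run_start = None
    return cuts
```

Splitting at such untranscribable stretches rather than at fixed intervals keeps sentence-like units intact, which is consistent with the reported 2.5 to 3 BLEU gain over the given segmentation.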
Related papers
- CMU's IWSLT 2024 Simultaneous Speech Translation System [80.15755988907506]
This paper describes CMU's submission to the IWSLT 2024 Simultaneous Speech Translation (SST) task for translating English speech to German text in a streaming manner.
Our end-to-end speech-to-text (ST) system integrates the WavLM speech encoder, a modality adapter, and the Llama2-7B-Base model as the decoder.
arXiv Detail & Related papers (2024-08-14T10:44:51Z)
- KIT's Multilingual Speech Translation System for IWSLT 2023 [58.5152569458259]
We describe our speech translation system for the multilingual track of IWSLT 2023.
The task requires translation into 10 languages with varying amounts of resources.
Our cascaded speech system substantially outperforms its end-to-end counterpart on scientific talk translation.
arXiv Detail & Related papers (2023-06-08T16:13:20Z)
- Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 [0.0]
This paper describes the submission of the UPC Machine Translation group to the IWSLT 2023 Offline Speech Translation task.
Our Speech Translation systems utilize foundation models for speech (wav2vec 2.0) and text (mBART50).
We incorporate a Siamese pretraining step of the speech and text encoders with CTC and Optimal Transport, to adapt the speech representations to the space of the text model.
arXiv Detail & Related papers (2023-06-02T07:48:37Z)
- ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z)
- BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z)
- The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task [92.5087402621697]
This paper describes the submission of our end-to-end YiTrans speech translation system for the IWSLT 2022 offline task.
The YiTrans system is built on large-scale pre-trained encoder-decoder models.
Our final submissions rank first on English-German and English-Chinese end-to-end systems in terms of the automatic evaluation metric.
arXiv Detail & Related papers (2022-06-12T16:13:01Z)
- The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task [23.008938777422767]
This paper describes the submission of the NiuTrans end-to-end speech translation system for the IWSLT 2021 offline task.
We use a Transformer-based model architecture and enhance it with Conformer, relative position encoding, and stacked acoustic and textual encoding.
We achieve 33.84 BLEU points on the MuST-C En-De test set, which shows the enormous potential of the end-to-end model.
arXiv Detail & Related papers (2021-07-06T07:45:23Z)
- ESPnet-ST IWSLT 2021 Offline Speech Translation System [56.83606198051871]
This paper describes the ESPnet-ST group's IWSLT 2021 submission in the offline speech translation track.
This year we made various efforts on training data, architecture, and audio segmentation.
Our best E2E system combined all the techniques with model ensembling and achieved 31.4 BLEU.
arXiv Detail & Related papers (2021-07-01T17:49:43Z)
- The Volctrans Neural Speech Translation System for IWSLT 2021 [26.058205594318405]
This paper describes the systems submitted to IWSLT 2021 by the Volctrans team.
For offline speech translation, our best end-to-end model achieves an 8.1 BLEU improvement over the benchmark.
For text-to-text simultaneous translation, we explore the best practice to optimize the wait-k model.
arXiv Detail & Related papers (2021-05-16T00:11:59Z)