FST: the FAIR Speech Translation System for the IWSLT21 Multilingual
Shared Task
- URL: http://arxiv.org/abs/2107.06959v1
- Date: Wed, 14 Jul 2021 19:43:44 GMT
- Authors: Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger
Schwenk, Naman Goyal
- Abstract summary: We describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign.
Our system is built by leveraging transfer learning across modalities, tasks and languages.
- Score: 36.51221186190272
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we describe our end-to-end multilingual speech translation
system submitted to the IWSLT 2021 evaluation campaign on the Multilingual
Speech Translation shared task. Our system is built by leveraging transfer
learning across modalities, tasks and languages. First, we leverage
general-purpose multilingual modules pretrained with large amounts of
unlabelled and labelled data. We further enable knowledge transfer from the
text task to the speech task by training the two tasks jointly. Finally, our
multilingual model is finetuned on speech translation task-specific data to
achieve the best translation results. Experimental results show our system
outperforms previously reported systems, including both end-to-end and cascaded
approaches, by a large margin.
In some translation directions, our speech translation results evaluated on
the public Multilingual TEDx test set are even comparable with the ones from a
strong text-to-text translation system, which uses the oracle speech
transcripts as input.
Related papers
- Towards a Deep Understanding of Multilingual End-to-End Speech
Translation [52.26739715012842]
We analyze representations learnt in a multilingual end-to-end speech translation model trained over 22 languages.
We derive three major findings from our analysis.
arXiv Detail & Related papers (2023-10-31T13:50:55Z)
- KIT's Multilingual Speech Translation System for IWSLT 2023 [58.5152569458259]
We describe our speech translation system for the multilingual track of IWSLT 2023.
The task requires translation into 10 languages with varying amounts of resources.
Our cascaded speech system substantially outperforms its end-to-end counterpart on scientific talk translation.
arXiv Detail & Related papers (2023-06-08T16:13:20Z)
- ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text
Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational
Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual
Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restore task to bridge the gap between the pretrain and finetune stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
- Back-translation for Large-Scale Multilingual Machine Translation [2.8747398859585376]
This paper aims to build a single multilingual translation system under the hypothesis that a universal cross-language representation leads to better multilingual translation performance.
We extend the exploration of different back-translation methods from bilingual translation to multilingual translation.
Surprisingly, smaller vocabularies perform better, and the extensive monolingual English data offers a modest improvement.
arXiv Detail & Related papers (2021-09-17T18:33:15Z)
- ViTA: Visual-Linguistic Translation by Aligning Object Tags [7.817598216459955]
Multimodal Machine Translation (MMT) enriches the source text with visual information for translation.
We propose our system for the Multimodal Translation Task of WAT 2021 from English to Hindi.
arXiv Detail & Related papers (2021-06-01T06:19:29Z)
- Multilingual Speech Translation with Unified Transformer: Huawei Noah's
Ark Lab at IWSLT 2021 [33.876412404781846]
This paper describes the system submitted to the IWSLT 2021 Speech Translation (MultiST) task from Huawei Noah's Ark Lab.
We use a unified transformer architecture for our MultiST model, so that the data from different modalities can be exploited to enhance the model's ability.
We apply several training techniques to improve the performance, including multi-task learning, task-level curriculum learning, data augmentation, etc.
arXiv Detail & Related papers (2021-06-01T02:50:49Z)
- MCL@IITK at SemEval-2021 Task 2: Multilingual and Cross-lingual
Word-in-Context Disambiguation using Augmented Data, Signals, and
Transformers [1.869621561196521]
We present our approach for solving SemEval 2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC).
The goal is to detect whether a given word common to both sentences evokes the same meaning in each.
We submit systems for both settings: multilingual and cross-lingual.
arXiv Detail & Related papers (2021-04-04T08:49:28Z)
- Self-Supervised Representations Improve End-to-End Speech Translation [57.641761472372814]
We show that self-supervised pre-trained features can consistently improve the translation performance.
Cross-lingual transfer allows the approach to be extended to a variety of languages with little or no tuning.
arXiv Detail & Related papers (2020-06-22T10:28:38Z)