Enabling Zero-shot Multilingual Spoken Language Translation with
Language-Specific Encoders and Decoders
- URL: http://arxiv.org/abs/2011.01097v2
- Date: Wed, 15 Sep 2021 18:42:21 GMT
- Title: Enabling Zero-shot Multilingual Spoken Language Translation with
Language-Specific Encoders and Decoders
- Authors: Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa,
Carlos Segura
- Abstract summary: Current end-to-end approaches to Spoken Language Translation rely on limited training resources.
Our proposed method extends a MultiNMT architecture based on language-specific encoders-decoders to the task of Multilingual SLT.
- Score: 5.050654565113709
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current end-to-end approaches to Spoken Language Translation (SLT) rely on
limited training resources, especially for multilingual settings. On the other
hand, Multilingual Neural Machine Translation (MultiNMT) approaches rely on
higher-quality and more massive data sets. Our proposed method extends a
MultiNMT architecture based on language-specific encoders-decoders to the task
of Multilingual SLT (MultiSLT). Our method entirely eliminates the dependency
on MultiSLT data and can translate while training only on ASR and MultiNMT
data.
Our experiments on four different languages show that coupling the speech
encoder to the MultiNMT architecture produces similar quality translations
compared to a bilingual baseline ($\pm 0.2$ BLEU) while effectively allowing
for zero-shot MultiSLT. Additionally, we propose using an Adapter module for
coupling the speech inputs. This Adapter module produces consistent
improvements up to +6 BLEU points on the proposed architecture and +1 BLEU
point on the end-to-end baseline.
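The coupling and Adapter described above can be illustrated with a minimal PyTorch sketch. The module names, dimensions, two-layer Adapter design, and decoder call signature below are illustrative assumptions rather than the authors' exact implementation; the point is that a speech encoder trained only on ASR data is projected into the representation space expected by the language-specific MultiNMT decoders, so speech translation requires no MultiSLT training data.

# Minimal sketch of coupling a speech encoder to language-specific MultiNMT
# decoders through an Adapter (all names and dimensions are assumptions).
import torch
import torch.nn as nn

class SpeechAdapter(nn.Module):
    """Projects speech-encoder states into the text decoders' input space."""
    def __init__(self, speech_dim: int = 1024, text_dim: int = 512):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(speech_dim, text_dim),
            nn.ReLU(),
            nn.Linear(text_dim, text_dim),
            nn.LayerNorm(text_dim),
        )

    def forward(self, speech_states: torch.Tensor) -> torch.Tensor:
        # (batch, frames, speech_dim) -> (batch, frames, text_dim)
        return self.proj(speech_states)

class ZeroShotMultiSLT(nn.Module):
    """Speech encoder (trained on ASR) + Adapter + per-language NMT decoders."""
    def __init__(self, speech_encoder: nn.Module, decoders: nn.ModuleDict,
                 adapter: SpeechAdapter):
        super().__init__()
        self.speech_encoder = speech_encoder  # trained on ASR data only
        self.decoders = decoders              # one decoder per target language,
                                              # trained on MultiNMT data only
        self.adapter = adapter

    def forward(self, audio_features: torch.Tensor, tgt_lang: str,
                tgt_tokens: torch.Tensor) -> torch.Tensor:
        enc = self.speech_encoder(audio_features)  # (B, T, speech_dim)
        memory = self.adapter(enc)                 # (B, T, text_dim)
        # Any target-language decoder can consume the adapted states,
        # giving zero-shot MultiSLT without speech-translation training data.
        return self.decoders[tgt_lang](tgt_tokens, memory)

Keeping the encoders and decoders language-specific is what allows a new speech encoder to be plugged in without retraining the text decoders.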
Related papers
- Language-Aware Multilingual Machine Translation with Self-Supervised
Learning [13.250011906361273]
Multilingual machine translation (MMT) benefits from cross-lingual transfer but is a challenging multitask optimization problem.
Self-supervised learning (SSL) approaches have shown promise, improving translation performance when used as complementary tasks to the MMT task.
We propose a novel but simple SSL task, concurrent denoising, that co-trains with the MMT task by concurrently denoising monolingual data on both the encoder and decoder.
arXiv Detail & Related papers (2023-02-10T01:34:24Z)
- Multilingual Multimodal Learning with Machine Translated Text [27.7207234512674]
We investigate whether machine translating English multimodal data can be an effective proxy for the lack of readily available multilingual data.
We propose two metrics for automatically removing unreliable translations from the resulting datasets.
In experiments on five tasks across 20 languages in the IGLUE benchmark, we show that translated data can provide a useful signal for multilingual multimodal learning.
arXiv Detail & Related papers (2022-10-24T11:41:20Z)
- LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation [94.33019040320507]
Multimodal Machine Translation (MMT) focuses on enhancing text-only translation with visual features.
Recent approaches still need to train a separate model for each language pair, which is costly and unaffordable as the number of languages increases.
We propose the Multilingual MMT task by establishing two new Multilingual MMT benchmark datasets covering seven languages.
arXiv Detail & Related papers (2022-10-19T12:21:39Z)
- Towards Making the Most of Multilingual Pretraining for Zero-Shot Neural Machine Translation [74.158365847236]
SixT++ is a strong many-to-English NMT model that supports 100 source languages but is trained once with a parallel dataset from only six source languages.
It significantly outperforms CRISS and m2m-100, two strong multilingual NMT systems, with an average gain of 7.2 and 5.0 BLEU respectively.
arXiv Detail & Related papers (2021-10-16T10:59:39Z)
- Breaking Down Multilingual Machine Translation [74.24795388967907]
We show that multilingual training is beneficial to encoders in general, while it only benefits decoders for low-resource languages (LRLs).
Our many-to-one models for high-resource languages and one-to-many models for LRLs outperform the best results reported by Aharoni et al.
arXiv Detail & Related papers (2021-10-15T14:57:12Z)
- Multilingual Speech Translation with Unified Transformer: Huawei Noah's Ark Lab at IWSLT 2021 [33.876412404781846]
This paper describes the system submitted to the IWSLT 2021 Multilingual Speech Translation (MultiST) task by Huawei Noah's Ark Lab.
We use a unified transformer architecture for our MultiST model, so that the data from different modalities can be exploited to enhance the model's ability.
We apply several training techniques to improve the performance, including multi-task learning, task-level curriculum learning, data augmentation, etc.
arXiv Detail & Related papers (2021-06-01T02:50:49Z)
- Improving Target-side Lexical Transfer in Multilingual Neural Machine Translation [104.10726545151043]
Multilingual data has been found to be more beneficial for NMT models that translate from a low-resource language (LRL) into a target language than for those that translate into the LRL.
Our experiments show that DecSDE leads to consistent gains of up to 1.8 BLEU on translation from English to four different languages.
arXiv Detail & Related papers (2020-10-04T19:42:40Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for the translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.