Ensembling of Distilled Models from Multi-task Teachers for Constrained
Resource Language Pairs
- URL: http://arxiv.org/abs/2111.13284v1
- Date: Fri, 26 Nov 2021 00:54:37 GMT
- Title: Ensembling of Distilled Models from Multi-task Teachers for Constrained
Resource Language Pairs
- Authors: Amr Hendy, Esraa A. Gad, Mohamed Abdelghaffar, Jailan S. ElMosalami,
Mohamed Afify, Ahmed Y. Tawfik, Hany Hassan Awadalla
- Abstract summary: We focus on the three relatively low-resource language pairs: Bengali to and from Hindi, English to and from Hausa, and Xhosa to and from Zulu.
We train a multilingual model using a multitask objective that employs both parallel and monolingual data.
We see around a 70% relative gain in BLEU for English to and from Hausa, and around 25% relative improvements for both Bengali to and from Hindi and Xhosa to and from Zulu.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes our submission to the constrained track of the WMT21
shared news translation task. We focus on the three relatively low-resource
language pairs: Bengali to and from Hindi, English to and from Hausa, and Xhosa
to and from Zulu. To overcome the limited amount of parallel data, we train a
multilingual model using a multitask objective that employs both parallel and
monolingual data. In addition, we augment the data using back translation. We
also train a bilingual model incorporating back translation and knowledge
distillation, and then combine the two models using sequence-to-sequence mapping.
We see around a 70% relative gain in BLEU for English to and from Hausa, and
around 25% relative improvements for both Bengali to and from Hindi and Xhosa
to and from Zulu, compared to bilingual baselines.
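Two of the ingredients named in the abstract, sequence-level knowledge distillation (training a student on a teacher's own translations) and combining several models at decode time, can be illustrated with a toy sketch. The code below is not the authors' implementation: the tiny GRU "translators", vocabulary size, and training loop are placeholder assumptions, source-side conditioning is omitted for brevity, and the combination shown is plain log-probability averaging (a common ensembling baseline), not the sequence-to-sequence mapping the paper uses.

```python
# Toy sketch (not from the paper) of sequence-level knowledge distillation and
# decode-time ensembling via averaged per-step log-probabilities.
# All models, sizes, and data here are illustrative placeholders.
import torch
import torch.nn as nn

VOCAB, HID, BOS, EOS, MAXLEN = 32, 16, 1, 2, 10

class TinyTranslator(nn.Module):
    """Stand-in for an NMT decoder: embeds the previous token, predicts the next."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HID)
        self.rnn = nn.GRUCell(HID, HID)
        self.out = nn.Linear(HID, VOCAB)

    def step(self, prev_tok, state):
        state = self.rnn(self.emb(prev_tok), state)
        return torch.log_softmax(self.out(state), dim=-1), state

@torch.no_grad()
def greedy_decode(models, batch_size):
    """Greedy decoding; with several models, their per-step log-probabilities
    are averaged, which is how NMT ensembles are commonly combined."""
    states = [torch.zeros(batch_size, HID) for _ in models]
    tok = torch.full((batch_size,), BOS, dtype=torch.long)
    hyp = []
    for _ in range(MAXLEN):
        logps = []
        for i, m in enumerate(models):
            lp, states[i] = m.step(tok, states[i])
            logps.append(lp)
        tok = torch.stack(logps).mean(dim=0).argmax(dim=-1)  # ensemble step
        hyp.append(tok)
        if (tok == EOS).all():
            break
    return torch.stack(hyp, dim=1)

teacher, student = TinyTranslator(), TinyTranslator()

# Sequence-level KD: the teacher's decodes become the student's training targets.
kd_targets = greedy_decode([teacher], batch_size=4)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
state = torch.zeros(4, HID)
tok = torch.full((4,), BOS, dtype=torch.long)
loss = torch.zeros(())
for t in range(kd_targets.size(1)):
    logp, state = student.step(tok, state)
    loss = loss + nn.functional.nll_loss(logp, kd_targets[:, t])
    tok = kd_targets[:, t]  # teacher forcing on the distilled targets
opt.zero_grad(); loss.backward(); opt.step()

# Translate with the distilled student ensembled with the teacher.
print(greedy_decode([student, teacher], batch_size=2))
```

A real system would of course use Transformer encoder-decoder models, beam search, and back-translated data in the training mix; the sketch only fixes the data flow of distillation and ensembling.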
Related papers
- SPRING Lab IITM's submission to Low Resource Indic Language Translation Shared Task [10.268444449457956]
We develop a robust translation model for four low-resource Indic languages: Khasi, Mizo, Manipuri, and Assamese.
Our approach includes a comprehensive pipeline from data collection and preprocessing to training and evaluation.
To address the scarcity of bilingual data, we use back-translation techniques on monolingual datasets for Mizo and Khasi.
arXiv Detail & Related papers (2024-11-01T16:39:03Z)
- Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages [90.41827664700847]
We propose Cross-Lingual Knowledge Distillation (CLKD) from a strong English AS2 teacher as a method to train AS2 models for low-resource languages.
To evaluate our method, we introduce 1) Xtr-WikiQA, a translation-based WikiQA dataset for 9 additional languages, and 2) TyDi-AS2, a multilingual AS2 dataset with over 70K questions spanning 8 typologically diverse languages.
arXiv Detail & Related papers (2023-05-25T17:56:04Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples [89.16814518860357]
The objective of this work is to explore the learning of visually grounded speech models (VGS) from multilingual perspective.
Our key contribution in this work is to leverage the power of a high-resource language in a bilingual visually grounded speech model to improve the performance of a low-resource language.
arXiv Detail & Related papers (2023-03-30T16:34:10Z)
- Multilingual Pre-training with Language and Task Adaptation for Multilingual Text Style Transfer [14.799109368073548]
We exploit the pre-trained seq2seq model mBART for multilingual text style transfer.
Using machine translated data as well as gold aligned English sentences yields state-of-the-art results.
arXiv Detail & Related papers (2022-03-16T11:27:48Z)
- CUNI systems for WMT21: Multilingual Low-Resource Translation for Indo-European Languages Shared Task [0.0]
We show that using a joint model for multiple similar language pairs improves translation quality in each pair.
We also demonstrate that character-level bilingual models are competitive for very similar language pairs.
arXiv Detail & Related papers (2021-09-20T08:10:39Z)
- Improving Multilingual Neural Machine Translation For Low-Resource Languages: French-, English- Vietnamese [4.103253352106816]
This paper proposes two simple strategies to address the rare word issue in multilingual MT systems for two low-resource language pairs: French-Vietnamese and English-Vietnamese.
We have shown significant improvements of up to +1.62 and +2.54 BLEU points over the bilingual baseline systems for both language pairs.
arXiv Detail & Related papers (2020-12-16T04:43:43Z)
- Translating Similar Languages: Role of Mutual Intelligibility in Multilingual Transformers [8.9379057739817]
We investigate approaches to translate between similar languages under low resource conditions.
We submit Transformer-based bilingual and multilingual systems for all language pairs.
Our Spanish-Catalan model has the best performance of all the five language pairs.
arXiv Detail & Related papers (2020-11-10T10:58:38Z)
- Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora [63.5286019659504]
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus.
Our method obtains word embeddings via an LSTM encoder-decoder model that simultaneously translates and reconstructs an input sentence.
arXiv Detail & Related papers (2020-10-27T22:24:01Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.