English2Gbe: A multilingual machine translation model for {Fon/Ewe}Gbe
- URL: http://arxiv.org/abs/2112.11482v1
- Date: Mon, 13 Dec 2021 10:35:09 GMT
- Title: English2Gbe: A multilingual machine translation model for {Fon/Ewe}Gbe
- Authors: Gilles Hacheme
- Abstract summary: This paper introduces English2Gbe, a multilingual neural machine translation model capable of translating from English to Ewe or Fon.
We show that English2Gbe outperforms bilingual models (English to Ewe and English to Fon) and gives state-of-the-art results on the JW300 benchmark for Fon.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Language is an essential factor of emancipation. Unfortunately, most of the
more than 2,000 African languages are low-resourced. The community has recently
used machine translation to revive and strengthen several African languages.
However, the trained models are often bilingual, resulting in a potentially
exponential number of models to train and maintain to cover all possible
translation directions. Additionally, bilingual models do not leverage the
similarity between some of the languages. Consequently, multilingual neural
machine translation (NMT) is gaining considerable interest, especially for
low-resourced languages. Nevertheless, its adoption by the community is still
limited. This paper introduces English2Gbe, a multilingual NMT model capable of
translating from English to Ewe or Fon. Using the BLEU, CHRF, and TER scores
computed with the Sacrebleu (Post, 2018) package for reproducibility, we show
that English2Gbe outperforms bilingual models (English to Ewe and English to
Fon) and gives state-of-the-art results on the JW300 benchmark for Fon
established by Nekoto et al. (2020). We hope this work will contribute to the
widespread adoption of multilingual models within the community. Our code is
publicly available on GitHub.
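Since the abstract reports BLEU, CHRF, and TER computed with the Sacrebleu package, here is a minimal sketch of that kind of corpus-level evaluation, assuming system outputs and references are already loaded as lists of sentences; the example sentences below are placeholders and are not taken from the paper.

    # Minimal evaluation sketch (not code from the paper): corpus-level BLEU,
    # chrF, and TER with the sacrebleu package (Post, 2018).
    import sacrebleu

    # System translations, one sentence per entry (placeholder data).
    hypotheses = ["the cat sat on the mat", "there is a book on the table"]
    # One set of reference translations, aligned with the hypotheses.
    references = [["the cat is on the mat", "a book lies on the table"]]

    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    chrf = sacrebleu.corpus_chrf(hypotheses, references)
    ter = sacrebleu.corpus_ter(hypotheses, references)

    print(f"BLEU {bleu.score:.2f}  chrF {chrf.score:.2f}  TER {ter.score:.2f}")

Sacrebleu fixes tokenization and reference handling internally, which is what makes scores reported this way comparable across papers.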
Related papers
- Paramanu: A Family of Novel Efficient Generative Foundation Language Models for Indian Languages [3.9018931027384056]
We present "Paramanu", a family of novel language models (LM) for Indian languages.
It covers 10 languages (Assamese, Bangla, Hindi, Konkani, Maithili, Marathi, Odia, Sanskrit, Tamil, Telugu) across 5 scripts.
The models are pretrained on a single GPU with a context size of 1024 and range in size from 13.29 million (M) to 367.5 M parameters.
arXiv Detail & Related papers (2024-01-31T17:58:10Z)
- Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM [8.858671209228536]
We focus on BLOOM's multilingual ability by evaluating its machine translation performance across several datasets.
We study several aspects including prompt design, model sizes, cross-lingual transfer and the use of discursive context.
arXiv Detail & Related papers (2023-03-03T13:23:42Z)
- Language Models are Multilingual Chain-of-Thought Reasoners [83.37148309771378]
We introduce the Multilingual Grade School Math (MGSM) benchmark, created by manually translating 250 grade-school math problems into ten typologically diverse languages.
We find that the ability to solve MGSM problems via chain-of-thought prompting emerges with increasing model scale.
We show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.
arXiv Detail & Related papers (2022-10-06T17:03:34Z)
- Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation [129.99918589405675]
Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks.
Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive.
We propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer.
arXiv Detail & Related papers (2022-09-30T05:02:42Z)
- Learning to translate by learning to communicate [11.43638897327485]
We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems.
In our approach, we embed a multilingual model into an EC image-reference game, in which the model is incentivized to use multilingual generations to accomplish a vision-grounded task.
We present two variants of EC Fine-Tuning (Steinert-Threlkeld et al., 2022), one of which outperforms a backtranslation-only baseline in all four languages investigated.
arXiv Detail & Related papers (2022-07-14T15:58:06Z)
- Generalizing Multimodal Pre-training into Multilingual via Language Acquisition [54.69707237195554]
English-based Vision-Language Pre-training has achieved great success in various downstream tasks.
Efforts have been made to generalize this success to non-English languages through Multilingual Vision-Language Pre-training.
We propose a MultiLingual Acquisition (MLA) framework that can easily generalize a monolingual Vision-Language Pre-training model to the multilingual setting.
arXiv Detail & Related papers (2022-05-29T08:53:22Z)
- Multilingual Language Model Adaptive Fine-Tuning: A Study on African Languages [19.067718464786463]
We perform multilingual adaptive fine-tuning (MAFT) on the 17 most-resourced African languages and three other high-resource languages widely spoken on the African continent.
To further specialize the multilingual PLM, we remove vocabulary tokens from the embedding layer that correspond to non-African writing scripts before MAFT.
Our approach is competitive to applying LAFT on individual languages while requiring significantly less disk space.
arXiv Detail & Related papers (2022-04-13T16:13:49Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models into a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- Multilingual Translation with Extensible Multilingual Pretraining and Finetuning [77.33262578776291]
Previous work has demonstrated that machine translation systems can be created by finetuning on bitext.
We show that multilingual translation models can be created through multilingual finetuning (see the illustrative inference sketch after this list).
We demonstrate that pretrained models can be extended to incorporate additional languages without loss of performance.
arXiv Detail & Related papers (2020-08-02T05:36:55Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
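The "Multilingual Translation with Extensible Multilingual Pretraining and Finetuning" entry above corresponds to the work behind the publicly released mBART-50 many-to-many checkpoints. As an illustration of how a single multilingual model covers several translation directions at once, the same one-model idea English2Gbe applies to Ewe and Fon, here is a minimal inference sketch using the documented Hugging Face Transformers API; the language pair and example sentence are illustrative and not taken from either paper.

    # Minimal inference sketch with the released mBART-50 many-to-many checkpoint
    # (Hugging Face Transformers); English -> French shown purely as an example.
    from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

    model_name = "facebook/mbart-large-50-many-to-many-mmt"
    model = MBartForConditionalGeneration.from_pretrained(model_name)
    tokenizer = MBart50TokenizerFast.from_pretrained(model_name)

    tokenizer.src_lang = "en_XX"  # source language code for English
    inputs = tokenizer("Language is an essential factor of emancipation.",
                       return_tensors="pt")
    # The target language is selected by forcing the first decoded token
    # to be the corresponding language code.
    generated = model.generate(
        **inputs, forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"]
    )
    print(tokenizer.batch_decode(generated, skip_special_tokens=True))

Choosing the translation direction at decoding time is what lets one multilingual model replace a separate bilingual system per language pair.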