AfriNLLB: Efficient Translation Models for African Languages
- URL: http://arxiv.org/abs/2602.09373v1
- Date: Tue, 10 Feb 2026 03:23:59 GMT
- Title: AfriNLLB: Efficient Translation Models for African Languages
- Authors: Yasmin Moslem, Aman Kassahun Wassie, Amanuel Gizachew Abebe,
- Abstract summary: AfriNLLB supports 15 language pairs (30 translation directions), including Swahili, Hausa, Yoruba, Amharic, Somali, Zulu, Lingala, Afrikaans, Wolof, and Egyptian Arabic. Our training data covers bidirectional translation between English and 13 languages, and between French and two languages (Lingala and Wolof).
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we present AfriNLLB, a series of lightweight models for efficient translation from and into African languages. AfriNLLB supports 15 language pairs (30 translation directions), including Swahili, Hausa, Yoruba, Amharic, Somali, Zulu, Lingala, Afrikaans, Wolof, and Egyptian Arabic, as well as other African Union official languages such as Arabic (MSA), French, Portuguese, and Spanish. Our training data covers bidirectional translation between English and 13 languages, and between French and two languages (Lingala and Wolof). AfriNLLB models are based on NLLB-200 600M, which we compress using iterative layer pruning and quantization. We fine-tune the pruned models on parallel corpora we curated for African languages, employing knowledge distillation from a larger teacher model. Our work aims at enabling efficient deployment of translation models for African languages in resource-constrained settings. Our evaluation results demonstrate that AfriNLLB models achieve performance comparable to the baseline while being significantly faster. We release two versions of the AfriNLLB models, a Transformers version that allows further fine-tuning and a CTranslate2 version for efficient inference. Moreover, we release all the training data that we used for fine-tuning the baseline and pruned models to facilitate further research.
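The compression recipe described in the abstract (iterative layer pruning, then fine-tuning with knowledge distillation from a larger teacher) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: `score_model` is a hypothetical stand-in for a real dev-set translation metric such as chrF or BLEU, and `kd_loss` shows the usual temperature-softened distillation objective rather than AfriNLLB's exact loss.

```python
# Toy sketch of iterative layer pruning: the loop greedily drops the layer
# whose removal hurts a quality proxy least, until quality falls below a
# tolerance relative to the unpruned model.
import math

def score_model(layers):
    # Hypothetical quality proxy; a real setup would decode a dev set
    # and compute chrF/BLEU. Here, more layers simply score higher.
    return 80.0 + len(layers)

def iterative_layer_prune(layers, max_drop=5.0):
    baseline = score_model(layers)
    pruned = list(layers)
    while len(pruned) > 1:
        # Evaluate removing each remaining layer; keep the best candidate.
        candidates = [pruned[:i] + pruned[i + 1:] for i in range(len(pruned))]
        best = max(candidates, key=score_model)
        if baseline - score_model(best) > max_drop:
            break  # further pruning would exceed the quality tolerance
        pruned = best
    return pruned

def kd_loss(student_logits, teacher_logits, T=2.0):
    # Temperature-softened cross-entropy between the teacher and student
    # distributions: the standard knowledge-distillation objective used
    # when fine-tuning a pruned student against a larger teacher.
    def soft(xs):
        m = max(xs)
        es = [math.exp((x - m) / T) for x in xs]
        return [e / sum(es) for e in es]
    p, q = soft(teacher_logits), soft(student_logits)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q)) * T * T

encoder_layers = [f"enc_{i}" for i in range(12)]
kept = iterative_layer_prune(encoder_layers)  # 7 of 12 toy layers survive here
```

In practice the pruned model would then be quantized and exported (the abstract mentions a CTranslate2 release for efficient inference); both of those steps are outside this sketch.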
Related papers
- Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance [0.0]
We present our latest Hindi-English bilingual LLM Mantra-14B with a 3% average improvement in benchmark scores over both languages. We instruction-tuned models such as Qwen-2.5-14B-Instruct and Phi-4 to improve performance over both English and Hindi. Our results indicate that modest fine-tuning with culturally and locally informed data can bridge performance gaps without incurring significant computational overhead.
arXiv Detail & Related papers (2025-04-13T23:10:13Z) - Lugha-Llama: Adapting Large Language Models for African Languages [48.97516583523523]
Large language models (LLMs) have achieved impressive results in a wide range of natural language applications. We consider how to adapt LLMs to low-resource African languages. We find that combining curated data from African languages with high-quality English educational texts results in a training mix that substantially improves the model's performance on these languages.
arXiv Detail & Related papers (2025-04-09T02:25:53Z) - LiLiuM: eBay's Large Language Models for e-commerce [6.819297537500464]
We introduce the LiLiuM series of large language models (LLMs): 1B, 7B, and 13B parameter models developed 100% in-house.
This gives eBay full control over all aspects of the models including license, data, vocabulary, and architecture.
The LiLiuM LLMs have been trained on 3 trillion tokens of multilingual text from the general and e-commerce domains.
arXiv Detail & Related papers (2024-06-17T18:45:41Z) - Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages [51.301942056881146]
We investigate how large language models (LLMs) function as rerankers in cross-lingual information retrieval systems for African languages.
Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba).
We examine cross-lingual reranking with queries in English and passages in the African languages.
arXiv Detail & Related papers (2023-12-26T18:38:54Z) - AfroBench: How Good are Large Language Models on African Languages? [55.35674466745322]
AfroBench is a benchmark for evaluating the performance of LLMs across 64 African languages. AfroBench consists of nine natural language understanding datasets, six text generation datasets, six knowledge and question answering tasks, and one mathematical reasoning task.
arXiv Detail & Related papers (2023-11-14T08:10:14Z) - PolyLM: An Open Source Polyglot Large Language Model [57.64420154135178]
We present PolyLM, a multilingual large language model (LLM) trained on 640 billion (B) tokens, available in two model sizes: 1.7B and 13B.
To enhance its multilingual capabilities, we 1) integrate bilingual data into training data; and 2) adopt a curriculum learning strategy that increases the proportion of non-English data from 30% in the first stage to 60% in the final stage during pre-training.
Further, we propose a multilingual self-instruct method which automatically generates 132.7K diverse multilingual instructions for model fine-tuning.
arXiv Detail & Related papers (2023-07-12T09:00:37Z) - Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model [92.91310997807936]
NLLB-200 is a set of multilingual Neural Machine Translation models that cover 202 languages.
We propose a pruning method that enables the removal of up to 80% of experts without further finetuning.
arXiv Detail & Related papers (2022-12-19T19:29:40Z) - BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting [50.24676567971536]
The BLOOM model is a large publicly available multilingual language model, but its pretraining was limited to 46 languages.
We apply existing language adaptation strategies to BLOOM and benchmark its zero-shot prompting performance on eight new languages.
We conclude that with sufficient training data language adaptation can generalize well to diverse languages.
arXiv Detail & Related papers (2022-12-19T15:24:45Z) - AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages [0.021987601456703476]
We present AfroLM, a multilingual language model pretrained from scratch on 23 African languages.
AfroLM is pretrained on a dataset 14x smaller than existing baselines.
It is able to generalize well across various domains.
arXiv Detail & Related papers (2022-11-07T02:15:25Z) - Beyond English-Centric Multilingual Machine Translation [74.21727842163068]
We create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages.
We build and open source a training dataset that covers thousands of language directions with supervised data, created through large-scale mining.
Our focus on non-English-Centric models brings gains of more than 10 BLEU when directly translating between non-English directions while performing competitively to the best single systems of WMT.
arXiv Detail & Related papers (2020-10-21T17:01:23Z) - HausaMT v1.0: Towards English-Hausa Neural Machine Translation [0.012691047660244334]
We build a baseline model for English-Hausa machine translation.
The Hausa language is the second largest Afro-Asiatic language in the world after Arabic.
arXiv Detail & Related papers (2020-06-09T02:08:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.