Where Are We? Evaluating LLM Performance on African Languages
- URL: http://arxiv.org/abs/2502.19582v1
- Date: Wed, 26 Feb 2025 21:49:54 GMT
- Title: Where Are We? Evaluating LLM Performance on African Languages
- Authors: Ife Adebara, Hawau Olamide Toyin, Nahom Tesfu Ghebremichael, AbdelRahim Elmadany, Muhammad Abdul-Mageed,
- Abstract summary: Africa's rich linguistic heritage remains underrepresented in NLP.<n>This paper integrates theoretical insights on Africa's language landscape with an empirical evaluation using Sahara.
- Score: 16.206469767073155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Africa's rich linguistic heritage remains underrepresented in NLP, largely due to historical policies that favor foreign languages and create significant data inequities. In this paper, we integrate theoretical insights on Africa's language landscape with an empirical evaluation using Sahara - a comprehensive benchmark curated from large-scale, publicly accessible datasets capturing the continent's linguistic diversity. By systematically assessing the performance of leading large language models (LLMs) on Sahara, we demonstrate how policy-induced data variations directly impact model effectiveness across African languages. Our findings reveal that while a few languages perform reasonably well, many Indigenous languages remain marginalized due to sparse data. Leveraging these insights, we offer actionable recommendations for policy reforms and inclusive data practices. Overall, our work underscores the urgent need for a dual approach - combining theoretical understanding with empirical evaluation - to foster linguistic diversity in AI for African communities.
Related papers
- Natural language processing for African languages [7.884789325654572]
dissertation focuses on languages spoken in Sub-Saharan Africa where all the indigenous languages can be regarded as low-resourced.<n>We show that the quality of semantic representations learned in word embeddings does not only depend on the amount of data but on the quality of pre-training data.<n>We develop large scale human-annotated labelled datasets for 21 African languages in two impactful NLP tasks.
arXiv Detail & Related papers (2025-06-30T22:26:36Z) - Improving Multilingual Math Reasoning for African Languages [49.27985213689457]
We conduct experiments to evaluate different combinations of data types (translated versus synthetically generated), training stages (pre-training versus post-training), and other model adaptation configurations.<n>Our experiments focuses on mathematical reasoning tasks, using the Llama 3.1 model family as our base model.
arXiv Detail & Related papers (2025-05-26T11:35:01Z) - Voice of a Continent: Mapping Africa's Speech Technology Frontier [14.063189144905074]
Africa's rich linguistic diversity remains significantly underrepresented in speech technologies.<n>We introduce the Simba family of models, achieving state-of-the-art performance across multiple African languages and speech tasks.<n>Our work highlights the need for expanded speech technology resources that better reflect Africa's linguistic diversity.
arXiv Detail & Related papers (2025-05-24T00:11:07Z) - Designing and Contextualising Probes for African Languages [3.161415847253143]
This paper presents the first systematic investigation into probing PLMs for linguistic knowledge about African languages.<n>We train layer-wise probes for six typologically diverse African languages to analyse how linguistic features are distributed.<n>We find PLMs adapted for African languages to encode more linguistic information about target languages than massively multilingual PLMs.
arXiv Detail & Related papers (2025-05-15T08:35:14Z) - Lugha-Llama: Adapting Large Language Models for African Languages [48.97516583523523]
Large language models (LLMs) have achieved impressive results in a wide range of natural language applications.
We consider how to adapt LLMs to low-resource African languages.
We find that combining curated data from African languages with high-quality English educational texts results in a training mix that substantially improves the model's performance on these languages.
arXiv Detail & Related papers (2025-04-09T02:25:53Z) - One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks [68.33068005789116]
We present the first study aimed at objectively assessing the fairness and robustness of Large Language Models (LLMs) in handling dialects in canonical reasoning tasks.<n>We hire AAVE speakers, including experts with computer science backgrounds, to rewrite seven popular benchmarks, such as HumanEval and GSM8K.<n>Our findings reveal that textbfalmost all of these widely used models show significant brittleness and unfairness to queries in AAVE.
arXiv Detail & Related papers (2024-10-14T18:44:23Z) - Zero-Shot Cross-Lingual Reranking with Large Language Models for
Low-Resource Languages [51.301942056881146]
We investigate how large language models (LLMs) function as rerankers in cross-lingual information retrieval systems for African languages.
Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba)
We examine cross-lingual reranking with queries in English and passages in the African languages.
arXiv Detail & Related papers (2023-12-26T18:38:54Z) - AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages [33.05774949324384]
We create high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 African languages.
We also develop AfriCOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder.
arXiv Detail & Related papers (2023-11-16T11:52:52Z) - AfroBench: How Good are Large Language Models on African Languages? [55.35674466745322]
AfroBench is a benchmark for evaluating the performance of LLMs across 64 African languages.<n>AfroBench consists of nine natural language understanding datasets, six text generation datasets, six knowledge and question answering tasks, and one mathematical reasoning task.
arXiv Detail & Related papers (2023-11-14T08:10:14Z) - NusaWrites: Constructing High-Quality Corpora for Underrepresented and
Extremely Low-Resource Languages [54.808217147579036]
We conduct a case study on Indonesian local languages.
We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets.
Our findings demonstrate that datasets generated through paragraph writing by native speakers exhibit superior quality in terms of lexical diversity and cultural content.
arXiv Detail & Related papers (2023-09-19T14:42:33Z) - MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity
Recognition [55.95128479289923]
African languages are spoken by over a billion people, but are underrepresented in NLP research and development.
We create the largest human-annotated NER dataset for 20 African languages.
We show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points.
arXiv Detail & Related papers (2022-10-22T08:53:14Z) - MasakhaNER: Named Entity Recognition for African Languages [48.34339599387944]
We create the first large publicly available high-quality dataset for named entity recognition in ten African languages.
We detail characteristics of the languages to help researchers understand the challenges that these languages pose for NER.
arXiv Detail & Related papers (2021-03-22T13:12:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.