How Good are Commercial Large Language Models on African Languages?
- URL: http://arxiv.org/abs/2305.06530v1
- Date: Thu, 11 May 2023 02:29:53 GMT
- Title: How Good are Commercial Large Language Models on African Languages?
- Authors: Jessica Ojo and Kelechi Ogueji
- Abstract summary: We present a preliminary analysis of commercial large language models on two tasks (machine translation and text classification) across eight African languages.
Our results suggest that commercial language models produce below-par performance on African languages.
In general, our findings present a call-to-action to ensure African languages are well represented in commercial large language models.
- Score: 0.012691047660244334
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Natural Language Processing (NLP) have led to the
proliferation of large pretrained language models. These models have been shown
to yield good performance via in-context learning, even on unseen tasks and
languages. They are also offered as commercial APIs, a form of
language-model-as-a-service, and have seen wide adoption. However, their performance on
African languages is largely unknown. We present a preliminary analysis of
commercial large language models on two tasks (machine translation and text
classification) across eight African languages, spanning different language
families and geographical areas. Our results suggest that commercial language
models produce below-par performance on African languages. We also find that
they perform better on text classification than machine translation. In
general, our findings present a call-to-action to ensure African languages are
well represented in commercial large language models, given their growing
popularity.
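To make the evaluation setup concrete, the following is a minimal sketch of zero-shot prompting a commercial LLM API on the paper's two tasks, assuming the OpenAI Python client; the model name, prompt templates, and example inputs are illustrative choices rather than the paper's exact configuration.

```python
# Minimal sketch of zero-shot prompting a commercial LLM API on the two tasks
# studied in the paper (machine translation and text classification).
# Model name, prompts, and examples are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; the paper evaluated earlier APIs
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()


# Machine translation (English -> Yoruba, illustrative language pair)
print(ask("Translate the following sentence into Yoruba:\n"
          "Good morning, how are you today?"))

# Text classification (news-topic labels, illustrative label set)
print(ask("Classify the following Hausa news headline into one of "
          "[politics, sports, health, business]. Reply with the label only.\n"
          "Headline: <Hausa headline here>"))
```

Translation outputs would typically be scored with metrics such as BLEU or chrF, and classification outputs matched against gold labels.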
Related papers
- IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models [18.260317326787035]
This paper introduces IrokoBench -- a human-translated benchmark dataset for 16 typologically diverse, low-resource African languages.
We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings (where test sets are translated into English) across 10 open and four proprietary language models.
We observe a significant performance gap between open and proprietary models, with the best-performing open model, Aya-101, reaching only 58% of the performance of the best proprietary model, GPT-4o.
arXiv Detail & Related papers (2024-06-05T15:23:08Z)
- How good are Large Language Models on African Languages? [18.660783984850845]
We present an analysis of four popular large language models (mT0, Aya, LLaMa 2, and GPT-4) on six tasks across 60 African languages.
Our results suggest that all LLMs produce lower performance for African languages, and there is a large gap in performance compared to high-resource languages.
arXiv Detail & Related papers (2023-11-14T08:10:14Z)
- Baichuan 2: Open Large-scale Language Models [51.56361715162972]
We present Baichuan 2, a series of large-scale multilingual language models with 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens.
Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval.
arXiv Detail & Related papers (2023-09-19T04:13:22Z)
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning [56.03057119008865]
We show that scaling diffusion language models can effectively make them strong language learners.
We build competent diffusion language models at scale by first acquiring knowledge from massive data.
Experiments show that scaling diffusion language models consistently improves performance across downstream language tasks.
arXiv Detail & Related papers (2023-08-23T16:01:12Z)
- Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models [68.29126169579132]
API vendors charge their users based on usage, more specifically on the number of "tokens" processed or generated by the underlying language models.
What constitutes a token, however, is training-data- and model-dependent, and the number of tokens required to convey the same information varies widely across languages.
We conduct a systematic analysis of the cost and utility of OpenAI's language model API on multilingual benchmarks in 22 typologically diverse languages.
arXiv Detail & Related papers (2023-05-23T05:46:45Z)
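To see why the tokenization issue above matters for cost, here is a toy comparison, assuming the tiktoken library and the cl100k_base encoding; the example sentences are rough, illustrative translations rather than benchmark data.

```python
# Toy illustration of the token-count (and hence cost) disparity described in
# the entry above: the same short message tokenizes to different lengths in
# different languages. Sentences are rough, illustrative translations.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # assumed API tokenizer

samples = {
    "English": "How are you today?",
    "Swahili": "Habari yako leo?",
    "Yoruba": "Bawo ni o se wa loni?",  # diacritics omitted
}

for lang, text in samples.items():
    n = len(enc.encode(text))
    print(f"{lang:8s} {n:3d} tokens | {text}")
```

Since API billing is proportional to token counts, a language that needs more tokens per sentence is effectively charged more for the same content.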
- BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting [50.24676567971536]
The BLOOM model is a large publicly available multilingual language model, but its pretraining was limited to 46 languages.
We apply existing language adaptation strategies to BLOOM and benchmark its zero-shot prompting performance on eight new languages.
We conclude that, with sufficient training data, language adaptation can generalize well to diverse languages.
arXiv Detail & Related papers (2022-12-19T15:24:45Z)
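A hedged sketch of what zero-shot prompting a BLOOM checkpoint can look like with Hugging Face transformers; the small 560M checkpoint and the prompt format are assumptions for illustration, not the adapted models or prompts used in the paper above.

```python
# Hedged sketch of zero-shot prompting an open BLOOM checkpoint, in the spirit
# of the prompting evaluation described in the entry above. The checkpoint and
# prompt are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "bigscience/bloom-560m"  # small public checkpoint, for illustration
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "Translate to Swahili: Good morning.\nSwahili:"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```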
- AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages [0.021987601456703476]
We present AfroLM, a multilingual language model pretrained from scratch on 23 African languages.
AfroLM is pretrained on a dataset 14x smaller than existing baselines.
It is able to generalize well across various domains.
arXiv Detail & Related papers (2022-11-07T02:15:25Z)
- MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition [55.95128479289923]
African languages are spoken by over a billion people, but are underrepresented in NLP research and development.
We create the largest human-annotated NER dataset for 20 African languages.
We show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points.
arXiv Detail & Related papers (2022-10-22T08:53:14Z)
- A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation [25.05948665615943]
We create a new African news corpus covering 16 languages, of which eight languages are not part of any existing evaluation dataset.
We show that the most effective strategy for transferring both to additional languages and to additional domains is to fine-tune large pre-trained models on small quantities of high-quality translation data.
arXiv Detail & Related papers (2022-05-04T12:11:47Z)
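As a rough illustration of the strategy described in the entry above (fine-tuning a large pretrained model on a small amount of high-quality parallel data), here is a hedged sketch using Hugging Face transformers; the base model, the English-Hausa direction, and the two-sentence toy corpus are assumptions for illustration only.

```python
# Hedged sketch: fine-tune a pretrained multilingual MT model on a tiny
# high-quality parallel set, in the spirit of the entry above. Base model,
# language codes, and the toy corpus are illustrative assumptions.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

name = "facebook/m2m100_418M"            # assumed multilingual base model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)
tok.src_lang, tok.tgt_lang = "en", "ha"  # English -> Hausa (illustrative)

pairs = Dataset.from_dict({              # stand-in for a small news corpus
    "src": ["Good morning.", "Thank you very much."],
    "tgt": ["Barka da safiya.", "Na gode sosai."],
})

def preprocess(batch):
    # Tokenize sources and targets together; targets become "labels".
    return tok(batch["src"], text_target=batch["tgt"],
               truncation=True, max_length=128)

train = pairs.map(preprocess, batched=True, remove_columns=["src", "tgt"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="mt-finetuned",
                                  per_device_train_batch_size=2,
                                  num_train_epochs=3),
    train_dataset=train,
    data_collator=DataCollatorForSeq2Seq(tok, model=model),
)
trainer.train()
```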
- Multilingual Language Model Adaptive Fine-Tuning: A Study on African Languages [19.067718464786463]
We perform multilingual adaptive fine-tuning (MAFT) on the 17 most-resourced African languages and three other high-resource languages widely spoken on the African continent.
To further specialize the multilingual PLM, we remove vocabulary tokens from the embedding layer that correspond to non-African writing scripts before MAFT.
Our approach is competitive with applying LAFT to individual languages while requiring significantly less disk space.
arXiv Detail & Related papers (2022-04-13T16:13:49Z)
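The vocabulary-reduction step described in the entry above can be pictured with a short sketch: before MAFT, drop embedding rows for tokens whose scripts are not needed. The base model and the crude Unicode-range filter below are assumptions for illustration; remapping the tokenizer and resizing the masked-language-modeling head are omitted.

```python
# Hedged sketch of trimming a multilingual PLM's embedding matrix to tokens in
# the scripts of interest before MAFT. Base model and the crude script filter
# are illustrative assumptions; the tokenizer and the MLM output head would
# also need to be remapped/resized (omitted here).
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "xlm-roberta-base"  # assumed multilingual base PLM
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

# Unicode ranges kept in this illustration: Latin, Ge'ez (Ethiopic), Arabic.
KEEP = [(0x0041, 0x024F), (0x1200, 0x137F), (0x0600, 0x06FF)]

def kept(token: str) -> bool:
    text = token.replace("\u2581", "")  # strip the SentencePiece word marker
    return all(not ch.isalpha() or any(lo <= ord(ch) <= hi for lo, hi in KEEP)
               for ch in text)

keep_ids = sorted(i for t, i in tok.get_vocab().items() if kept(t))

old = model.get_input_embeddings().weight.data            # shape (V, d)
new = torch.nn.Embedding(len(keep_ids), old.size(1))
new.weight.data.copy_(old[keep_ids])                      # keep selected rows
model.set_input_embeddings(new)
print(f"kept {len(keep_ids)} of {old.size(0)} embedding rows")
```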
- AfroMT: Pretraining Strategies and Reproducible Benchmarks for Translation of 8 African Languages [94.75849612191546]
AfroMT is a standardized, clean, and reproducible machine translation benchmark for eight widely spoken African languages.
We develop a suite of analysis tools for system diagnosis taking into account the unique properties of these languages.
We demonstrate significant improvements when pretraining on 11 languages, with gains of up to 2 BLEU points over strong baselines.
arXiv Detail & Related papers (2021-09-10T07:45:21Z)