Arabic Stable LM: Adapting Stable LM 2 1.6B to Arabic
- URL: http://arxiv.org/abs/2412.04277v1
- Date: Thu, 05 Dec 2024 15:59:29 GMT
- Title: Arabic Stable LM: Adapting Stable LM 2 1.6B to Arabic
- Authors: Zaid Alyafeai, Michael Pieler, Hannah Teufel, Jonathan Tow, Marco Bellagente, Duy Phung, Nikhil Pinnaparaju, Reshinth Adithyan, Paulo Rocha, Maksym Zhuravinskyi, Carlos Riquelme
- Abstract summary: Large Language Models (LLMs) have shown impressive results in multiple domains of natural language processing (NLP).
This paper introduces Arabic Stable LM 1.6B, in base and chat versions, as a small but powerful Arabic-centric LLM.
Our model achieves impressive results on several benchmarks, outperforming multiple models with up to 8x the parameters.
- Score: 11.027926605631105
- Abstract: Large Language Models (LLMs) have shown impressive results in multiple domains of natural language processing (NLP) but are mainly focused on the English language. Recently, more LLMs have incorporated a larger proportion of multilingual text to represent low-resource languages. In Arabic NLP, several Arabic-centric LLMs have shown remarkable results on multiple benchmarks in the past two years. However, most Arabic LLMs have more than 7 billion parameters, which increases their hardware requirements and inference latency when compared to smaller LLMs. This paper introduces Arabic Stable LM 1.6B, in base and chat versions, as a small but powerful Arabic-centric LLM. Our Arabic Stable LM 1.6B chat model achieves impressive results on several benchmarks, outperforming multiple models with up to 8x the parameters. In addition, we show the benefit of mixing in synthetic instruction tuning data by augmenting our fine-tuning data with a large synthetic dialogue dataset.
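The abstract's last point describes a data-mixing step: augmenting a human-written fine-tuning set with a large synthetic dialogue dataset before instruction tuning. A minimal sketch of that idea is below; the file names, JSONL chat format, and 50/50 mixing ratio are illustrative assumptions, not details taken from the paper.

```python
import json
import random

def load_jsonl(path):
    """Load a JSONL file with one chat example (a JSON object) per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

def mix_datasets(real, synthetic, synthetic_fraction=0.5, seed=0):
    """Return a shuffled fine-tuning mixture: all real examples plus enough
    sampled synthetic examples to reach the requested synthetic fraction."""
    rng = random.Random(seed)
    n_syn = int(len(real) * synthetic_fraction / (1.0 - synthetic_fraction))
    mixture = real + rng.sample(synthetic, min(n_syn, len(synthetic)))
    rng.shuffle(mixture)
    return mixture

# Hypothetical file names for the two data sources.
real_sft = load_jsonl("arabic_sft.jsonl")
synthetic_dialogues = load_jsonl("synthetic_dialogues.jsonl")
train_set = mix_datasets(real_sft, synthetic_dialogues, synthetic_fraction=0.5)
```

With synthetic_fraction=0.5 the mixture contains roughly one synthetic dialogue per real example; the paper reports that mixing helps, but does not commit to this specific ratio.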
Related papers
- AIN: The Arabic INclusive Large Multimodal Model [71.29419186696138]
AIN is an English-Arabic bilingual large multimodal model (LMM) designed to excel in both languages.
AIN demonstrates state-of-the-art Arabic performance, while also possessing strong English-language visual capabilities.
AIN's superior capabilities position it as a significant step toward empowering Arabic speakers with advanced multimodal generative AI tools.
arXiv Detail & Related papers (2025-01-31T18:58:20Z)
- NusaMT-7B: Machine Translation for Low-Resource Indonesian Languages with Large Language Models [2.186901738997927]
This paper introduces NusaMT-7B, an LLM-based machine translation model for low-resource Indonesian languages.
Our approach integrates continued pre-training on monolingual data, Supervised Fine-Tuning (SFT), self-learning, and an LLM-based data cleaner to reduce noise in parallel sentences.
Our results show that fine-tuned LLMs can enhance translation quality for low-resource languages, aiding in linguistic preservation and cross-cultural communication.
arXiv Detail & Related papers (2024-10-10T11:33:25Z)
- Quality or Quantity? On Data Scale and Diversity in Adapting Large Language Models for Low-Resource Translation [62.202893186343935]
We explore what it would take to adapt Large Language Models for low-resource languages.
We show that parallel data is critical during both pre-training and Supervised Fine-Tuning (SFT).
Our experiments with three LLMs across two low-resourced language groups reveal consistent trends, underscoring the generalizability of our findings.
arXiv Detail & Related papers (2024-08-23T00:59:38Z)
- Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that instruction-tuning LLMs on question translation data (i.e., without annotated answers) encourages alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z)
- Beyond English: Evaluating LLMs for Arabic Grammatical Error Correction [19.970419667319046]
Large language models (LLMs) finetuned to follow human instructions have exhibited significant capabilities in English NLP tasks.
We evaluate the abilities of instruction-finetuned LLMs in Arabic grammatical error correction (GEC).
We find that instruction-finetuned models, regardless of their size, are still outperformed by fully finetuned models, even ones of significantly smaller size.
arXiv Detail & Related papers (2023-12-13T05:33:25Z)
- Analyzing Multilingual Competency of LLMs in Multi-Turn Instruction Following: A Case Study of Arabic [1.0878040851638]
We employ GPT-4 as a uniform evaluator for both English and Arabic queries to assess and compare the performance of the LLMs on various open-ended tasks.
We find that base models fine-tuned on multilingual and multi-turn datasets can be competitive with models trained from scratch on multilingual data.
arXiv Detail & Related papers (2023-10-23T11:40:04Z)
- The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants [80.4837840962273]
We present Belebele, a dataset spanning 122 language variants.
This dataset enables the evaluation of text models in high-, medium-, and low-resource languages (a loading sketch appears after this list).
arXiv Detail & Related papers (2023-08-31T17:43:08Z)
- Extrapolating Large Language Models to Non-English by Aligning Languages [109.09051737966178]
Existing large language models show disparate capability across different languages.
In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages.
arXiv Detail & Related papers (2023-08-09T13:32:06Z)
- ChatGPT for Arabic Grammatical Error Correction [5.945320097465418]
Large language models (LLMs) fine-tuned to follow human instructions have exhibited significant capabilities in English NLP tasks.
In this paper, we delve into the abilities of instruction fine-tuned LLMs in Arabic GEC, a task made complex by Arabic's rich morphology.
We find that instruction fine-tuned models, regardless of their size, significantly underperform fully fine-tuned models of much smaller sizes.
arXiv Detail & Related papers (2023-08-08T18:00:39Z)
- Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis [103.89753784762445]
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT).
This paper systematically investigates the advantages and challenges of LLMs for MMT.
We thoroughly evaluate eight popular LLMs, including ChatGPT and GPT-4.
arXiv Detail & Related papers (2023-04-10T15:51:30Z)
- A Large and Diverse Arabic Corpus for Language Modeling [0.0]
This work elaborates on the design and development of a large Arabic corpus.
It consists of over 500 GB of cleaned Arabic text targeted at improving cross-domain knowledge.
To evaluate the effectiveness of the LM, it is fine-tuned on a number of typical NLP tasks.
arXiv Detail & Related papers (2022-01-23T11:17:53Z)
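For the Belebele entry above, a minimal loading sketch using the Hugging Face datasets library is shown below. The dataset id facebook/belebele, the arb_Arab (Modern Standard Arabic) config name, and the field names follow Belebele's public release, but they are assumptions here and should be verified against the dataset card.

```python
from datasets import load_dataset

# Load Belebele's Modern Standard Arabic split. The "arb_Arab" config name
# follows the FLORES-200 language codes used by Belebele; verify on the Hub.
belebele_ar = load_dataset("facebook/belebele", "arb_Arab", split="test")

example = belebele_ar[0]
# Each row pairs a passage with a question and four answer options.
print(example["flores_passage"][:120])
print(example["question"])
print([example[f"mc_answer{i}"] for i in range(1, 5)])
print("gold answer index:", example["correct_answer_num"])
```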