Assessing Translation capabilities of Large Language Models involving
English and Indian Languages
- URL: http://arxiv.org/abs/2311.09216v1
- Date: Wed, 15 Nov 2023 18:58:19 GMT
- Title: Assessing Translation capabilities of Large Language Models involving
English and Indian Languages
- Authors: Vandan Mujadia, Ashok Urlana, Yash Bhaskar, Penumalla Aditya Pavani,
Kukkapalli Shravya, Parameswari Krishnamurthy and Dipti Misra Sharma
- Abstract summary: We explore the multilingual capabilities of large language models by using machine translation as a task involving English and 22 Indian languages.
We fine-tune these large language models using parameter efficient fine-tuning methods such as LoRA and additionally with full fine-tuning.
Our results demonstrate significant progress, with average BLEU scores of 13.42, 15.93, 12.13, 12.30, and 12.07, as well as chrF scores of 43.98, 46.99, 42.55, 42.42, and 45.39, across five English-to-Indian-language test sets.
- Score: 4.067706269490143
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Large Language Models (LLMs) have achieved remarkable advancements
in various NLP tasks. In this work, our aim is to explore the multilingual
capabilities of large language models by using machine translation as a task
involving English and 22 Indian languages. We first investigate the translation
capabilities of raw large language models, followed by exploring the in-context
learning capabilities of the same raw models. We fine-tune these large language
models using parameter efficient fine-tuning methods such as LoRA and
additionally with full fine-tuning. Through our study, we have identified the
best performing large language model for this translation task, which is based
on LLaMA.
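As a rough illustration of the parameter-efficient setup described above, the sketch below wraps a causal LLaMA-style checkpoint with a LoRA adapter using the Hugging Face transformers and peft libraries. The checkpoint name, LoRA hyperparameters, and prompt template are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of LoRA-based fine-tuning for English->Indic translation.
# Checkpoint name, LoRA hyperparameters, and prompt format are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_model = "meta-llama/Llama-2-13b-hf"  # assumed LLaMA-13b-class checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                  # low-rank dimension (assumed)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only the LoRA adapters are trainable

# An instruction-style training/inference prompt for one sentence pair
# (the exact prompt wording used in the paper is not specified here):
prompt = (
    "Translate from English to Hindi:\n"
    "English: How are you?\n"
    "Hindi: "
)
```

With this setup, only a small fraction of the model's parameters are updated during fine-tuning, which is what makes LoRA attractive compared to full fine-tuning of a 13B-parameter model.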
Our results demonstrate significant progress, with average BLEU scores of
13.42, 15.93, 12.13, 12.30, and 12.07, as well as chrF scores of 43.98, 46.99,
42.55, 42.42, and 45.39, respectively, using 2-stage fine-tuned LLaMA-13b for
English to Indian languages on IN22 (conversational), IN22 (general),
flores200-dev, flores200-devtest, and newstest2019 testsets. Similarly, for
Indian languages to English, we achieved average BLEU scores of 14.03, 16.65,
16.17, 15.35 and 12.55 along with chrF scores of 36.71, 40.44, 40.26, 39.51,
and 36.20, respectively, using fine-tuned LLaMA-13b on IN22 (conversational),
IN22 (general), flores200-dev, flores200-devtest, and newstest2019 testsets.
Overall, our findings highlight the potential and strength of large language
models for machine translation, including for languages that are currently
underrepresented in LLMs.
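For readers who want to reproduce the style of evaluation reported above, the following minimal sketch computes corpus-level BLEU and chrF with the sacrebleu library. The hypothesis and reference strings are placeholders; the paper reports scores on IN22, flores200-dev/devtest, and newstest2019.

```python
# Minimal sketch of BLEU / chrF scoring with sacrebleu.
# Hypotheses and references below are placeholders, not the paper's data.
import sacrebleu

hypotheses = [
    "यह एक उदाहरण वाक्य है।",   # system translations, one per test sentence
]
references = [
    ["यह एक उदाहरण वाक्य है।"],  # one list per reference set, one reference per hypothesis
]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}  chrF: {chrf.score:.2f}")
```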
Related papers
- Machine Translation Advancements of Low-Resource Indian Languages by Transfer Learning [9.373815852241648]
We employ two distinct knowledge transfer strategies to develop a reliable machine translation system for low-resource Indian languages.
For Assamese (as) and Manipuri (mn), we fine-tuned the existing IndicTrans2 open-source model to enable bidirectional translation between English and these languages.
For Khasi (kh) and Mizo (mz), we trained a multilingual model as a baseline using bilingual data from these four language pairs, along with about 8kw of additional English-Bengali bilingual data.
arXiv Detail & Related papers (2024-09-24T08:53:19Z)
- Paramanu: A Family of Novel Efficient Generative Foundation Language Models for Indian Languages [3.9018931027384056]
We present "Paramanu", a family of novel language models (LM) for Indian languages.
It covers 10 languages (Assamese, Bangla, Hindi, Konkani, Maithili, Marathi, Odia, Sanskrit, Tamil, Telugu) across 5 scripts.
The models are pretrained on a single GPU with context size of 1024 and vary in size from 13.29 million (M) to 367.5 M parameters.
arXiv Detail & Related papers (2024-01-31T17:58:10Z)
- Baichuan 2: Open Large-scale Language Models [51.56361715162972]
We present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens.
Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval.
arXiv Detail & Related papers (2023-09-19T04:13:22Z)
- Extrapolating Large Language Models to Non-English by Aligning Languages [109.09051737966178]
Existing large language models show disparate capability across different languages.
In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages.
arXiv Detail & Related papers (2023-08-09T13:32:06Z)
- PolyLM: An Open Source Polyglot Large Language Model [57.64420154135178]
We present PolyLM, a multilingual large language model (LLM) trained on 640 billion (B) tokens, available in two model sizes: 1.7B and 13B.
To enhance its multilingual capabilities, we 1) integrate bilingual data into training data; and 2) adopt a curriculum learning strategy that increases the proportion of non-English data from 30% in the first stage to 60% in the final stage during pre-training.
Further, we propose a multilingual self-instruct method which automatically generates 132.7K diverse multilingual instructions for model fine-tuning.
arXiv Detail & Related papers (2023-07-12T09:00:37Z)
- Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM [8.858671209228536]
We focus on BLOOM's multilingual ability by evaluating its machine translation performance across several datasets.
We study several aspects including prompt design, model sizes, cross-lingual transfer and the use of discursive context.
arXiv Detail & Related papers (2023-03-03T13:23:42Z)
- Massively Multilingual Shallow Fusion with Large Language Models [62.76735265311028]
We train a single multilingual language model (LM) for shallow fusion in multiple languages.
Compared to a dense LM of similar computation during inference, GLaM reduces the WER of an English long-tail test set by 4.4% relative.
In a multilingual shallow fusion task, GLaM improves 41 out of 50 languages with an average relative WER reduction of 3.85%, and a maximum reduction of 10%.
arXiv Detail & Related papers (2023-02-17T14:46:38Z)
- SERENGETI: Massively Multilingual Language Models for Africa [5.945320097465418]
We develop SERENGETI, a massively multilingual language model that covers 517 African languages and language varieties.
We evaluate our novel models on eight natural language understanding tasks across 20 datasets, comparing to 4 mPLMs that cover 4-23 African languages.
arXiv Detail & Related papers (2022-12-21T05:54:14Z)
- Crosslingual Generalization through Multitask Finetuning [80.8822603322471]
Multitask prompted finetuning (MTF) has been shown to help large language models generalize to new tasks in a zero-shot setting.
We apply MTF to the pretrained multilingual BLOOM and mT5 model families to produce finetuned variants called BLOOMZ and mT0.
We find finetuning large multilingual language models on English tasks with English prompts allows for task generalization to non-English languages.
arXiv Detail & Related papers (2022-11-03T13:19:32Z)
- IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages [16.121708272597154]
We release the IndicSUPERB benchmark for speech recognition in 12 Indian languages.
We train and evaluate different self-supervised models alongside a commonly used baseline benchmark.
We show that language-specific fine-tuned models are more accurate than the baseline on most tasks.
arXiv Detail & Related papers (2022-08-24T20:14:52Z)
- Few-shot Learning with Multilingual Language Models [66.49496434282564]
We train multilingual autoregressive language models on a balanced corpus covering a diverse set of languages.
Our largest model sets new state of the art in few-shot learning in more than 20 representative languages.
We present a detailed analysis of where the model succeeds and fails, showing in particular that it enables cross-lingual in-context learning.
arXiv Detail & Related papers (2021-12-20T16:52:35Z)