Empowering Cross-lingual Abilities of Instruction-tuned Large Language
Models by Translation-following demonstrations
- URL: http://arxiv.org/abs/2308.14186v1
- Date: Sun, 27 Aug 2023 19:22:12 GMT
- Title: Empowering Cross-lingual Abilities of Instruction-tuned Large Language
Models by Translation-following demonstrations
- Authors: Leonardo Ranaldi, Giulia Pucci, Andre Freitas
- Abstract summary: We propose CrossAlpaca, an It-LLM tuned on cross-lingual instruction-following and Translation-following demonstrations.
Our models, tested over six different languages, outperform the It-LLMs tuned on monolingual data.
- Score: 0.8133739801185272
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The language ability of Large Language Models (LLMs) is often unbalanced
towards English because of the imbalance in the distribution of the
pre-training data. This disparity carries over into further fine-tuning and
affects the cross-lingual abilities of LLMs. In this paper, we propose to
empower Instruction-tuned LLMs (It-LLMs) in languages other than English by
building semantic alignment between them. Hence, we propose CrossAlpaca, an
It-LLM with cross-lingual instruction-following and Translation-following
demonstrations to improve semantic alignment between languages. We validate our
approach on the multilingual Question Answering (QA) benchmarks XQUAD and MLQA
and adapted versions of MMLU and BBH. Our models, tested over six different
languages, outperform the It-LLMs tuned on monolingual data. The final results
show that instruction tuning on non-English data is not enough and that
semantic alignment can be further improved by Translation-following
demonstrations.
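To make the two kinds of demonstrations concrete, the following is a minimal sketch of what Alpaca-style training records for them could look like. This is an illustrative assumption rather than the paper's released data format: the field names ("instruction", "input", "output"), the prompt wording, and the Italian example are hypothetical.
```python
# Minimal sketch (assumed schema, not the paper's released format) of the two
# kinds of supervised demonstrations described in the abstract, serialized as
# Alpaca-style JSON records.
import json

# Cross-lingual instruction-following demonstration: the instruction and the
# expected answer are both in the non-English target language (here, Italian).
instruction_following_demo = {
    "instruction": "Elenca tre capitali europee.",  # "List three European capitals."
    "input": "",
    "output": "Roma, Parigi e Berlino sono tre capitali europee.",
}

# Translation-following demonstration: the instruction explicitly asks the
# model to translate between English and the target language.
translation_following_demo = {
    "instruction": "Translate the following sentence from English to Italian.",
    "input": "Large language models are often biased towards English.",
    "output": "I modelli linguistici di grandi dimensioni sono spesso sbilanciati verso l'inglese.",
}

# Both kinds of records would be mixed into one instruction-tuning corpus.
print(json.dumps([instruction_following_demo, translation_following_demo],
                 ensure_ascii=False, indent=2))
```
The point of the translation-following records is that they explicitly pair English text with its target-language counterpart, which is the signal the abstract credits for improving semantic alignment beyond what monolingual non-English instruction tuning provides.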
Related papers
- Pruning Multilingual Large Language Models for Multilingual Inference [28.36717615166238]
This study explores how to enhance the zero-shot performance of MLLMs in non-English languages.
We first analyze the behavior of MLLMs when performing translation and reveal that there are large magnitude features that play a critical role in the translation process.
arXiv Detail & Related papers (2024-09-25T13:15:50Z)
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
- Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that instruction-tuning LLMs on question translation data (i.e., without annotated answers) encourages alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z)
- The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights [108.40766216456413]
We propose a question alignment framework to bridge the gap between large language models' English and non-English performance.
Experiment results show it can boost multilingual performance across diverse reasoning scenarios, model families, and sizes.
We analyze representation space, generated response and data scales, and reveal how question translation training strengthens language alignment within LLMs.
arXiv Detail & Related papers (2024-05-02T14:49:50Z)
- PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning [46.153828074152436]
We propose a pivot language guided generation approach to enhance instruction tuning in lower-resource languages.
It trains the model to first process instructions in the pivot language, and then produce responses in the target language.
Our approach improves the instruction-following abilities of LLMs by 29% on average.
arXiv Detail & Related papers (2023-11-15T05:28:07Z)
- Extrapolating Large Language Models to Non-English by Aligning Languages [109.09051737966178]
Existing large language models show disparate capability across different languages.
In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages.
arXiv Detail & Related papers (2023-08-09T13:32:06Z)
- Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions [68.01449013641532]
Large-scale Pretrained Language Models (LLMs) have shown strong abilities in multilingual translation.
We present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation.
arXiv Detail & Related papers (2023-05-24T12:00:24Z)
- InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning [66.31509106146605]
Large language models (LLMs) that are tuned with instructions have demonstrated remarkable capabilities in various tasks and languages.
However, their ability to generalize to underrepresented languages is limited due to the scarcity of available data.
We propose InstructAlign, which uses continual crosslingual instruction tuning to enable LLMs to align new, unseen languages with previously learned high-resource languages.
arXiv Detail & Related papers (2023-05-23T02:51:34Z)