Not All Languages Are Created Equal in LLMs: Improving Multilingual
Capability by Cross-Lingual-Thought Prompting
- URL: http://arxiv.org/abs/2305.07004v2
- Date: Sun, 22 Oct 2023 13:43:27 GMT
- Title: Not All Languages Are Created Equal in LLMs: Improving Multilingual
Capability by Cross-Lingual-Thought Prompting
- Authors: Haoyang Huang, Tianyi Tang, Dongdong Zhang, Wayne Xin Zhao, Ting Song,
Yan Xia, Furu Wei
- Abstract summary: Large language models (LLMs) demonstrate impressive multilingual capability, but their performance varies substantially across different languages.
We introduce a simple yet effective method, called cross-lingual-thought prompting (XLT).
XLT is a generic template prompt that stimulates cross-lingual and logical reasoning skills to enhance task performance across languages.
- Score: 123.16452714740106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) demonstrate impressive multilingual capability,
but their performance varies substantially across different languages. In this
work, we introduce a simple yet effective method, called cross-lingual-thought
prompting (XLT), to systematically improve the multilingual capability of LLMs.
Specifically, XLT is a generic template prompt that stimulates cross-lingual
and logical reasoning skills to enhance task performance across languages. We
conduct comprehensive evaluations on 7 typical benchmarks related to reasoning,
understanding, and generation tasks, covering both high-resource and
low-resource languages. Experimental results show that XLT not only remarkably
enhances the performance of various multilingual tasks but also significantly
reduces the gap between the average performance and the best performance of
each task in different languages. Notably, XLT brings over 10 points of average
improvement in arithmetic reasoning and open-domain question-answering tasks.
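For concreteness, here is a minimal sketch of what an XLT-style template might look like in Python: it assigns the model a role, asks it to retell the request in English, and requests step-by-step reasoning before a formatted answer. The exact instruction wording and the build_xlt_prompt helper are illustrative assumptions based on the abstract's description, not the paper's verbatim template.

```python
# A minimal sketch of a cross-lingual-thought (XLT) style prompt.
# The instruction wording below is illustrative, not the paper's
# verbatim template.

def build_xlt_prompt(request: str, task: str, language: str) -> str:
    """Wrap a non-English request in a generic cross-lingual-thought template."""
    return (
        f"I want you to act as a {task} expert for {language}.\n"
        f"Request: {request}\n"
        "You should retell the request in English.\n"
        "You should think about the request step by step in English.\n"
        "You should answer the request step by step.\n"
        "You should tell me the final answer in the format 'Answer: <answer>'."
    )

# Example: an arithmetic reasoning question posed in German.
print(build_xlt_prompt(
    request="Anna hat 3 Äpfel und kauft 5 weitere. Wie viele Äpfel hat sie jetzt?",
    task="arithmetic reasoning",
    language="German",
))
```

The design point is that the template itself is language-agnostic: only the embedded request changes, while the reasoning is steered into English, where the model tends to be strongest.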
Related papers
- Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following [51.18383180774354]
We introduce Multi-IF, a new benchmark designed to assess Large Language Models' proficiency in following multi-turn and multilingual instructions.
Our evaluation of 14 state-of-the-art LLMs on Multi-IF reveals that it presents a significantly more challenging task than existing benchmarks.
Languages with non-Latin scripts (Hindi, Russian, and Chinese) generally exhibit higher error rates, suggesting potential limitations in the models' multilingual capabilities.
arXiv Detail & Related papers (2024-10-21T00:59:47Z)
- Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that LLMs instruction-tuned on question translation data alone (i.e., without annotated answers) encourage alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z)
- Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation [25.850573463743352]
Large-scale multilingual Pretrained Language Models (mPLMs) yield impressive performance on cross-language tasks.
Yet significant performance disparities exist across different languages within the same mPLM.
We introduce ALSACE to leverage the learned knowledge from the well-performing languages to guide under-performing ones within the same mPLM.
arXiv Detail & Related papers (2024-04-12T14:19:16Z)
- Analysis of Multi-Source Language Training in Cross-Lingual Transfer [6.992785466925966]
Cross-lingual transfer (XLT) methods help address the data scarcity problem in low-resource languages.
We show that using multiple source languages in XLT, a technique we term Multi-Source Language Training (MSLT), leads to increased mingling of the embedding spaces of different languages.
On the other hand, we discover that using an arbitrary combination of source languages does not always guarantee better performance.
arXiv Detail & Related papers (2024-02-21T06:37:07Z)
- Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages [60.162717568496355]
Large language models (LLMs) have been pre-trained on multilingual corpora.
Yet their performance in most languages still lags behind that of a few resource-rich languages.
arXiv Detail & Related papers (2024-02-19T15:07:32Z)
- Breaking Language Barriers with a LEAP: Learning Strategies for Polyglot LLMs [5.682384717239095]
Large language models (LLMs) are at the forefront of transforming numerous domains globally.
This paper tackles the pressing challenge of enhancing the multilingual performance of LLMs.
We present novel techniques that unlock the true potential of LLMs in a polyglot landscape.
arXiv Detail & Related papers (2023-05-28T14:48:38Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel, large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization [128.37244072182506]
The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark evaluates the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)