Breaking Language Barriers with a LEAP: Learning Strategies for Polyglot
LLMs
- URL: http://arxiv.org/abs/2305.17740v1
- Date: Sun, 28 May 2023 14:48:38 GMT
- Title: Breaking Language Barriers with a LEAP: Learning Strategies for Polyglot
LLMs
- Authors: Akshay Nambi, Vaibhav Balloli, Mercy Ranjit, Tanuja Ganu, Kabir Ahuja,
Sunayana Sitaram, Kalika Bali
- Abstract summary: Large language models (LLMs) are at the forefront of transforming numerous domains globally.
This paper tackles the imperative challenge of enhancing the multilingual performance of LLMs.
We present novel techniques that unlock the true potential of LLMs in a polyglot landscape.
- Score: 5.682384717239095
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are at the forefront of transforming numerous
domains globally. However, their inclusivity and effectiveness remain limited
for non-Latin scripts and low-resource languages. This paper tackles the
imperative challenge of enhancing the multilingual performance of LLMs,
specifically focusing on Generative models. Through systematic investigation
and evaluation of diverse languages using popular question-answering (QA)
datasets, we present novel techniques that unlock the true potential of LLMs in
a polyglot landscape. Our approach encompasses three key strategies that yield
remarkable improvements in multilingual proficiency. First, by meticulously
optimizing prompts tailored for polyglot LLMs, we unlock their latent
capabilities, resulting in substantial performance boosts across languages.
Second, we introduce a new hybrid approach that synergizes GPT generation with
multilingual embeddings and achieves significant multilingual performance
improvement on critical tasks like QA and retrieval. Finally, to further propel
the performance of polyglot LLMs, we introduce a novel learning algorithm that
dynamically selects the optimal prompt strategy, LLM model, and embeddings per
query. This dynamic adaptation maximizes the efficacy of LLMs across languages,
outperforming best static and random strategies. Our results show substantial
advancements in multilingual understanding and generation across a diverse
range of languages.
Related papers
- Teaching LLMs to Abstain across Languages via Multilingual Feedback [40.84205285309612]
We show that multilingual feedback helps identify knowledge gaps across diverse languages, cultures, and communities.
Extensive experiments demonstrate that our multilingual feedback approach outperforms various strong baselines.
Further analysis reveals that multilingual feedback is both an effective and a more equitable abstain strategy to serve diverse language speakers.
arXiv Detail & Related papers (2024-06-22T21:59:12Z) - Exploring Design Choices for Building Language-Specific LLMs [36.32622880071991]
We study building language-specific language models by adapting monolingual and multilingual models.
We find that the initial performance before the adaptation is not always indicative of the final performance.
The optimal adaptation method is highly language-dependent, and the simplest approach works well across various experimental settings.
arXiv Detail & Related papers (2024-06-20T18:47:43Z) - Towards Truthful Multilingual Large Language Models: Benchmarking and Alignment Strategies [38.3269908062146]
We construct a benchmark for truthfulness evaluation in multilingual scenarios.
We propose Fact-aware Multilingual Selective Synergy (FaMSS) to optimize the data allocation across a large number of languages.
arXiv Detail & Related papers (2024-06-20T15:59:07Z) - Bridging the Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs [15.911445732909849]
Large language models (LLMs) are at the forefront of transforming numerous domains globally.
However, their inclusivity and effectiveness remain limited for non-Latin scripts and low-resource languages.
This paper tackles the imperative challenge of enhancing the multilingual performance of LLMs without extensive training or fine-tuning.
arXiv Detail & Related papers (2024-05-28T16:56:42Z) - Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that LLMs instruction-tuned on the question translation data (i.e. without annotated answers) are able to encourage the alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z) - Analyzing and Adapting Large Language Models for Few-Shot Multilingual
NLU: Are We There Yet? [82.02076369811402]
Supervised fine-tuning (SFT), supervised instruction tuning (SIT) and in-context learning (ICL) are three alternative, de facto standard approaches to few-shot learning.
We present an extensive and systematic comparison of the three approaches, testing them on 6 high- and low-resource languages, three different NLU tasks, and a myriad of language and domain setups.
Our observations show that supervised instruction tuning has the best trade-off between performance and resource requirements.
arXiv Detail & Related papers (2024-03-04T10:48:13Z) - Enhancing Multilingual Capabilities of Large Language Models through
Self-Distillation from Resource-Rich Languages [60.162717568496355]
Large language models (LLMs) have been pre-trained on multilingual corpora.
Their performance still lags behind in most languages compared to a few resource-rich languages.
arXiv Detail & Related papers (2024-02-19T15:07:32Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - Okapi: Instruction-tuned Large Language Models in Multiple Languages
with Reinforcement Learning from Human Feedback [61.83548032416181]
We present Okapi, the first system with instruction-tuned LLMs based on RLHF for multiple languages.
Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate the experiments and development of future multilingual LLM research.
arXiv Detail & Related papers (2023-07-29T18:01:46Z) - Not All Languages Are Created Equal in LLMs: Improving Multilingual
Capability by Cross-Lingual-Thought Prompting [123.16452714740106]
Large language models (LLMs) demonstrate impressive multilingual capability, but their performance varies substantially across different languages.
We introduce a simple yet effective method, called cross-lingual-thought prompting (XLT)
XLT is a generic template prompt that stimulates cross-lingual and logical reasoning skills to enhance task performance across languages.
arXiv Detail & Related papers (2023-05-11T17:44:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.