Bridging the Language Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs
- URL: http://arxiv.org/abs/2305.17740v2
- Date: Tue, 07 Jan 2025 04:03:46 GMT
- Title: Bridging the Language Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs
- Authors: Somnath Kumar, Vaibhav Balloli, Mercy Ranjit, Kabir Ahuja, Sunayana Sitaram, Kalika Bali, Tanuja Ganu, Akshay Nambi
- Abstract summary: Large language models (LLMs) have revolutionized various domains but still struggle with non-Latin scripts and low-resource languages. We introduce a novel dynamic learning approach that optimizes the prompt strategy, embedding model, and LLM per query at runtime. We show our approach results in 10-15% improvements in multilingual performance over pre-trained models and 4x gains compared to fine-tuned, language-specific models.
- Score: 15.911445732909849
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have revolutionized various domains but still struggle with non-Latin scripts and low-resource languages. This paper addresses the critical challenge of improving multilingual performance without extensive fine-tuning. We introduce a novel dynamic learning approach that optimizes the prompt strategy, embedding model, and LLM per query at runtime. By adapting configurations dynamically, our method achieves significant improvements over static, best, and random baselines. It operates efficiently in both offline and online settings, generalizing seamlessly across new languages and datasets. Leveraging Retrieval-Augmented Generation (RAG) with state-of-the-art multilingual embeddings, we achieve superior task performance across diverse linguistic contexts. Through systematic investigation and evaluation across 18 diverse languages using popular question-answering (QA) datasets, we show our approach results in 10-15% improvements in multilingual performance over pre-trained models and 4x gains compared to fine-tuned, language-specific models.
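As a rough illustration of the per-query dynamic selection idea described in the abstract (not the authors' published code), here is a minimal Python sketch; every identifier and candidate name below is an assumption, and the toy scorer stands in for the learned offline/online policy:

```python
# Minimal sketch of per-query dynamic configuration selection. All names
# below are illustrative assumptions; the paper does not publish this API.
from itertools import product

PROMPT_STRATEGIES = ("monolingual", "translate-test", "cross-lingual-CoT")
EMBEDDING_MODELS = ("multilingual-e5-large", "LaBSE")  # assumed candidates
LLMS = ("llm-a", "llm-b")                              # assumed candidates

def score(query: str, cfg: tuple[str, str, str]) -> float:
    """Stub scorer. In the paper, this role is played by a policy learned
    offline from labeled data or online from runtime feedback."""
    prompt_strategy, _embedder, _llm = cfg
    # Toy heuristic for the sketch: prefer translate-test when the query
    # contains characters outside the extended Latin range.
    non_latin = any(ord(ch) > 0x024F for ch in query)
    return float(non_latin == (prompt_strategy == "translate-test"))

def select_config(query: str) -> tuple[str, str, str]:
    """Score every (prompt, embedding model, LLM) triple; return the best."""
    return max(product(PROMPT_STRATEGIES, EMBEDDING_MODELS, LLMS),
               key=lambda cfg: score(query, cfg))

print(select_config("Where is the library?"))  # picks a monolingual config
print(select_config("पुस्तकालय कहाँ है?"))       # picks a translate-test config
```

The key design point the abstract emphasizes is that the configuration is chosen per query at runtime rather than fixed once for all inputs.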
Related papers
- Enhancing Multilingual LLM Pretraining with Model-Based Data Selection [33.68104398807581]
We propose a model-based filtering framework for multilingual datasets.
Our approach emphasizes transparency, simplicity, and efficiency.
We extend our framework to 20 languages for which we release the refined pretraining datasets.
arXiv Detail & Related papers (2025-02-14T18:42:07Z)
- LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models [89.13128402847943]
We present LUSIFER, a novel zero-shot approach that adapts LLM-based embedding models for multilingual tasks without requiring multilingual supervision.
LUSIFER's architecture combines a multilingual encoder, serving as a language-universal learner, with an LLM-based embedding model optimized for embedding-specific tasks.
We introduce a new benchmark encompassing 5 primary embedding tasks, 123 diverse datasets, and coverage across 14 languages.
arXiv Detail & Related papers (2025-01-01T15:43:07Z)
- Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention [71.12193680015622]
Large Language Models (LLMs) have shown remarkable capabilities in natural language processing.
LLMs exhibit significant performance gaps among different languages.
We propose Inference-Time Cross-Lingual Intervention (INCLINE) to overcome these limitations without incurring significant costs.
arXiv Detail & Related papers (2024-10-16T11:23:03Z)
- Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhancing the multilingual capabilities of large language models (LLMs).
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z)
- Multilingual Prompts in LLM-Based Recommenders: Performance Across Languages [0.0]
This work explores the impact of non-English prompts on recommendation performance.
Evaluation on three real-world datasets, namely ML1M, LastFM, and Amazon-Beauty, showed that using non-English prompts generally reduces performance.
Retraining with multilingual prompts resulted in more balanced performance across languages, but slightly reduced English performance.
arXiv Detail & Related papers (2024-09-11T20:31:42Z)
- Teaching LLMs to Abstain across Languages via Multilingual Feedback [40.84205285309612]
We show that multilingual feedback helps identify knowledge gaps across diverse languages, cultures, and communities.
Extensive experiments demonstrate that our multilingual feedback approach outperforms various strong baselines.
Further analysis reveals that multilingual feedback is both an effective and a more equitable abstain strategy to serve diverse language speakers.
arXiv Detail & Related papers (2024-06-22T21:59:12Z)
- Exploring Design Choices for Building Language-Specific LLMs [36.32622880071991]
We study building language-specific language models by adapting monolingual and multilingual models.
We find that the initial performance of an LLM does not always correlate with its final performance after adaptation.
arXiv Detail & Related papers (2024-06-20T18:47:43Z)
- Bridging the Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs [15.911445732909849]
Large language models (LLMs) are at the forefront of transforming numerous domains globally.
However, their inclusivity and effectiveness remain limited for non-Latin scripts and low-resource languages.
This paper tackles the imperative challenge of enhancing the multilingual performance of LLMs without extensive training or fine-tuning.
arXiv Detail & Related papers (2024-05-28T16:56:42Z)
- Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that LLMs instruction-tuned on question translation data (i.e., without annotated answers) encourage alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z)
- Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages [60.162717568496355]
Large language models (LLMs) have been pre-trained on multilingual corpora.
However, their performance in most languages still lags behind that in a few resource-rich languages.
arXiv Detail & Related papers (2024-02-19T15:07:32Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback [61.83548032416181]
We present Okapi, the first system with instruction-tuned LLMs based on RLHF for multiple languages.
Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate the experiments and development of future multilingual LLM research.
arXiv Detail & Related papers (2023-07-29T18:01:46Z)
- Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting [123.16452714740106]
Large language models (LLMs) demonstrate impressive multilingual capability, but their performance varies substantially across different languages.
We introduce a simple yet effective method called cross-lingual-thought prompting (XLT).
XLT is a generic template prompt that stimulates cross-lingual and logical reasoning skills to enhance task performance across languages.
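As an illustrative aside, here is a minimal Python sketch of what such a cross-lingual-thought template could look like; the wording is a paraphrase and an assumption, not the template published in the XLT paper:

```python
# Hypothetical template in the spirit of cross-lingual-thought prompting;
# the actual XLT template wording differs -- this is only a paraphrase.
XLT_STYLE_TEMPLATE = """\
I want you to act as an expert in {language}.
Request: {request}
Step 1: Restate the request in English.
Step 2: Reason through the task step by step in English.
Step 3: Reply with the final answer in the format "Answer: <answer>"."""

prompt = XLT_STYLE_TEMPLATE.format(
    language="Hindi",
    request="पुस्तकालय कहाँ है?",  # "Where is the library?"
)
print(prompt)
```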
arXiv Detail & Related papers (2023-05-11T17:44:17Z)
- Specializing Multilingual Language Models: An Empirical Study [50.7526245872855]
Contextualized word representations from pretrained multilingual language models have become the de facto standard for addressing natural language tasks.
For languages rarely or never seen by these models, directly using such models often results in suboptimal representation or use of data.
arXiv Detail & Related papers (2021-06-16T18:13:55Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)