Language Surgery in Multilingual Large Language Models
- URL: http://arxiv.org/abs/2506.12450v1
- Date: Sat, 14 Jun 2025 11:09:50 GMT
- Title: Language Surgery in Multilingual Large Language Models
- Authors: Joanito Agili Lopo, Muhammad Ravi Shulthan Habibi, Tack Hwa Wong, Muhammad Ilham Ghozali, Fajri Koto, Genta Indra Winata, Peerat Limkonchotiwat, Alham Fikri Aji, Samuel Cahyawijaya,
- Abstract summary: Large Language Models (LLMs) have demonstrated remarkable generalization capabilities across tasks and languages.<n>This paper investigates the naturally emerging representation alignment in LLMs, particularly in the middle layers.<n>We propose Inference-Time Language Control (ITLC) to enable precise cross-lingual language control and mitigate language confusion.
- Score: 32.77326546076424
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have demonstrated remarkable generalization capabilities across tasks and languages, revolutionizing natural language processing. This paper investigates the naturally emerging representation alignment in LLMs, particularly in the middle layers, and its implications for disentangling language-specific and language-agnostic information. We empirically confirm the existence of this alignment, analyze its behavior in comparison to explicitly designed alignment models, and demonstrate its potential for language-specific manipulation without semantic degradation. Building on these findings, we propose Inference-Time Language Control (ITLC), a novel method that leverages latent injection to enable precise cross-lingual language control and mitigate language confusion in LLMs. Our experiments highlight ITLC's strong cross-lingual control capabilities while preserving semantic integrity in target languages. Furthermore, we demonstrate its effectiveness in alleviating the cross-lingual language confusion problem, which persists even in current large-scale LLMs, leading to inconsistent language generation. This work advances our understanding of representation alignment in LLMs and introduces a practical solution for enhancing their cross-lingual performance.
Related papers
- The Emergence of Abstract Thought in Large Language Models Beyond Any Language [95.50197866832772]
Large language models (LLMs) function effectively across a diverse range of languages.<n>Preliminary studies observe that the hidden activations of LLMs often resemble English, even when responding to non-English prompts.<n>Recent results show strong multilingual performance, even surpassing English performance on specific tasks in other languages.
arXiv Detail & Related papers (2025-06-11T16:00:54Z) - Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models [55.14276067678253]
This paper introduces a novel methodology for efficiently identifying inherent cross-lingual weaknesses in Large Language Models (LLMs)<n>We construct a new dataset of over 6,000 bilingual pairs across 16 languages using this methodology, demonstrating its effectiveness in revealing weaknesses even in state-of-the-art models.<n>Further experiments investigate the relationship between linguistic similarity and cross-lingual weaknesses, revealing that linguistically related languages share similar performance patterns.
arXiv Detail & Related papers (2025-05-24T12:31:27Z) - When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners [111.50503126693444]
We show that language-specific ablation consistently boosts multilingual reasoning performance.<n>Compared to post-training, our training-free ablation achieves comparable or superior results with minimal computational overhead.
arXiv Detail & Related papers (2025-05-21T08:35:05Z) - The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context [0.9130277390156759]
Alignment tuning has enabled large language models to excel in reasoning, instruction-following, and minimizing harmful generations.<n>Despite their widespread deployment, these models exhibit a monolingual bias, raising concerns about the effectiveness of alignment across languages.<n>Current alignment methods predominantly focus on English, leaving it unclear how alignment mechanism generalizes to multilingual settings.
arXiv Detail & Related papers (2025-04-03T15:46:46Z) - Linguistic Blind Spots of Large Language Models [14.755831733659699]
We study the performance of recent large language models (LLMs) on linguistic annotation tasks.<n>We find that recent LLMs show limited efficacy in addressing linguistic queries and often struggle with linguistically complex inputs.<n>Our results provide insights to inform future advancements in LLM design and development.
arXiv Detail & Related papers (2025-03-25T01:47:13Z) - Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis [5.029635172046762]
Language Confusion is a phenomenon where Large Language Models (LLMs) generate text that is neither in the desired language, nor in a contextually appropriate language.<n>We introduce a novel metric, Language Confusion Entropy, designed to measure and quantify this confusion.
arXiv Detail & Related papers (2024-10-17T05:43:30Z) - Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
We propose Lens, a novel approach to enhance multilingual capabilities in large language models (LLMs)<n>Lens operates on two subspaces: the language-agnostic subspace, where it aligns target languages with the central language to inherit strong semantic representations, and the language-specific subspace, where it separates target and central languages to preserve linguistic specificity.<n>Lens significantly improves multilingual performance while maintaining the model's English proficiency, achieving better results with less computational cost compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z) - Assessing Code Generation with Intermediate Languages [6.999311675957218]
This study explores the utilization of intermediate languages, including various programming languages, natural language solutions, and pseudo-code.
Our findings reveal that intermediate languages generally exhibit greater efficacy in larger models that have not yet achieved state-of-the-art performance.
arXiv Detail & Related papers (2024-07-07T15:35:41Z) - Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly-created English and multilingual prompts.<n>We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.<n>We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z) - Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that LLMs instruction-tuned on the question translation data (i.e. without annotated answers) are able to encourage the alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.