Linguistic Neuron Overlap Patterns to Facilitate Cross-lingual Transfer on Low-resource Languages
- URL: http://arxiv.org/abs/2508.17078v2
- Date: Tue, 23 Sep 2025 14:02:49 GMT
- Title: Linguistic Neuron Overlap Patterns to Facilitate Cross-lingual Transfer on Low-resource Languages
- Authors: Yuemei Xu, Kexin Xu, Jian Zhou, Ling Hu, Lin Gui
- Abstract summary: We propose a simple yet effective method, namely BridgeX-ICL, to improve zero-shot Cross-lingual In-Context Learning. Unlike existing works focusing on language-specific neurons, BridgeX-ICL explores whether sharing neurons can improve cross-lingual performance. We propose an HSIC-based metric to quantify LLMs' internal linguistic spectrum based on overlapping neurons, guiding optimal bridge selection.
- Score: 13.053383340899067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current Large Language Models (LLMs) face significant challenges in improving their performance on low-resource languages and urgently need data-efficient methods that avoid costly fine-tuning. From a language-bridge perspective, we propose a simple yet effective method, namely BridgeX-ICL, to improve zero-shot Cross-lingual In-Context Learning (X-ICL) for low-resource languages. Unlike existing works focusing on language-specific neurons, BridgeX-ICL explores whether sharing neurons can improve cross-lingual performance in LLMs. We construct neuron probe data from the ground-truth MUSE bilingual dictionaries and accordingly define a subset of language-overlap neurons to ensure full activation of these anchored neurons. We then propose an HSIC-based metric to quantify LLMs' internal linguistic spectrum based on overlapping neurons, guiding optimal bridge selection. Experiments on 4 cross-lingual tasks and 15 language pairs from 7 diverse families, covering both high-low and moderate-low pairs, validate the effectiveness of BridgeX-ICL and offer empirical insights into the underlying multilingual mechanisms of LLMs. The code is publicly available at https://github.com/xuyuemei/BridgeX-ICL.
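As a rough illustration only (not the paper's actual implementation, whose kernel choice and estimator are not specified here), the HSIC-based similarity between two languages' neuron activations could be sketched with a standard biased HSIC estimator over RBF kernels; the kernel bandwidth `sigma` and the activation matrices are assumptions:

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    # Gaussian (RBF) kernel matrix from pairwise squared distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased HSIC estimator: trace(K H L H) / (n - 1)^2.

    X, Y: (n_samples, n_neurons) activation matrices for two languages,
    row-aligned on the same probe inputs (e.g. bilingual dictionary pairs).
    Larger values indicate stronger statistical dependence.
    """
    n = X.shape[0]
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H) / (n - 1) ** 2)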
Related papers
- Unveiling the Influence of Amplifying Language-Specific Neurons [11.19692440351977]
Language-specific neurons, which strongly correlate with individual languages, have been shown to influence model behavior when deactivated. This work investigates the effect of amplifying language-specific neurons through interventions across 18 languages.
arXiv Detail & Related papers (2025-07-30T11:23:30Z) - From Neurons to Semantics: Evaluating Cross-Linguistic Alignment Capabilities of Large Language Models via Neurons Alignment [8.071522960337298]
Existing alignment benchmarks primarily focus on sentence embeddings. Prior research has shown that neural models tend to induce a non-smooth representation space. Inspired by neuroscientific findings that similar information activates overlapping neuronal regions, we propose a novel Neuron State-Based Cross-Lingual Alignment.
arXiv Detail & Related papers (2025-07-20T10:23:22Z) - The Emergence of Abstract Thought in Large Language Models Beyond Any Language [95.50197866832772]
Large language models (LLMs) function effectively across a diverse range of languages. Preliminary studies observe that the hidden activations of LLMs often resemble English, even when responding to non-English prompts. Recent results show strong multilingual performance, even surpassing English performance on specific tasks in other languages.
arXiv Detail & Related papers (2025-06-11T16:00:54Z) - How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective [64.79894853375478]
We propose a new finer-grained neuron identification algorithm, which detects language neurons (including language-specific neurons and language-related neurons) and language-agnostic neurons. Based on the distributional characteristics of different types of neurons, we divide the LLMs' internal process for multilingual inference into four parts. We systematically analyze the models before and after alignment, with a focus on different types of neurons.
arXiv Detail & Related papers (2025-05-27T17:59:52Z) - Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models [55.14276067678253]
This paper introduces a novel methodology for efficiently identifying inherent cross-lingual weaknesses in Large Language Models (LLMs). We construct a new dataset of over 6,000 bilingual pairs across 16 languages using this methodology, demonstrating its effectiveness in revealing weaknesses even in state-of-the-art models. Further experiments investigate the relationship between linguistic similarity and cross-lingual weaknesses, revealing that linguistically related languages share similar performance patterns.
arXiv Detail & Related papers (2025-05-24T12:31:27Z) - Language-specific Neurons Do Not Facilitate Cross-Lingual Transfer [21.205821852762362]
We examine whether existing techniques to identify language-specific neurons can be leveraged to enhance cross-lingual task performance for low-resource languages. We find that such neuron-specific interventions are insufficient to yield cross-lingual improvements on downstream tasks.
arXiv Detail & Related papers (2025-03-21T18:08:11Z) - Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary of a source text in a different target language. Currently, instruction-tuned large language models (LLMs) excel at various English tasks. Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even in few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z) - Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
arXiv Detail & Related papers (2024-06-13T16:04:11Z) - Cross-Lingual Transfer Robustness to Lower-Resource Languages on Adversarial Datasets [4.653113033432781]
We investigate the cross-lingual transfer capabilities of Multilingual Language Models (MLLMs) on adversarial datasets.
Our research provides valuable insights into cross-lingual transfer and its implications for NLP applications.
arXiv Detail & Related papers (2024-03-29T08:47:15Z) - How do Large Language Models Handle Multilingualism? [81.15060972112563]
This study explores how large language models (LLMs) handle multilingualism.
LLMs initially understand the query, converting multilingual inputs into English for task-solving.
In the intermediate layers, they employ English for thinking and incorporate multilingual knowledge with self-attention and feed-forward structures.
arXiv Detail & Related papers (2024-02-29T02:55:26Z) - Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs.
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
arXiv Detail & Related papers (2024-02-26T09:36:05Z)
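The LAPE idea above (entropy of a neuron's activation probability across languages; low entropy suggests a language-specific neuron) can be sketched as follows. This is a minimal reading of the method, with the activation-probability estimation and any thresholding details assumed rather than taken from the paper:

```python
import numpy as np

def lape(act_prob):
    """Language activation probability entropy per neuron.

    act_prob: (n_neurons, n_languages) array, where act_prob[i, j] is the
    probability that neuron i activates on inputs in language j.
    Lower entropy means activation concentrates on fewer languages,
    marking the neuron as a candidate language-specific neuron.
    """
    eps = 1e-12  # avoid log(0) for languages that never activate the neuron
    p = act_prob / (act_prob.sum(axis=1, keepdims=True) + eps)
    return -np.sum(p * np.log(p + eps), axis=1)
```

For example, a neuron that fires only for one language gets near-zero entropy, while one that fires equally for all languages gets the maximum entropy log(n_languages).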
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.