Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation
- URL: http://arxiv.org/abs/2507.22608v1
- Date: Wed, 30 Jul 2025 12:23:39 GMT
- Title: Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation
- Authors: Daniil Gurgurov, Katharina Trinley, Yusser Al Ghussin, Tanja Baeumel, Josef van Genabith, Simon Ostermann
- Abstract summary: We analyze language-specific neurons in Llama-3.1-8B, Mistral-Nemo-12B, and Aya-Expanse-8B & 32B across 21 typologically diverse languages. We show that these neurons cluster in deeper layers, with non-Latin scripts showing greater specialization. We steer models to deactivate unwanted languages and activate desired ones, outperforming simpler replacement approaches.
- Score: 9.2747149495273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) exhibit strong multilingual abilities, yet the neural mechanisms behind language-specific processing remain unclear. We analyze language-specific neurons in Llama-3.1-8B, Mistral-Nemo-12B, and Aya-Expanse-8B & 32B across 21 typologically diverse languages, identifying neurons that control language behavior. Using the Language Activation Probability Entropy (LAPE) method, we show that these neurons cluster in deeper layers, with non-Latin scripts showing greater specialization. Related languages share overlapping neurons, reflecting internal representations of linguistic proximity. Through language arithmetics, i.e. systematic activation addition and multiplication, we steer models to deactivate unwanted languages and activate desired ones, outperforming simpler replacement approaches. These interventions effectively guide behavior across five multilingual tasks: language forcing, translation, QA, comprehension, and NLI. Manipulation is more successful for high-resource languages, while typological similarity improves effectiveness. We also demonstrate that cross-lingual neuron steering enhances downstream performance and reveal internal "fallback" mechanisms for language selection when neurons are progressively deactivated. Our code is made publicly available at https://github.com/d-gurgurov/Language-Neurons-Manipulation.
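The abstract does not spell out the computation, but a rough, self-contained sketch of LAPE-style neuron scoring followed by additive/multiplicative steering might look like the following. It uses toy random activations in place of real Llama/Mistral/Aya hidden states; the array shapes, thresholds, and the `alpha`/`beta` coefficients are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch: LAPE-style language-neuron identification + "language arithmetics".
# All numbers below are placeholders for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_langs, n_neurons, n_tokens = 4, 1024, 2000

# Toy stand-in for FFN activations: acts[l, t, j] is the value of neuron j
# on token t of language l's corpus.
acts = rng.normal(size=(n_langs, n_tokens, n_neurons))

# p[l, j]: probability that neuron j fires (activation > 0) on language l.
p = (acts > 0).mean(axis=1)                                   # (n_langs, n_neurons)

# Normalize over languages and take the entropy of that distribution (LAPE).
q = p / np.clip(p.sum(axis=0, keepdims=True), 1e-12, None)
lape = -(q * np.log(np.clip(q, 1e-12, None))).sum(axis=0)     # (n_neurons,)

# Low entropy = the neuron fires for few languages -> candidate language neuron.
candidates = np.argsort(lape)[: int(0.01 * n_neurons)]        # bottom 1% by entropy

# Assign each candidate to the languages where it fires close to its maximum rate.
lang_neurons = {l: [j for j in candidates if p[l, j] > 0.95 * p[:, j].max()]
                for l in range(n_langs)}

# "Language arithmetics" on a hidden state h: multiply the unwanted language's
# neurons toward zero and add a constant boost to the desired language's neurons.
def steer(h, src, tgt, alpha=0.0, beta=5.0):
    h = h.copy()
    h[lang_neurons[src]] *= alpha   # deactivate source-language neurons
    h[lang_neurons[tgt]] += beta    # activate target-language neurons
    return h

h_steered = steer(rng.normal(size=n_neurons), src=0, tgt=1)
```

Only the two-stage idea, entropy over per-language firing probabilities for identification and addition/multiplication of activations for steering, is taken from the abstract; everything else here is a stand-in.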
Related papers
- Unveiling the Influence of Amplifying Language-Specific Neurons [11.19692440351977]
Deactivating language-specific neurons that strongly correlate with individual languages has been shown to influence model behavior. This work investigates the effect of amplifying language-specific neurons through interventions across 18 languages.
arXiv Detail & Related papers (2025-07-30T11:23:30Z)
- The Emergence of Abstract Thought in Large Language Models Beyond Any Language [95.50197866832772]
Large language models (LLMs) function effectively across a diverse range of languages. Preliminary studies observe that the hidden activations of LLMs often resemble English, even when responding to non-English prompts. Recent results show strong multilingual performance, even surpassing English performance on specific tasks in other languages.
arXiv Detail & Related papers (2025-06-11T16:00:54Z)
- How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective [64.79894853375478]
We propose a new finer-grained neuron identification algorithm, which detects language neurons (including language-specific neurons and language-related neurons) and language-agnostic neurons. Based on the distributional characteristics of different types of neurons, we divide the LLMs' internal process for multilingual inference into four parts. We systematically analyze the models before and after alignment with a focus on different types of neurons.
arXiv Detail & Related papers (2025-05-27T17:59:52Z)
- Language-specific Neurons Do Not Facilitate Cross-Lingual Transfer [21.205821852762362]
Existing techniques to identify language-specific neurons have been proposed as a way to enhance cross-lingual task performance of low-resource languages. We find that such neuron-specific interventions are insufficient to yield cross-lingual improvements on downstream tasks.
arXiv Detail & Related papers (2025-03-21T18:08:11Z)
- Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
arXiv Detail & Related papers (2024-06-13T16:04:11Z)
- On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons [37.32174349956148]
We analyze the neuron-level internal behavior of multilingual decoder-based pre-trained language models (PLMs).
We show that language-specific neurons are unique, with a slight overlap (< 5%) between languages.
We tamper with less than 1% of the total neurons in each model during inference and demonstrate that intervening on just a few language-specific neurons drastically changes the probability of the target language occurring in generated text; a minimal sketch of this kind of inference-time intervention appears after this list.
arXiv Detail & Related papers (2024-04-03T03:37:22Z)
- Revealing the Parallel Multilingual Learning within Large Language Models [50.098518799536144]
In this study, we reveal an in-context learning capability of multilingual large language models (LLMs). By translating the input to several languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which significantly enhances their comprehension abilities.
arXiv Detail & Related papers (2024-03-14T03:33:46Z)
- How do Large Language Models Handle Multilingualism? [81.15060972112563]
This study explores how large language models (LLMs) handle multilingualism.
LLMs initially understand the query, converting multilingual inputs into English for task-solving.
In the intermediate layers, they employ English for thinking and incorporate multilingual knowledge with self-attention and feed-forward structures.
arXiv Detail & Related papers (2024-02-29T02:55:26Z)
- Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models [84.86942006830772]
We conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar.
We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe.
arXiv Detail & Related papers (2022-05-04T12:22:31Z)
- SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection [81.85463892070085]
The SIGMORPHON 2020 task on morphological reinflection aims to investigate systems' ability to generalize across typologically distinct languages.
Systems were developed using data from 45 languages and just 5 language families, fine-tuned with data from an additional 45 languages and 10 language families (13 in total), and evaluated on all 90 languages.
arXiv Detail & Related papers (2020-06-20T13:24:14Z)
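The entry on finding and controlling language-specific neurons describes tampering with a small fraction of neurons during inference, and the main paper steers activations in a similar spirit. A hypothetical PyTorch sketch of such an intervention, using a forward hook on a toy MLP, is shown below; the model, the chosen neuron indices, and the clamp value are placeholders rather than details from any of the papers.

```python
# Hypothetical illustration: override a handful of "language-specific" hidden
# units at inference time via a forward hook. Toy model, not a real LLM layer.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 16))

# Pretend these few hidden units were identified as specific to one language.
target_neurons = torch.tensor([3, 17, 42])
clamp_value = 8.0   # assumed fixed activation used to force the language "on"

def force_language(module, inputs, output):
    # Runs on the activation output of the MLP; overwrite the selected units.
    output[..., target_neurons] = clamp_value
    return output

handle = model[1].register_forward_hook(force_language)
x = torch.randn(2, 16)
y_forced = model(x)      # forward pass with the intervention applied
handle.remove()
y_plain = model(x)       # the same input without the intervention
```

Setting the selected units to zero instead of `clamp_value` would correspond to deactivating a language rather than forcing it, matching the deactivation/activation distinction drawn in the abstracts above.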