Language steering in latent space to mitigate unintended code-switching
- URL: http://arxiv.org/abs/2510.13849v1
- Date: Sat, 11 Oct 2025 19:49:38 GMT
- Title: Language steering in latent space to mitigate unintended code-switching
- Authors: Andrey Goncharov, Nikolai Kondusov, Alexey Zaytsev
- Abstract summary: Large Language Models (LLMs) often exhibit unintended code-switching, reducing reliability in downstream tasks. We propose latent-space language steering, a lightweight inference-time method that identifies language directions via PCA on parallel translations. Our approach mitigates code-switching while preserving semantics with negligible computational overhead.
- Score: 1.1330938617817454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multilingual Large Language Models (LLMs) often exhibit unintended code-switching, reducing reliability in downstream tasks. We propose latent-space language steering, a lightweight inference-time method that identifies language directions via PCA on parallel translations and steers token embeddings along these axes to control language identity. Our approach mitigates code-switching while preserving semantics with negligible computational overhead and requires only minimal parallel data for calibration. Empirically, we achieve 95-99% language classification accuracy using a single principal component and reduce next-token distributional divergence by up to 42% across multiple language pairs on Qwen2.5 and Llama-3.2 models. We further analyze the layer-wise evolution of language representations, revealing that language identity concentrates in final layers with near-perfect linear separability.
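The recipe in the abstract (PCA over parallel translations to find a language axis, then shifting hidden states along that axis) can be illustrated with a short sketch. The snippet below is an assumption-based illustration, not the authors' released code: the helper names, the layer choice, and the steering coefficient `alpha` are all hypothetical.

```python
# Minimal sketch of latent-space language steering, assuming a Hugging Face
# causal LM (e.g. Qwen2.5 or Llama-3.2) and a small set of parallel
# translations. Function names, layer choice, and alpha are illustrative.
import torch
from sklearn.decomposition import PCA
from transformers import AutoModelForCausalLM, AutoTokenizer

def mean_hidden_state(model, tokenizer, text, layer=-1):
    """Average the hidden states of one sentence at a chosen layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer][0].mean(dim=0)

def language_direction(model, tokenizer, parallel_pairs, layer=-1):
    """Fit PCA on hidden states of parallel translations and take the
    first principal component as the language axis."""
    feats = []
    for src, tgt in parallel_pairs:  # e.g. (English sentence, its translation)
        feats.append(mean_hidden_state(model, tokenizer, src, layer).float().numpy())
        feats.append(mean_hidden_state(model, tokenizer, tgt, layer).float().numpy())
    pca = PCA(n_components=1).fit(feats)
    direction = torch.tensor(pca.components_[0], dtype=torch.float32)
    return direction / direction.norm()

def classify_language(sentence_repr, direction, threshold=0.0):
    """Single-component language classifier: sign of the projection onto
    the language axis (the threshold is calibrated on the parallel data)."""
    return float(sentence_repr @ direction) > threshold

def steer(hidden_states, direction, alpha=2.0):
    """Shift hidden states along the language axis to push generation
    toward the desired language; alpha controls steering strength."""
    return hidden_states + alpha * direction
```

At generation time the steering vector would typically be added to the residual stream of the chosen layer (for example via a forward hook), with the sign and magnitude of `alpha` calibrated on the same small parallel set used for the PCA fit.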
Related papers
- CLaS-Bench: A Cross-Lingual Alignment and Steering Benchmark [21.574271160875046]
We introduce CLaS-Bench, a benchmark for evaluating language-forcing behavior in large language models (LLMs) across 32 languages. We find that, across languages, the simple residual-based DiffMean method consistently outperforms all other methods.
arXiv Detail & Related papers (2026-01-13T08:42:03Z) - Languages are Modalities: Cross-Lingual Alignment via Encoder Injection [0.8461674097042394]
We present a compute-efficient language-as-modality method that conditions an instruction-tuned decoder without changing the tokenizer or retraining the decoder. LLINK substantially improves bilingual retrieval and achieves 81.3% preference over the base model. We find that the improvements can be attributed to reduced tokenization inflation and stronger cross-lingual alignment.
arXiv Detail & Related papers (2025-10-31T07:43:21Z) - Evaluating Multilingual and Code-Switched Alignment in LLMs via Synthetic Natural Language Inference [2.172419551358714]
Large language models (LLMs) are increasingly applied in multilingual contexts, yet their capacity for consistent, logically grounded alignment across languages remains underexplored. We present a framework for multilingual natural language inference that generates synthetic, logic-based premise-hypothesis pairs and translates them into a typologically diverse set of languages. Code-switching does not degrade, and can even improve, performance, suggesting that translation-induced lexical variation may serve as a regularization signal.
arXiv Detail & Related papers (2025-08-20T14:30:34Z) - Causal Language Control in Multilingual Transformers via Sparse Feature Steering [7.754609745940422]
We investigate whether sparse autoencoder features can be leveraged to steer the generated language of multilingual language models. We achieve controlled language shifts with up to 90% success, as measured by FastText language classification. Our analysis reveals that language steering is most effective in mid-to-late transformer layers.
arXiv Detail & Related papers (2025-07-17T06:49:16Z) - Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models [56.61984030508691]
We present the first mechanistic interpretability study of language confusion. We show that confusion points (CPs) are central to this phenomenon and that editing a small set of critical neurons, identified via comparative analysis with a multilingual-tuned counterpart, substantially mitigates confusion.
arXiv Detail & Related papers (2025-05-22T11:29:17Z) - Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models [49.16690802656554]
We find that multilingual models struggle to provide consistent factual responses to semantically equivalent prompts in different languages. We propose a linear shortcut method that bypasses computations in the final layers, enhancing both prediction accuracy and cross-lingual consistency.
arXiv Detail & Related papers (2025-04-05T19:43:10Z) - Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training [58.696660064190475]
We find that code-switching, i.e., alternating between different languages within a context, is key to multilingual capabilities. To better explore the power of code-switching for language alignment during pre-training, we investigate the strategy of synthetic code-switching.
arXiv Detail & Related papers (2025-04-02T15:09:58Z) - VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, the sequence-to-sequence alignment is induced to maximize the similarity of the parallel pairs and minimize the non-parallel pairs.
Token-to-token alignment is integrated to bridge the gap between synonymous tokens, excavated via a thesaurus dictionary, and the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z) - Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks.
Despite this great success, we make the empirical observation that there is a training-objective gap between the pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for xSL named Cross-lingual Language Informative Span Masking (CLISM) to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage the consistency between representations of input parallel sequences.
arXiv Detail & Related papers (2022-04-11T15:55:20Z) - Reducing language context confusion for end-to-end code-switching automatic speech recognition [50.89821865949395]
We propose a language-related attention mechanism to reduce multilingual context confusion for the E2E code-switching ASR model.
By calculating the respective attention of multiple languages, our method can efficiently transfer language knowledge from rich monolingual data.
arXiv Detail & Related papers (2022-01-28T14:39:29Z) - Inducing Language-Agnostic Multilingual Representations [61.97381112847459]
Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world.
We examine three approaches for this: (i) re-aligning the vector spaces of target languages to a pivot source language; (ii) removing language-specific means and variances, which yields better discriminativeness of embeddings as a by-product; and (iii) increasing input similarity across languages by removing morphological contractions and sentence reordering.
arXiv Detail & Related papers (2020-08-20T17:58:56Z)
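As a side note on the last entry above, approach (ii) there (removing language-specific means and variances) corresponds to a standard per-language standardization of embeddings. The sketch below illustrates that generic technique under stated assumptions; it is not that paper's implementation, and the variable names are hypothetical.

```python
# Hedged sketch of language-specific mean/variance removal: standardize
# each language's embeddings with its own statistics so that the shared
# space becomes more language-agnostic. Generic illustration only.
import numpy as np

def remove_language_statistics(embeddings_by_lang):
    """embeddings_by_lang: dict mapping language code -> (n_i, d) array of
    sentence embeddings. Returns per-language standardized embeddings."""
    normalized = {}
    for lang, embs in embeddings_by_lang.items():
        mean = embs.mean(axis=0, keepdims=True)
        std = embs.std(axis=0, keepdims=True) + 1e-8  # avoid division by zero
        normalized[lang] = (embs - mean) / std
    return normalized

# Example usage with random stand-ins for real sentence embeddings:
rng = np.random.default_rng(0)
embs = {"en": rng.normal(0.5, 1.0, size=(100, 768)),
        "de": rng.normal(-0.3, 2.0, size=(100, 768))}
aligned = remove_language_statistics(embs)
```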
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.