Language steering in latent space to mitigate unintended code-switching
- URL: http://arxiv.org/abs/2510.13849v1
- Date: Sat, 11 Oct 2025 19:49:38 GMT
- Title: Language steering in latent space to mitigate unintended code-switching
- Authors: Andrey Goncharov, Nikolai Kondusov, Alexey Zaytsev
- Abstract summary: Large Language Models (LLMs) often exhibit unintended code-switching, reducing reliability in downstream tasks. We propose latent-space language steering, a lightweight inference-time method that identifies language directions via PCA on parallel translations. Our approach mitigates code-switching while preserving semantics with negligible computational overhead.
- Score: 1.1330938617817454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multilingual Large Language Models (LLMs) often exhibit unintended code-switching, reducing reliability in downstream tasks. We propose latent-space language steering, a lightweight inference-time method that identifies language directions via PCA on parallel translations and steers token embeddings along these axes to control language identity. Our approach mitigates code-switching while preserving semantics with negligible computational overhead and requires only minimal parallel data for calibration. Empirically, we achieve 95-99% language classification accuracy using a single principal component and reduce next-token distributional divergence by up to 42% across multiple language pairs on Qwen2.5 and Llama-3.2 models. We further analyze the layer-wise evolution of language representations, revealing that language identity concentrates in final layers with near-perfect linear separability.
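The recipe in the abstract (PCA over parallel translations to find a language axis, then shifting hidden states along that axis) can be illustrated with a short sketch. The snippet below is an assumption-based illustration, not the authors' released code: the helper names, the layer choice, and the steering coefficient `alpha` are all hypothetical.

```python
# Minimal sketch of latent-space language steering, assuming a Hugging Face
# causal LM (e.g. Qwen2.5 or Llama-3.2) and a small set of parallel
# translations. Function names, layer choice, and alpha are illustrative.
import torch
from sklearn.decomposition import PCA
from transformers import AutoModelForCausalLM, AutoTokenizer

def mean_hidden_state(model, tokenizer, text, layer=-1):
    """Average the hidden states of one sentence at a chosen layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer][0].mean(dim=0)

def language_direction(model, tokenizer, parallel_pairs, layer=-1):
    """Fit PCA on hidden states of parallel translations and take the
    first principal component as the language axis."""
    feats = []
    for src, tgt in parallel_pairs:  # e.g. (English sentence, its translation)
        feats.append(mean_hidden_state(model, tokenizer, src, layer).float().numpy())
        feats.append(mean_hidden_state(model, tokenizer, tgt, layer).float().numpy())
    pca = PCA(n_components=1).fit(feats)
    direction = torch.tensor(pca.components_[0], dtype=torch.float32)
    return direction / direction.norm()

def classify_language(sentence_repr, direction, threshold=0.0):
    """Single-component language classifier: sign of the projection onto
    the language axis (the threshold is calibrated on the parallel data)."""
    return float(sentence_repr @ direction) > threshold

def steer(hidden_states, direction, alpha=2.0):
    """Shift hidden states along the language axis to push generation
    toward the desired language; alpha controls steering strength."""
    return hidden_states + alpha * direction
```

At generation time the steering vector would typically be added to the residual stream of the chosen layer (for example via a forward hook), with the sign and magnitude of `alpha` calibrated on the same small parallel set used for the PCA fit.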
Related papers
- CLaS-Bench: A Cross-Lingual Alignment and Steering Benchmark [21.574271160875046]
We introduce CLaS-Bench, a benchmark for evaluating language-forcing behavior in large language models (LLMs) across 32 languages. We find that, across languages, the simple residual-based DiffMean method consistently outperforms all other methods.
arXiv Detail & Related papers (2026-01-13T08:42:03Z) - Languages are Modalities: Cross-Lingual Alignment via Encoder Injection [0.8461674097042394]
We present a compute-efficient language-as-modality method that conditions an instruction-tuned decoder without changing the tokenizer or retraining the decoder. LLINK substantially improves bilingual retrieval and achieves 81.3% preference over the base model. We find that the improvements can be attributed to reduced tokenization inflation and stronger cross-lingual alignment.
arXiv Detail & Related papers (2025-10-31T07:43:21Z) - Evaluating Multilingual and Code-Switched Alignment in LLMs via Synthetic Natural Language Inference [2.172419551358714]
Large language models (LLMs) are increasingly applied in multilingual contexts, yet their capacity for consistent, logically grounded alignment across languages remains underexplored. We present a framework for multilingual natural language inference that generates synthetic, logic-based premise-hypothesis pairs and translates them into a typologically diverse set of languages. Code-switching does not degrade, and can even improve, performance, suggesting that translation-induced lexical variation may serve as a regularization signal.
arXiv Detail & Related papers (2025-08-20T14:30:34Z) - Causal Language Control in Multilingual Transformers via Sparse Feature Steering [7.754609745940422]
We investigate whether sparse autoencoder features can be leveraged to steer the generated language of multilingual language models. We achieve controlled language shifts with up to 90% success, as measured by FastText language classification. Our analysis reveals that language steering is most effective in mid-to-late transformer layers.
arXiv Detail & Related papers (2025-07-17T06:49:16Z) - Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models [56.61984030508691]
We present the first mechanistic interpretability study of language confusion. We show that confusion points (CPs) are central to this phenomenon and that editing a small set of critical neurons, identified via comparative analysis with a multilingual-tuned counterpart, substantially mitigates confusion.
arXiv Detail & Related papers (2025-05-22T11:29:17Z) - Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models [49.16690802656554]
We find that multilingual models struggle to provide consistent factual responses to semantically equivalent prompts in different languages. We propose a linear shortcut method that bypasses computations in the final layers, enhancing both prediction accuracy and cross-lingual consistency.
arXiv Detail & Related papers (2025-04-05T19:43:10Z) - Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training [58.696660064190475]
We find that code-switching, i.e., alternating between different languages within a context, is key to multilingual capabilities. To better explore the power of code-switching for language alignment during pre-training, we investigate the strategy of synthetic code-switching.
arXiv Detail & Related papers (2025-04-02T15:09:58Z) - VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, the sequence-to-sequence alignment is induced to maximize the similarity of the parallel pairs and minimize the non-parallel pairs.
Token-to-token alignment is integrated to bridge the gap between synonymous tokens, excavated via a thesaurus dictionary, and the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z) - Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks.
Despite this great success, we make the empirical observation that there is a training-objective gap between the pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for xSL named Cross-lingual Language Informative Span Masking (CLISM) to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage the consistency between representations of input parallel sequences.
arXiv Detail & Related papers (2022-04-11T15:55:20Z) - Reducing language context confusion for end-to-end code-switching automatic speech recognition [50.89821865949395]
We propose a language-related attention mechanism to reduce multilingual context confusion for the E2E code-switching ASR model.
By calculating the respective attention of multiple languages, our method can efficiently transfer language knowledge from rich monolingual data.
arXiv Detail & Related papers (2022-01-28T14:39:29Z) - Inducing Language-Agnostic Multilingual Representations [61.97381112847459]
Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world.
We examine three approaches for this: (i) re-aligning the vector spaces of target languages to a pivot source language; (ii) removing language-specific means and variances, which yields better discriminativeness of embeddings as a by-product; and (iii) increasing input similarity across languages by removing morphological contractions and sentence reordering.
arXiv Detail & Related papers (2020-08-20T17:58:56Z)
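As a side note on the last entry above, approach (ii) there (removing language-specific means and variances) corresponds to a standard per-language standardization of embeddings. The sketch below illustrates that generic technique under stated assumptions; it is not that paper's implementation, and the variable names are hypothetical.

```python
# Hedged sketch of language-specific mean/variance removal: standardize
# each language's embeddings with its own statistics so that the shared
# space becomes more language-agnostic. Generic illustration only.
import numpy as np

def remove_language_statistics(embeddings_by_lang):
    """embeddings_by_lang: dict mapping language code -> (n_i, d) array of
    sentence embeddings. Returns per-language standardized embeddings."""
    normalized = {}
    for lang, embs in embeddings_by_lang.items():
        mean = embs.mean(axis=0, keepdims=True)
        std = embs.std(axis=0, keepdims=True) + 1e-8  # avoid division by zero
        normalized[lang] = (embs - mean) / std
    return normalized

# Example usage with random stand-ins for real sentence embeddings:
rng = np.random.default_rng(0)
embs = {"en": rng.normal(0.5, 1.0, size=(100, 768)),
        "de": rng.normal(-0.3, 2.0, size=(100, 768))}
aligned = remove_language_statistics(embs)
```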
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.