Prototype-Based Dynamic Steering for Large Language Models
- URL: http://arxiv.org/abs/2510.05498v1
- Date: Tue, 07 Oct 2025 01:34:28 GMT
- Title: Prototype-Based Dynamic Steering for Large Language Models
- Authors: Ceyhun Efe Kayan, Li Zhang
- Abstract summary: Prototype-Based Dynamic Steering (PDS) is a test-time method that amplifies large language model (LLM) reasoning without adding or altering instructions. We introduce "reasoning prototypes" by clustering activation differences between Chain-of-Thought (CoT) and neutral prompts. PDS consistently improves accuracy without fine-tuning or prompt engineering.
- Score: 3.90727941420584
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Despite impressive breadth, LLMs still rely on explicit reasoning instructions or static, one-fits-all steering methods, leaving a gap for adaptive, instruction-free reasoning amplification. We present Prototype-Based Dynamic Steering (PDS), a test-time method that amplifies large language model (LLM) reasoning without adding or altering instructions. We introduce "reasoning prototypes" by clustering activation differences between Chain-of-Thought (CoT) and neutral prompts. At inference, an input's hidden state is projected onto these prototypes to form an instance-specific steering vector. Evaluated on GSM8K, AQuA-RAT, and BIG-Bench tasks, PDS consistently improves accuracy without fine-tuning or prompt engineering. Notably, the gains persist even when CoT is explicitly suppressed to improve cost-efficiency, indicating that the intervention strengthens latent reasoning processes rather than inducing a superficial behavioral shift. These results position dynamic, prototype-guided steering as a lightweight alternative to training-time approaches for enhancing LLM reasoning.
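The two steps the abstract describes (clustering CoT-minus-neutral activation differences into prototypes, then projecting an input's hidden state onto them) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the plain k-means routine, the unit-normalized dot-product projection, the additive update, the scale `alpha`, and all function names are assumptions made for illustration, and the hidden states here are synthetic stand-ins for real model activations.

```python
import numpy as np


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def build_prototypes(h_cot, h_neutral, k):
    """Cluster CoT-minus-neutral activation differences into k prototypes."""
    diffs = h_cot - h_neutral
    return kmeans(diffs, k)


def steering_vector(h, prototypes):
    """Project h onto each unit-normalized prototype and sum the projections.

    Assumption: the paper's exact weighting scheme may differ from this.
    """
    norms = np.linalg.norm(prototypes, axis=1, keepdims=True)
    units = prototypes / np.clip(norms, 1e-12, None)
    coeffs = units @ h      # (k,) one projection coefficient per prototype
    return coeffs @ units   # (d,) instance-specific steering vector


def steer(h, prototypes, alpha=1.0):
    """Add the scaled steering vector to the hidden state (hypothetical update)."""
    return h + alpha * steering_vector(h, prototypes)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n, k = 16, 200, 4
    h_neutral = rng.normal(size=(n, d))          # synthetic neutral-prompt states
    h_cot = h_neutral + rng.normal(size=(n, d))  # synthetic CoT-prompt states
    protos = build_prototypes(h_cot, h_neutral, k)
    h_new = rng.normal(size=d)
    print(steer(h_new, protos, alpha=0.5).shape)
```

Because the coefficients depend on the input `h`, two different inputs receive different steering vectors from the same prototype set, which is the contrast with a single static steering vector that the abstract draws.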
Related papers
- AMPS: Adaptive Modality Preference Steering via Functional Entropy [66.69992693275061]
We introduce an instance-aware diagnostic metric that quantifies each modality's information contribution and reveals sample-specific susceptibility to steering. Experimental results show that our instance-aware steering outperforms conventional steering in modulating modality preference.
arXiv Detail & Related papers (2026-02-13T02:29:06Z)
- Activation Steering for Masked Diffusion Language Models [1.0980666029958932]
Masked diffusion language models generate text through an iterative denoising process. We present an activation-steering framework for MDLMs that computes layer-wise steering vectors from a single forward pass. Experiments on LLaDA-8B-Instruct demonstrate reliable modulation of high-level attributes.
arXiv Detail & Related papers (2025-12-30T11:10:52Z)
- Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs [49.66344956133349]
Reasoning capacity shapes both inference-time performance and reinforcement learning (RL) training for large (vision-) language models. This paper proposes Reasoning Palette, a novel latent-modulation framework that endows the model with a latent variable for strategic contextualization.
arXiv Detail & Related papers (2025-12-19T03:32:53Z)
- In-Distribution Steering: Balancing Control and Coherence in Language Model Generation [0.0815557531820863]
We introduce In-Distribution Steering (IDS), a novel method that adapts steering strength based on the input data distribution in representation space. IDS achieves strong accuracy on classification tasks while producing coherent text without collapse, making IDS particularly well suited for real-world applications.
arXiv Detail & Related papers (2025-10-15T08:31:37Z)
- SSPO: Self-traced Step-wise Preference Optimization for Process Supervision and Reasoning Compression [15.87106741558898]
Post-training methods incur substantial computational overhead due to auxiliary models and overthinking. We propose Self-traced Step-wise Preference Optimization (SSPO), a pluggable RL process supervision framework. SSPO uses step-wise preference signals generated by the model itself to guide the optimization process for reasoning compression.
arXiv Detail & Related papers (2025-08-18T04:02:15Z)
- KV Cache Steering for Controlling Frozen LLMs [80.50365534625438]
Cache steering is a lightweight method for implicit steering of language models. We apply cache steering to induce chain-of-thought reasoning in small language models.
arXiv Detail & Related papers (2025-07-11T17:59:36Z)
- Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute [60.151643048803145]
We propose Fractional Reasoning, a framework that enables continuous control over reasoning intensity at inference time. Our method operates by extracting the latent steering vector associated with deeper reasoning and reapplying it with a tunable scaling factor. Experiments on GSM8K, MATH500, and GPQA demonstrate that Fractional Reasoning consistently improves performance across diverse reasoning tasks and models.
arXiv Detail & Related papers (2025-06-18T21:15:59Z)
- Think Beyond Size: Adaptive Prompting for More Effective Reasoning [0.0]
We introduce Adaptive Prompting, a dynamic and iterative framework designed to enhance reasoning by incorporating real-time adjustments to prompt structures and validation mechanisms. Results demonstrate that Adaptive Prompting significantly improves performance on diverse reasoning benchmarks, including arithmetic reasoning (GSM8K, MultiArith), logical reasoning, and commonsense tasks. Our approach enables smaller models to achieve competitive performance with larger counterparts, such as GPT-4, while maintaining computational efficiency.
arXiv Detail & Related papers (2024-10-10T17:14:36Z)
- Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs [63.36637269634553]
We introduce a novel approach where LLMs are fine-tuned to generate a sequence of Diverse Chains of Thought (DCoT) within a single inference step. We show that fine-tuning on DCoT improves performance over the CoT baseline across model families and scales. Our work is also significant because both quantitative analyses and manual evaluations reveal the observed gains stem from the models' ability to refine an initial reasoning chain.
arXiv Detail & Related papers (2024-07-03T15:01:18Z)
- InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance [56.184255657175335]
We develop InferAligner, a novel inference-time alignment method that utilizes cross-model guidance for harmlessness alignment.
Experimental results show that our method can be very effectively applied to domain-specific models in finance, medicine, and mathematics.
It significantly diminishes the Attack Success Rate (ASR) of both harmful instructions and jailbreak attacks, while maintaining almost unchanged performance in downstream tasks.
arXiv Detail & Related papers (2024-01-20T10:41:03Z)
- Steering Language Models With Activation Engineering [40.04138190785384]
We introduce activation engineering: the inference-time modification of activations in order to control (or steer) model outputs.
We achieve SOTA on negative-to-positive sentiment shift and detoxification using models including LLaMA-3 and OPT.
ActAdd yields inference-time control over high-level output properties (like topic and sentiment) while preserving performance on off-target tasks.
arXiv Detail & Related papers (2023-08-20T12:21:05Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)