InjectRBP: Steering Large Language Model Reasoning Behavior via Pattern Injection
- URL: http://arxiv.org/abs/2602.12013v1
- Date: Thu, 12 Feb 2026 14:44:40 GMT
- Title: InjectRBP: Steering Large Language Model Reasoning Behavior via Pattern Injection
- Authors: Xiuping Wu, Zhao Yu, Yuxin Cheng, Ngai Wong, Liangjun Ke, Tapas Mishra, Konstantinos V. Katsikopoulos,
- Abstract summary: Reasoning can significantly enhance the performance of Large Language Models.<n>We investigate how models' reasoning behaviors shape reasoning from the perspective of behavioral patterns.<n>We propose two optimization methods that require no parameter updates: InjectCorrect and InjectRLOpt.
- Score: 12.65760977924031
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reasoning can significantly enhance the performance of Large Language Models. While recent studies have exploited behavior-related prompts adjustment to enhance reasoning, these designs remain largely intuitive and lack a systematic analysis of the underlying behavioral patterns. Motivated by this, we investigate how models' reasoning behaviors shape reasoning from the perspective of behavioral patterns. We observe that models exhibit adaptive distributions of reasoning behaviors when responding to specific types of questions, and that structurally injecting these patterns can substantially influence the quality of the models' reasoning processes and outcomes. Building on these findings, we propose two optimization methods that require no parameter updates: InjectCorrect and InjectRLOpt. InjectCorrect guides the model by imitating behavioral patterns derived from its own past correct answers. InjectRLOpt learns a value function from historical behavior-pattern data and, via our proposed Reliability-Aware Softmax Policy, generates behavioral injectant during inference to steer the reasoning process. Our experiments demonstrate that both methods can improve model performance across various reasoning tasks without requiring any modifications to model parameters, achieving gains of up to 5.34% and 8.67%, respectively.
Related papers
- Learning Structured Reasoning via Tractable Trajectory Control [99.75278337895024]
Ctrl-R is a framework for learning structured reasoning via tractable trajectory control.<n>We show that Ctrl-R enables effective exploration and internalization of previously unattainable reasoning patterns.
arXiv Detail & Related papers (2026-03-02T09:18:19Z) - Understanding Reasoning in Thinking Language Models via Steering Vectors [29.101045366810652]
We analyze and manipulate specific reasoning behaviors in DeepSeek-R1-Distill models.<n>We demonstrate that these behaviors are mediated by linear directions in the model's activation space and can be controlled using steering vectors.<n>Our approach offers practical tools for steering reasoning processes in thinking models in a controlled and interpretable manner.
arXiv Detail & Related papers (2025-06-22T20:45:26Z) - Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors [61.92704516732144]
We show that the most robust features for correctness prediction are those that play a distinctive causal role in the model's behavior.<n>We propose two methods that leverage causal mechanisms to predict the correctness of model outputs.
arXiv Detail & Related papers (2025-05-17T00:31:39Z) - Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability [16.441081996257576]
We propose leveraging reasoning-intensive models to improve less computationally demanding, non-reasoning models.<n>We demonstrate consistent improvements across various benchmarks, underscoring the potential of this approach for advancing the ability of models to answer questions directly.
arXiv Detail & Related papers (2025-04-13T16:26:56Z) - Towards Understanding Distilled Reasoning Models: A Representational Approach [6.563993791037387]
We train a crosscoder on Qwen-series models and their fine-tuned variants.<n>Our results suggest that the crosscoder learns features corresponding to various types of reasoning, including self-reflection and verification.
arXiv Detail & Related papers (2025-03-05T18:40:19Z) - On the Reasoning Capacity of AI Models and How to Quantify It [0.0]
Large Language Models (LLMs) have intensified the debate surrounding the fundamental nature of their reasoning capabilities.<n>While achieving high performance on benchmarks such as GPQA and MMLU, these models exhibit limitations in more complex reasoning tasks.<n>We propose a novel phenomenological approach that goes beyond traditional accuracy metrics to probe the underlying mechanisms of model behavior.
arXiv Detail & Related papers (2025-01-23T16:58:18Z) - Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.<n>Yet their widespread adoption poses challenges regarding data attribution and interpretability.<n>We develop an influence functions framework to address these challenges.
arXiv Detail & Related papers (2024-10-17T17:59:02Z) - Representation Surgery: Theory and Practice of Affine Steering [72.61363182652853]
Language models often exhibit undesirable behavior, e.g., generating toxic or gender-biased text.<n>One natural (and common) approach to prevent the model from exhibiting undesirable behavior is to steer the model's representations.<n>This paper investigates the formal and empirical properties of steering functions.
arXiv Detail & Related papers (2024-02-15T00:20:30Z) - Inverse Decision Modeling: Learning Interpretable Representations of
Behavior [72.80902932543474]
We develop an expressive, unifying perspective on inverse decision modeling.
We use this to formalize the inverse problem (as a descriptive model)
We illustrate how this structure enables learning (interpretable) representations of (bounded) rationality.
arXiv Detail & Related papers (2023-10-28T05:05:01Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.