Steering Large Reasoning Models towards Concise Reasoning via Flow Matching
- URL: http://arxiv.org/abs/2602.05539v1
- Date: Thu, 05 Feb 2026 10:56:13 GMT
- Title: Steering Large Reasoning Models towards Concise Reasoning via Flow Matching
- Authors: Yawei Li, Benjamin Bergner, Yinghan Zhao, Vihang Prakash Patil, Bei Chen, Cheng Wang
- Abstract summary: FlowSteer is a nonlinear steering method that learns a complete transformation between the distributions associated with verbose and concise reasoning. Our work demonstrates that modeling the full distributional transport with generative techniques offers a more effective and principled foundation for controlling LRMs.
- Score: 18.79674738541318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Reasoning Models (LRMs) excel at complex reasoning tasks, but their efficiency is often hampered by overly verbose outputs. Prior steering methods attempt to address this issue by applying a single, global vector to hidden representations -- an approach grounded in the restrictive linear representation hypothesis. In this work, we introduce FlowSteer, a nonlinear steering method that goes beyond uniform linear shifts by learning a complete transformation between the distributions associated with verbose and concise reasoning. This transformation is learned via Flow Matching as a velocity field, enabling precise, input-dependent control over the model's reasoning process. By aligning steered representations with the distribution of concise-reasoning activations, FlowSteer yields more compact reasoning than the linear shifts. Across diverse reasoning benchmarks, FlowSteer demonstrates strong task performance and token efficiency compared to leading inference-time baselines. Our work demonstrates that modeling the full distributional transport with generative techniques offers a more effective and principled foundation for controlling LRMs.
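The core mechanism described in the abstract, learning a velocity field via Flow Matching that transports samples from a "verbose" activation distribution to a "concise" one, can be illustrated with a minimal toy sketch. The code below is an assumption-laden stand-in, not the paper's implementation: it uses synthetic Gaussian clouds in place of real LRM hidden states and a simple affine velocity model `v(x, t) = x @ W + t*b + c` in place of whatever network FlowSteer trains. It shows the two pieces the abstract names: the conditional flow-matching regression (predicting `x1 - x0` along linear interpolation paths) and the inference-time steering step (Euler integration of the learned field).

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 2048

# Hypothetical stand-ins for hidden-state samples: "verbose" activations x0
# and "concise" activations x1 (the paper operates on real LRM activations).
x0 = rng.normal(0.0, 1.0, size=(n, d))   # verbose-side distribution
x1 = rng.normal(3.0, 0.5, size=(n, d))   # concise-side distribution

# Toy affine velocity field v(x, t) = x @ W + t * b + c.
W = np.zeros((d, d))
b = np.zeros(d)
c = np.zeros(d)
lr = 0.05

for _ in range(2000):
    t = rng.uniform(size=(n, 1))
    xt = (1.0 - t) * x0 + t * x1         # linear interpolation path
    target = x1 - x0                     # conditional FM velocity target
    err = xt @ W + t * b + c - target    # regression residual
    # Gradient-descent updates for the squared-error objective.
    W -= lr * xt.T @ err / n
    b -= lr * (t * err).mean(axis=0)
    c -= lr * err.mean(axis=0)

def steer(x, steps=50):
    """Euler-integrate the learned field to move x toward the concise side."""
    dt = 1.0 / steps
    for k in range(steps):
        t = np.full((len(x), 1), k * dt)
        x = x + dt * (x @ W + t * b + c)
    return x

steered = steer(rng.normal(0.0, 1.0, size=(256, d)))
```

Because the field is a function of both the input point and the time step, the correction each sample receives depends on where it sits, which is the input-dependent behavior the abstract contrasts with a single global steering vector.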
Related papers
- Disentangled Representation Learning via Flow Matching [48.12507436294143]
Disentangled representation learning aims to capture the underlying explanatory factors of observed data. Existing diffusion-based methods encourage factor independence via inductive biases, yet frequently lack strong semantic alignment. We propose a flow matching-based framework for disentangled representation learning, which casts disentanglement as learning factor-conditioned flows in a compact latent space.
arXiv Detail & Related papers (2026-02-05T02:14:36Z) - Guided Verifier: Collaborative Multimodal Reasoning via Dynamic Process Supervision [11.159231524113764]
Reinforcement Learning (RL) has emerged as a pivotal mechanism for enhancing the complex reasoning capabilities of Multimodal Large Language Models (MLLMs). In this paper, we propose the Guided Verifier framework to address these structural limitations. We develop a specialized data synthesis pipeline targeting multimodal hallucinations, constructing the CoRe dataset of process-level negatives and Correct-guide Reasoning trajectories to train the guided verifier.
arXiv Detail & Related papers (2026-02-04T07:38:42Z) - Directional Reasoning Injection for Fine-Tuning MLLMs [51.53222423215055]
Multimodal large language models (MLLMs) are rapidly advancing, yet their reasoning ability often lags behind that of strong text-only counterparts. Existing methods to bridge this gap rely on supervised fine-tuning over large-scale multimodal reasoning data or reinforcement learning. We propose Directional Reasoning Injection for Fine-Tuning (DRIFT) to solve this problem.
arXiv Detail & Related papers (2025-10-16T18:06:46Z) - Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching [36.348940136801296]
A novel guidance framework for discrete data is proposed to address this problem. We derive the exact transition rate for the desired distribution given a learned discrete flow matching model. We demonstrate the effectiveness of our proposed guidance on energy-guided simulations and preference alignment on text-to-image generation and multimodal understanding tasks.
arXiv Detail & Related papers (2025-09-26T05:51:31Z) - KV Cache Steering for Controlling Frozen LLMs [80.50365534625438]
Cache steering is a lightweight method for implicit steering of language models. We apply cache steering to induce chain-of-thought reasoning in small language models.
arXiv Detail & Related papers (2025-07-11T17:59:36Z) - Adding Conditional Control to Diffusion Models with Reinforcement Learning [68.06591097066811]
Diffusion models are powerful generative models that allow for precise control over the characteristics of the generated samples. While these diffusion models trained on large datasets have achieved success, there is often a need to introduce additional controls in downstream fine-tuning processes. This work presents a novel method based on reinforcement learning (RL) to add such controls using an offline dataset.
arXiv Detail & Related papers (2024-06-17T22:00:26Z) - Optimal Flow Matching: Learning Straight Trajectories in Just One Step [89.37027530300617]
We develop and theoretically justify the novel Optimal Flow Matching (OFM) approach.
It recovers the straight OT displacement for the quadratic transport cost in just one FM step.
The main idea of our approach is to employ vector fields for FM that are parameterized by convex functions.
arXiv Detail & Related papers (2024-03-19T19:44:54Z) - Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, achieving a speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z) - Flowformer: Linearizing Transformers with Conservation Flows [77.25101425464773]
We linearize Transformers, free from specific inductive biases, based on flow network theory.
By respectively conserving the incoming flow of sinks for source competition and the outgoing flow of sources for sink allocation, Flow-Attention inherently generates informative attentions.
arXiv Detail & Related papers (2022-02-13T08:44:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.