Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning
- URL: http://arxiv.org/abs/2506.02867v2
- Date: Wed, 04 Jun 2025 15:00:58 GMT
- Title: Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning
- Authors: Chen Qian, Dongrui Liu, Haochen Wen, Zhen Bai, Yong Liu, Jing Shao
- Abstract summary: Large reasoning models (LRMs) have demonstrated impressive capabilities in complex problem-solving, yet their internal reasoning mechanisms remain poorly understood. We observe an interesting MI peaks phenomenon: the mutual information (MI) at specific generative steps exhibits a sudden and significant increase during the LRM's reasoning process. These peaks often correspond to tokens expressing reflection or transition, which we term thinking tokens. We then demonstrate that these thinking tokens are crucial for the LRM's reasoning performance, while other tokens have minimal impact.
- Score: 33.040747962183076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large reasoning models (LRMs) have demonstrated impressive capabilities in complex problem-solving, yet their internal reasoning mechanisms remain poorly understood. In this paper, we investigate the reasoning trajectories of LRMs from an information-theoretic perspective. By tracking how the mutual information (MI) between intermediate representations and the correct answer evolves during LRM reasoning, we observe an interesting MI peaks phenomenon: the MI at specific generative steps exhibits a sudden and significant increase during the LRM's reasoning process. We theoretically analyze this phenomenon and show that as MI increases, the probability of the model's prediction error decreases. Furthermore, these MI peaks often correspond to tokens expressing reflection or transition, such as ``Hmm'', ``Wait'' and ``Therefore,'' which we term thinking tokens. We then demonstrate that these thinking tokens are crucial for the LRM's reasoning performance, while other tokens have minimal impact. Building on these analyses, we propose two simple yet effective methods to improve LRM reasoning performance by carefully leveraging these thinking tokens. Overall, our work provides novel insights into the reasoning mechanisms of LRMs and offers practical ways to improve their reasoning capabilities. The code is available at https://github.com/ChnQ/MI-Peaks.
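To make the measurement concrete, below is a minimal sketch of the MI-peak probe. It is not the authors' released implementation (that lives in the linked repository): it substitutes an RBF-kernel HSIC score as a cheap dependence proxy for the MI between the hidden states at each generative step (across a batch of problems) and representations of the gold answers, then flags steps whose score spikes far above the trace average as candidate MI peaks. The array shapes, function names, and the HSIC-for-MI substitution are all assumptions made for illustration.

```python
# Sketch of probing for "MI peaks" along a reasoning trace.
# NOT the authors' code (see https://github.com/ChnQ/MI-Peaks); HSIC with
# RBF kernels stands in here as a simple dependence proxy for MI.
import numpy as np


def _rbf(X: np.ndarray, sigma: float) -> np.ndarray:
    """RBF kernel matrix over the rows of X."""
    sq = np.sum(X * X, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def _median_bandwidth(X: np.ndarray) -> float:
    """Median pairwise distance, a standard kernel-bandwidth heuristic."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return float(np.median(d)) + 1e-8


def hsic(X: np.ndarray, Y: np.ndarray) -> float:
    """Biased HSIC estimate; larger means stronger X-Y dependence."""
    n = X.shape[0]
    K = _rbf(X, _median_bandwidth(X))
    L = _rbf(Y, _median_bandwidth(Y))
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


def mi_profile(hidden_by_step: list, answer: np.ndarray) -> np.ndarray:
    """Dependence score at each generative step.

    hidden_by_step[t] is an (n_problems, d) array of hidden states taken
    at step t of each problem's trace; answer is (n_problems, d).
    """
    return np.array([hsic(h, answer) for h in hidden_by_step])


def peak_steps(profile: np.ndarray, z: float = 2.0) -> np.ndarray:
    """Steps whose score sits z standard deviations above the mean."""
    mu, sd = profile.mean(), profile.std() + 1e-8
    return np.where((profile - mu) / sd > z)[0]
```

On a real trace one would decode the tokens generated at `peak_steps(...)`; per the abstract, reflective tokens such as ``Wait'' and ``Therefore'' should dominate those positions.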
Related papers
- Thinking Isn't an Illusion: Overcoming the Limitations of Reasoning Models via Tool Augmentations [11.503915439591735]
Large Reasoning Models (LRMs) are designed to output a step-by-step thinking process before arriving at a final answer to handle complex reasoning tasks. Recent empirical studies suggest that LLMs without explicit reasoning actually outperform LRMs on tasks with low or high complexity. We investigate whether the limitations of LRMs persist when tool augmentations are introduced.
arXiv Detail & Related papers (2025-07-23T17:04:20Z)
- Towards Concise and Adaptive Thinking in Large Reasoning Models: A Survey [8.736170026262279]
Large reasoning models (LRMs) like OpenAI o1 and DeepSeek R1 have demonstrated impressive performance on complex reasoning tasks. These models also face a major challenge: they generate unnecessarily lengthy and redundant reasoning chains.
arXiv Detail & Related papers (2025-07-13T14:51:59Z)
- Lost at the Beginning of Reasoning [82.18834329384514]
We show that the first reasoning step exerts a disproportionately large influence on the final prediction. We propose an efficient sampling strategy that leverages a reward model to identify and retain high-quality first reasoning steps. We introduce a new benchmark specifically constructed with deliberately flawed first reasoning steps to systematically evaluate model self-correction capabilities.
arXiv Detail & Related papers (2025-06-27T09:53:57Z)
- On Reasoning Strength Planning in Large Reasoning Models [50.61816666920207]
We find evidence that LRMs pre-plan the reasoning strengths in their activations even before generation. We then uncover that LRMs encode this reasoning strength through a pre-allocated directional vector embedded in the activations of the model. Our work provides new insights into the internal mechanisms of reasoning in LRMs and offers practical tools for controlling their reasoning behaviors.
arXiv Detail & Related papers (2025-06-10T02:55:13Z)
- The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity [16.266145641151375]
Large Reasoning Models generate detailed thinking processes before providing answers. We show that LRMs face a complete accuracy collapse beyond certain complexities. We also investigate the reasoning traces in more depth, studying the patterns of explored solutions.
arXiv Detail & Related papers (2025-06-07T22:42:29Z)
- Socratic-PRMBench: Benchmarking Process Reward Models with Systematic Reasoning Patterns [79.42805969325036]
Process Reward Models (PRMs) are crucial in complex reasoning and problem-solving tasks. PRMs are required to identify errors under various reasoning patterns during the reasoning process. Existing benchmarks mainly focus on evaluating PRMs with stepwise correctness. We introduce Socratic-PRMBench, a new benchmark to evaluate PRMs systematically under six reasoning patterns.
arXiv Detail & Related papers (2025-05-29T14:26:53Z)
- When Can Large Reasoning Models Save Thinking? Mechanistic Analysis of Behavioral Divergence in Reasoning [19.329523111916682]
Large reasoning models (LRMs) have significantly advanced performance on complex tasks, yet their tendency to overthink introduces inefficiencies. This study investigates the internal mechanisms of reinforcement learning (RL)-trained LRMs when prompted to save thinking.
arXiv Detail & Related papers (2025-05-21T08:55:35Z)
- A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions [42.77077835885798]
Reasoning capabilities of large reasoning models (LRMs) have seen significant advancements through the slow thinking process. In contrast, small reasoning models (SRMs), often distilled from larger ones, offer greater efficiency and can exhibit distinct capabilities.
arXiv Detail & Related papers (2025-04-12T06:45:57Z)
- Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models [55.46269953415811]
We identify ToM-sensitive parameters and show that perturbing as little as 0.001% of these parameters significantly degrades ToM performance. Our results have implications for enhancing model alignment, mitigating biases, and improving AI systems designed for human interaction.
arXiv Detail & Related papers (2025-04-05T17:45:42Z)
- Process or Result? Manipulated Ending Tokens Can Mislead Reasoning LLMs to Ignore the Correct Reasoning Steps [39.759594479826454]
We explore how vulnerable reasoning models are to subtle errors in their input reasoning chains. We introduce ``Compromising Thought'' (CPT), a vulnerability where models presented with reasoning tokens containing manipulated calculation results tend to ignore correct reasoning steps and adopt incorrect results instead. Our work enhances understanding of reasoning robustness and highlights security considerations for reasoning-intensive applications.
arXiv Detail & Related papers (2025-03-25T03:43:11Z)
- Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilities [101.77467538102924]
Recent advancements in Large Reasoning Models (LRMs) have demonstrated remarkable performance in specialized reasoning tasks. We show that acquiring deliberative reasoning capabilities significantly reduces the foundational capabilities of LRMs. We demonstrate that adaptive reasoning -- employing modes like Zero-Thinking, Less-Thinking, and Summary-Thinking -- can effectively alleviate these drawbacks.
arXiv Detail & Related papers (2025-03-23T08:18:51Z)
- MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making. We present MR-Ben, a process-based benchmark that demands meta-reasoning skill. Our meta-reasoning paradigm is especially suited to system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z)
- Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation [110.71955853831707]
We view LMs as deriving new conclusions by aggregating indirect reasoning paths seen at pre-training time.
We formalize the reasoning paths as random walk paths on the knowledge/reasoning graphs (a toy sketch of this view follows this entry).
Experiments and analysis on multiple KG and CoT datasets reveal the effect of training on random walk paths.
arXiv Detail & Related papers (2024-02-05T18:25:51Z)
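The random-walk formalization above lends itself to a toy illustration. The following sketch is an assumed caricature, not code from that paper: it scores a candidate conclusion by the fraction of bounded random walks on a small hypothetical knowledge graph that connect a premise node to the conclusion node, i.e., by aggregating path evidence. The graph contents, function name, and walk parameters are all made up for the example.

```python
# Toy sketch (assumed, not from the paper) of the "reasoning paths as
# random walks" view: a conclusion is scored by aggregating the evidence
# of bounded random-walk paths from a premise node to a candidate node.
import random

# Hypothetical knowledge graph as an adjacency list.
GRAPH = {
    "socrates": ["man"],
    "man": ["mortal", "animal"],
    "animal": ["mortal"],
}


def path_aggregation_score(start: str, target: str,
                           walks: int = 10_000, max_len: int = 4) -> float:
    """Fraction of random walks from `start` that reach `target`."""
    hits = 0
    for _ in range(walks):
        node = start
        for _ in range(max_len):
            nbrs = GRAPH.get(node, [])
            if not nbrs:
                break
            node = random.choice(nbrs)
            if node == target:
                hits += 1
                break
    return hits / walks


# "socrates -> mortal" is supported by two paths (direct via "man" and
# via "animal"), so it accumulates more random-walk mass than an
# unsupported conclusion would.
print(path_aggregation_score("socrates", "mortal"))
```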