Internal Bias in Reasoning Models leads to Overthinking
- URL: http://arxiv.org/abs/2505.16448v2
- Date: Tue, 27 May 2025 09:44:20 GMT
- Title: Internal Bias in Reasoning Models leads to Overthinking
- Authors: Renfei Dang, Shujian Huang, Jiajun Chen
- Abstract summary: We show for the first time that overthinking in reasoning models may stem from their internal bias towards input texts. By masking out the original input section, the effect of internal bias can be effectively alleviated and the reasoning length can be reduced by 31%-53%.
- Score: 58.817405319722596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While current reasoning models possess strong exploratory capabilities, they are often criticized for overthinking due to redundant and unnecessary reflections. In this work, we reveal for the first time that overthinking in reasoning models may stem from their internal bias towards input texts. Upon encountering a reasoning problem, the model immediately forms a preliminary guess about the answer, which we term an internal bias since it is not derived through actual reasoning. When this guess conflicts with its reasoning result, the model tends to engage in reflection, wasting computational resources. Through further interpretability experiments, we find that this behavior is largely driven by the model's excessive attention to the input section, which amplifies the influence of internal bias on its decision-making process. Additionally, by masking out the original input section, the effect of internal bias can be effectively alleviated and the reasoning length can be reduced by 31%-53% across different complex reasoning tasks. Notably, in most cases, this approach also leads to improvements in accuracy. These findings demonstrate a causal relationship between internal bias and overthinking.
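The abstract describes the input-masking intervention only at a high level. The sketch below is a minimal, hypothetical approximation of that idea using Hugging Face transformers: after an initial reasoning pass, the attention mask over the original question tokens is zeroed so that continued reasoning can no longer re-attend to the input section. The model name, prompt template, token budgets, and the two-stage generation scheme are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of "masking out the original input section" during reasoning.
# Assumptions: an open reasoning model, a simple prompt template, and a two-stage
# generation scheme approximating the intervention described in the abstract.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed; any causal reasoning LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

question = "If 3x + 5 = 20, what is x?"
prompt = f"Question: {question}\nLet's reason step by step.\n"
inputs = tok(prompt, return_tensors="pt")
n_input = inputs["input_ids"].shape[1]  # length of the input section in tokens

# Stage 1: begin reasoning with the full prompt visible.
partial = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Stage 2: zero the attention mask over the input section, so the model continues
# from its own partial reasoning without re-attending to the original question.
attn = torch.ones_like(partial)
attn[:, :n_input] = 0
final = model.generate(
    input_ids=partial,
    attention_mask=attn,
    max_new_tokens=512,
    do_sample=False,
)
print(tok.decode(final[0, n_input:], skip_special_tokens=True))
```

Zeroing the attention mask over a prefix is the same mechanism used to ignore left-padded tokens during generation, so no custom modeling code is needed for this rough approximation; where in the trace the masking is applied is a design choice the abstract does not specify.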
Related papers
- Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers [36.044522516005884]
We study privacy leakage in the reasoning traces of large reasoning models used as personal agents. We show that reasoning traces frequently contain sensitive user data, which can be extracted via prompt injections or accidentally leak into outputs. We argue that safety efforts must extend to the model's internal thinking, not just its outputs.
arXiv Detail & Related papers (2025-06-18T17:57:01Z)
- How Well Can Reasoning Models Identify and Recover from Unhelpful Thoughts? [31.755709525282914]
We investigate how well reasoning models identify and recover from four types of unhelpful thoughts. We show that models are effective at identifying most unhelpful thoughts but struggle to recover from the same thoughts when these are injected into their thinking process.
arXiv Detail & Related papers (2025-06-12T17:59:53Z)
- Does Thinking More always Help? Understanding Test-Time Scaling in Reasoning Models [103.03315678501546]
Extending thinking traces using prompts like "Wait" or "Let me rethink" can improve performance. This raises a natural question: does thinking more at test-time truly lead to better reasoning? We show a consistent pattern of initial performance improvements from additional thinking followed by a decline, due to "overthinking".
arXiv Detail & Related papers (2025-06-04T17:55:09Z)
- A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models [53.18562650350898]
Chain-of-thought (CoT) reasoning enhances the performance of large language models. We present the first comprehensive study of CoT faithfulness in large vision-language models.
arXiv Detail & Related papers (2025-05-29T18:55:05Z)
- Revisiting Overthinking in Long Chain-of-Thought from the Perspective of Self-Doubt [74.35891434097053]
Reasoning Large Language Models (RLLMs) have demonstrated impressive performance on complex tasks. They often exhibit overthinking: performing unnecessary reasoning steps even after arriving at the correct answer. We present a quantitative analysis of overthinking from the perspective of self-doubt. We introduce a simple and effective prompting method to reduce the model's over-reliance on input questions.
arXiv Detail & Related papers (2025-05-29T14:30:02Z)
- Reasoning Towards Fairness: Mitigating Bias in Language Models through Reasoning-Guided Fine-Tuning [12.559028963968247]
We investigate the crucial relationship between a model's reasoning ability and fairness. We find that larger models with stronger reasoning abilities exhibit substantially lower stereotypical bias. We introduce ReGiFT, a novel approach that extracts structured reasoning traces from advanced reasoning models and infuses them into models that lack such capabilities.
arXiv Detail & Related papers (2025-04-08T03:21:51Z)
- Causal Inference Isn't Special: Why It's Just Another Prediction Problem [1.90365714903665]
Causal inference is often portrayed as distinct from predictive modeling. But at its core, causal inference is simply a structured instance of prediction under distribution shift. This perspective reframes causal estimation as a familiar generalization problem.
arXiv Detail & Related papers (2025-04-06T01:37:50Z)
- The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks [96.27754404942364]
Large Reasoning Models (LRMs) represent a breakthrough in AI problem-solving capabilities, but their effectiveness in interactive environments can be limited. This paper introduces and analyzes overthinking in LRMs. We observe three recurring patterns: Analysis Paralysis, Rogue Actions, and Premature Disengagement.
arXiv Detail & Related papers (2025-02-12T09:23:26Z)
- When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models [15.781930031346105]
Self-reflection enhances performance in TruthfulQA, but adversely affects results in HotpotQA.
We find that self-reflection shows the most benefit when models are less likely to be correct initially, and when overall question difficulty is higher.
Based on our findings, we propose guidelines for decisions on when to implement self-reflection.
arXiv Detail & Related papers (2024-04-14T02:47:32Z)
- Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought [34.99438001331234]
Chain-of-thought prompting can misrepresent the factors actually influencing models' behavior.
Bias-augmented consistency training (BCT) trains models to give consistent reasoning across prompts with and without biasing features.
Applying BCT to GPT-3.5-Turbo with one bias reduces the rate of biased reasoning by 86% on held-out tasks.
arXiv Detail & Related papers (2024-03-08T18:41:42Z)
- Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback [55.78118035358662]
Reinforcement learning from human feedback serves as a crucial bridge, aligning large language models with human and societal values.
We have identified that the reward model often finds shortcuts to bypass its intended objectives.
We propose an innovative solution, applying the Product-of-Experts technique to separate reward modeling from the influence of sequence length.
arXiv Detail & Related papers (2023-10-08T15:14:39Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)