OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
- URL: http://arxiv.org/abs/2506.02397v1
- Date: Tue, 03 Jun 2025 03:31:30 GMT
- Title: OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
- Authors: Shengjia Zhang, Junjie Wu, Jiawei Chen, Changwang Zhang, Xingyu Lou, Wangchunshu Zhou, Sheng Zhou, Can Wang, Jun Wang
- Abstract summary: OThink-R1 is a method that prunes redundant reasoning steps while preserving logical validity. Experiments across mathematical and question-answering tasks demonstrate that OThink-R1 reduces reasoning redundancy by almost 23% on average.
- Score: 33.008513399946914
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advanced large reasoning models (LRMs) leverage extended chain-of-thought (CoT) reasoning to solve complex tasks, achieving state-of-the-art performance. Despite their success, we identify a critical issue: a substantial portion of simple tasks solved by LRMs can also be addressed by non-reasoning LLMs using significantly fewer tokens, indicating that complex reasoning may not always be necessary. To address this, we systematically analyze the reasoning trajectories of LRMs and present a method utilizing identified paradigms and LLM-Judge to classify these trajectories as either Redundant Reasoning or Essential Reasoning. We then introduce OThink-R1, a method that prunes redundant reasoning steps while preserving logical validity. OThink-R1 dynamically employs the non-thinking mode (fast-thinking) for straightforward problems while engaging in deliberate thinking (slow-thinking) for complex problems. Experiments across mathematical and question-answering tasks demonstrate that OThink-R1 reduces reasoning redundancy by almost 23% on average without compromising accuracy, offering practical guidelines for efficient reasoning models. The code is available at https://github.com/AgenticIR-Lab/OThink-R1.
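To make the pipeline in the abstract concrete, here is a minimal Python sketch of the LLM-Judge classification step and the fast/slow training targets it could produce. The judge prompt, the `<think>` delimiters, and the function names are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# Hypothetical sketch of trajectory classification and pruning, assuming a
# callable `judge` that wraps an LLM. Prompt wording is invented for illustration.

def llm_judge(question: str, reasoning: str, answer: str, judge) -> str:
    """Ask a judge LLM whether the reasoning trace was necessary."""
    prompt = (
        "A model produced the following reasoning and answer.\n"
        f"Question: {question}\nReasoning: {reasoning}\nAnswer: {answer}\n"
        "If the answer could be reached correctly without the reasoning, "
        "reply REDUNDANT; otherwise reply ESSENTIAL."
    )
    return judge(prompt).strip()

def build_training_example(question: str, reasoning: str, answer: str, judge) -> dict:
    """Prune redundant traces so fine-tuning teaches intrinsic mode switching."""
    if llm_judge(question, reasoning, answer, judge) == "REDUNDANT":
        # Fast-thinking target: keep only the final answer.
        target = answer
    else:
        # Slow-thinking target: keep the full deliberate trace.
        target = f"<think>{reasoning}</think>{answer}"
    return {"prompt": question, "completion": target}
```

Training on such pruned targets is one plausible way a single model could learn to answer simple questions directly while still emitting a full trace for hard ones.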
Related papers
- Towards Concise and Adaptive Thinking in Large Reasoning Models: A Survey [8.736170026262279]
Large reasoning models (LRMs) like OpenAI o1 and DeepSeek R1 have demonstrated impressive performance on complex reasoning tasks. However, these models also face a major challenge: generating unnecessarily lengthy and redundant reasoning chains.
arXiv Detail & Related papers (2025-07-13T14:51:59Z)
- Think How to Think: Mitigating Overthinking with Autonomous Difficulty Cognition in Large Reasoning Models [12.618562275265704]
Recent Large Reasoning Models (LRMs) excel at complex reasoning tasks but often suffer from overthinking. We propose Think-How-to-Think (TH2T), a novel two-stage fine-tuning strategy that progressively inspires LRMs' difficulty cognition and redundancy cognition.
arXiv Detail & Related papers (2025-07-03T14:24:26Z)
- From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval [22.35942074715463]
Chain-of-Thought (CoT) prompting enables complex reasoning in large language models (LLMs). We propose State Machine Reasoning (SMR), a transition-based reasoning framework composed of discrete actions. Experiments on the BEIR and BRIGHT benchmarks show that SMR improves retrieval performance (nDCG@10) by 3.4% while reducing token usage by 74.4%.
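The abstract gives only the framing, so the following sketch is an assumption-laden illustration of "reasoning as discrete state transitions": the action names (REFINE, RERANK, STOP) and the helper callables are hypothetical stand-ins, not SMR's actual interface.

```python
# Illustrative state-machine reasoning loop for retrieval: a policy picks one
# discrete action per step instead of free-form token generation.
from dataclasses import dataclass, field

@dataclass
class State:
    query: str
    docs: list = field(default_factory=list)

def smr_loop(state: State, choose_action, refine, rerank, max_steps: int = 8) -> list:
    """Run discrete actions until STOP or a step budget is exhausted."""
    for _ in range(max_steps):
        action = choose_action(state)  # e.g., a small policy or a short LLM call
        if action == "REFINE":
            state.query = refine(state.query, state.docs)   # rewrite the query
        elif action == "RERANK":
            state.docs = rerank(state.query, state.docs)    # reorder candidates
        elif action == "STOP":
            break
    return state.docs
```

Bounding the trajectory to a handful of typed actions is what plausibly yields the reported token savings relative to open-ended CoT.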
arXiv Detail & Related papers (2025-05-29T04:04:25Z)
- CoThink: Token-Efficient Reasoning via Instruct Models Guiding Reasoning Models [56.40065909544213]
Large language models (LLMs) benefit from increased test-time compute, a phenomenon known as test-time scaling. However, reasoning-optimized models often overthink even simple problems, producing excessively verbose outputs and leading to low token efficiency. We identify two key causes of this verbosity: (1) reinforcement learning reduces the information density of forward reasoning, and (2) backward chain-of-thought training encourages redundant and often unnecessary verification steps.
arXiv Detail & Related papers (2025-05-28T06:24:45Z)
- Let LLMs Break Free from Overthinking via Self-Braking Tuning [60.08396797526657]
Large reasoning models (LRMs) have significantly enhanced their reasoning capabilities by generating longer chains of thought. This performance gain comes at the cost of a substantial increase in redundant reasoning during the generation process. We propose a novel framework, Self-Braking Tuning (SBT), which tackles overthinking by allowing the model to regulate its own reasoning process.
arXiv Detail & Related papers (2025-05-20T16:53:40Z)
- Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL [19.731871225975926]
Large reasoning models (LRMs) are proficient at generating explicit, step-by-step reasoning sequences before producing final answers. To address this overthinking problem, we explore how to equip LRMs with adaptive thinking capabilities. We propose AutoThink, a multi-stage reinforcement learning framework that progressively optimizes reasoning policies.
arXiv Detail & Related papers (2025-05-16T04:01:57Z)
- S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models [13.083179473480705]
Large Reasoning Models (LRMs) have achieved breakthroughs in complex reasoning tasks through explicit chains of thought. Their heavy reliance on system 2 thinking may limit their system 1 thinking capabilities. S1-Bench introduces a suite of simple, diverse, and natural questions to assess LRMs' performance on questions more suitable for system 1.
arXiv Detail & Related papers (2025-04-14T16:13:23Z)
- Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs [86.79757571440082]
Large language models (LLMs) such as OpenAI's o1 have demonstrated remarkable abilities in complex reasoning tasks. We identify a phenomenon we term underthinking, where o1-like LLMs frequently switch between different reasoning thoughts. We propose a decoding strategy with a thought-switching penalty (TIP) that discourages premature transitions between thoughts.
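A hedged sketch of what a thought-switching penalty could look like at decoding time follows. The marker words, penalty value, and warmup window are illustrative guesses; the abstract only states that TIP discourages premature transitions between thoughts.

```python
# Framework-free sketch of a thought-switching penalty applied to next-token
# logits. SWITCH_MARKERS and the hyperparameters are hypothetical.
import torch

SWITCH_MARKERS = {"Alternatively", "Wait", "But"}  # assumed cues for a new thought

def apply_tip(logits: torch.Tensor, tokenizer, step: int,
              penalty: float = 3.0, warmup: int = 50) -> torch.Tensor:
    """Subtract a penalty from tokens that would start a new line of thought,
    but only early in the current thought (before `warmup` decoding steps)."""
    if step < warmup:
        for word in SWITCH_MARKERS:
            for tid in tokenizer.encode(word, add_special_tokens=False):
                logits[..., tid] -= penalty
    return logits
```

The intuition is that suppressing switch cues early forces the model to explore its current thought more deeply before abandoning it.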
arXiv Detail & Related papers (2025-01-30T18:58:18Z)
- Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning [52.83539473110143]
We introduce a novel structure-oriented analysis method to help Large Language Models (LLMs) better understand a question.
To further improve the reliability in complex question-answering tasks, we propose a multi-agent reasoning system, Structure-oriented Autonomous Reasoning Agents (SARA).
Extensive experiments verify the effectiveness of the proposed reasoning system. Surprisingly, in some cases, the system even surpasses few-shot methods.
arXiv Detail & Related papers (2024-10-18T05:30:33Z)
- Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated Reasoning [74.90592233107712]
We propose a Direct-Indirect Reasoning (DIR) method, which considers Direct Reasoning (DR) and Indirect Reasoning (IR) as multiple parallel reasoning paths that are merged to derive the final answer. Our DIR method is simple yet effective and can be straightforwardly integrated with existing variants of CoT methods.
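To make the parallel-paths idea concrete, here is a small sketch under stated assumptions: the prompt templates, the naive answer extraction, and the majority-vote merge are invented for illustration, and the paper's actual merging rule may differ.

```python
# Hypothetical DIR-style inference: run a direct path and an indirect
# (proof-by-contradiction) path, then merge their answers.
from collections import Counter

def extract_answer(text: str) -> str:
    # Naive extraction: treat the last line of the response as the final answer.
    return text.strip().splitlines()[-1]

def dir_answer(question: str, llm) -> str:
    direct = llm(f"Answer step by step: {question}")
    indirect = llm(
        "Assume the opposite of the claim in this question and show it leads "
        f"to a contradiction, then state the answer: {question}"
    )
    answers = [extract_answer(direct), extract_answer(indirect)]
    # Simple merge: majority vote; ties fall back to the direct path's answer.
    return Counter(answers).most_common(1)[0][0]
```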
arXiv Detail & Related papers (2024-02-06T03:41:12Z)
- LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning [61.7853049843921]
Chain-of-thought (CoT) prompting is a popular in-context learning approach for large language models (LLMs). This paper introduces a new approach named Latent Reasoning Skills (LaRS) that employs unsupervised learning to create a latent space representation of rationales.
arXiv Detail & Related papers (2023-12-07T20:36:10Z)