Efficient Reasoning via Chain of Unconscious Thought
- URL: http://arxiv.org/abs/2505.19756v1
- Date: Mon, 26 May 2025 09:34:04 GMT
- Title: Efficient Reasoning via Chain of Unconscious Thought
- Authors: Ruihan Gong, Yue Liu, Wenjie Qu, Mingzhe Du, Yufei He, Yingwei Ma, Yulin Chen, Xiang Liu, Yi Wen, Xinfeng Li, Ruidong Wang, Xinzhong Zhu, Bryan Hooi, Jiaheng Zhang,
- Abstract summary: Large Reasoning Models (LRMs) achieve promising performance but compromise token efficiency due to verbose reasoning processes.<n>We propose a new reasoning paradigm, termed Chain of Unconscious Thought (CoUT), to improve the token efficiency of LRMs.<n>Our work reveals that models may possess beneficial unconscious thought, enabling improved efficiency without sacrificing performance.
- Score: 40.82356218832031
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Reasoning Models (LRMs) achieve promising performance but compromise token efficiency due to verbose reasoning processes. Unconscious Thought Theory (UTT) posits that complex problems can be solved more efficiently through internalized cognitive processes. Inspired by UTT, we propose a new reasoning paradigm, termed Chain of Unconscious Thought (CoUT), to improve the token efficiency of LRMs by guiding them to mimic human unconscious thought and internalize reasoning processes. Concretely, we first prompt the model to internalize the reasoning by thinking in the hidden layer. Then, we design a bag of token-efficient strategies to further help models reduce unnecessary tokens yet preserve the performance. Our work reveals that models may possess beneficial unconscious thought, enabling improved efficiency without sacrificing performance. Extensive experiments demonstrate the effectiveness of CoUT. Remarkably, it surpasses CoT by reducing token usage by 47.62% while maintaining comparable accuracy, as shown in Figure 1. The code of CoUT is available at this link: https://github.com/Rohan-GRH/CoUT
Related papers
- Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model [7.8354921036790275]
Large Reasoning Models (LRMs) excel at solving complex problems but face an overthinking dilemma.<n>When handling simple tasks, they often produce verbose responses overloaded with thinking tokens.<n>These tokens trigger unnecessary high-level reasoning behaviors like reflection and backtracking, reducing efficiency.
arXiv Detail & Related papers (2025-06-30T13:30:33Z) - Is Long-to-Short a Free Lunch? Investigating Inconsistency and Reasoning Efficiency in LRMs [8.359909829007005]
We investigate whether efficient reasoning strategies introduce behavioral inconsistencies in large reasoning models (LRMs)<n>$ICBENCH$ is a benchmark designed to measure inconsistency in LRMs across three dimensions.<n>We find that while larger models generally exhibit greater consistency than smaller ones, they all display widespread "scheming" behaviors.
arXiv Detail & Related papers (2025-06-24T10:25:28Z) - ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation [53.149817480019834]
Recent advancements in large reasoning models (LRMs) have achieved notable performance enhancements on complex reasoning tasks by scaling up the generation length by Chain-of-Thought (CoT)<n>We propose a framework dubbed ConciseHint, which continuously encourages the reasoning model to speak concisely by injecting the textual hint during the token generation of the reasoning process.<n>Experiments on the state-of-the-art LRMs, including DeepSeek-R1 and Qwen-3 series, demonstrate that our method can effectively produce concise reasoning processes while maintaining performance well.
arXiv Detail & Related papers (2025-06-23T16:20:44Z) - Exploring and Exploiting the Inherent Efficiency within Large Reasoning Models for Self-Guided Efficiency Enhancement [101.77467538102924]
Large reasoning models (LRMs) exhibit overthinking, which hinders efficiency and inflates inference cost.<n>We propose two lightweight methods to enhance LRM efficiency.<n>First, we introduce Efficiency Steering, a training-free activation steering technique that modulates reasoning behavior via a single direction.<n>Second, we develop Self-Rewarded Efficiency RL, a reinforcement learning framework that dynamically balances task accuracy and brevity.
arXiv Detail & Related papers (2025-06-18T17:18:12Z) - Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency [24.56015832583054]
Explicit self-reflection, signaled by tokens such as "Wait" and "Hmm", is necessary for advanced reasoning.<n>We propose NoWait, a simple yet effective approach that disables explicit self-reflection by suppressing these tokens during inference.
arXiv Detail & Related papers (2025-06-10T01:54:04Z) - Let LLMs Break Free from Overthinking via Self-Braking Tuning [60.08396797526657]
Large reasoning models (LRMs) have significantly enhanced their reasoning capabilities by generating longer chains of thought.<n>This performance gain comes at the cost of a substantial increase in redundant reasoning during the generation process.<n>We propose a novel framework, Self-Braking Tuning (SBT), which tackles overthinking from the perspective of allowing the model to regulate its own reasoning process.
arXiv Detail & Related papers (2025-05-20T16:53:40Z) - Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models [32.49420948390984]
Large reasoning models (LRMs) typically suffer from "overthinking" problems.<n>We propose a simple yet efficient pipeline, ThoughtMani, to enable LRMs to bypass unnecessary intermediate steps.<n>We find that ThoughtMani enhances safety alignment by an average of 10%.
arXiv Detail & Related papers (2025-04-18T11:07:19Z) - Efficient Inference for Large Reasoning Models: A Survey [42.61170621552432]
Large Reasoning Models (LRMs) significantly improve the reasoning ability of Large Language Models (LLMs) by learning to reason.<n>However, their deliberative reasoning process leads to inefficiencies in token usage, memory consumption, and inference time.<n>This survey provides a review of efficient inference methods designed specifically for LRMs, focusing on mitigating token inefficiency while preserving the reasoning quality.
arXiv Detail & Related papers (2025-03-29T13:27:46Z) - Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching [60.04718679054704]
Chain-of-Thought prompting elicits step-by-step problem solving, but often at the cost of excessive verbosity in intermediate outputs.<n>We propose Sketch-of-Thought (SoT), a prompting framework that integrates cognitively inspired reasoning paradigms with linguistic constraints.<n>SoT achieves token reductions of up to 78% with minimal accuracy loss across 15 reasoning datasets.
arXiv Detail & Related papers (2025-03-07T06:57:17Z) - Chain of Draft: Thinking Faster by Writing Less [37.492654173517046]
Chain of Draft (CoD) is a novel paradigm inspired by human cognitive processes.<n>CoD generates minimalistic yet informative intermediate reasoning outputs while solving tasks.
arXiv Detail & Related papers (2025-02-25T19:36:06Z) - LightThinker: Thinking Step-by-Step Compression [53.8069487638972]
We propose LightThinker, a method that enables large language models to dynamically compress intermediate thoughts during reasoning.<n>Inspired by human cognitive processes, LightThinker compresses thought steps into compact representations and discards the original reasoning chains.<n>Experiments show that LightThinker reduces peak memory usage and inference time, while maintaining competitive accuracy.
arXiv Detail & Related papers (2025-02-21T16:57:22Z) - Efficient Reasoning with Hidden Thinking [48.96945580741641]
Chain-of-Thought (CoT) reasoning has become a powerful framework for improving complex problem-solving capabilities.<n>We propose $textbfHeima$ (as hidden llama), an efficient reasoning framework that leverages reasoning CoTs at hidden latent space.<n>Heima model achieves higher generation efficiency while maintaining or even better zero-shot task accuracy.
arXiv Detail & Related papers (2025-01-31T15:10:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.