Related papers: Efficient Reasoning Models: A Survey

URL: http://arxiv.org/abs/2504.10903v1
Date: Tue, 15 Apr 2025 06:28:00 GMT
Title: Efficient Reasoning Models: A Survey
Authors: Sicheng Feng, Gongfan Fang, Xinyin Ma, Xinchao Wang
Abstract summary: This survey aims to provide a comprehensive overview of recent advances in efficient reasoning. It categorizes existing works into three key directions: (1) shorter - compressing lengthy CoTs into concise yet effective reasoning chains; (2) smaller - developing compact language models with strong reasoning capabilities; and (3) faster - designing efficient decoding strategies to accelerate inference.
Score: 52.96232442322824
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reasoning models have demonstrated remarkable progress in solving complex and logic-intensive tasks by generating extended Chain-of-Thoughts (CoTs) prior to arriving at a final answer. Yet, the emergence of this "slow-thinking" paradigm, with numerous tokens generated in sequence, inevitably introduces substantial computational overhead. This highlights an urgent need for effective acceleration. This survey aims to provide a comprehensive overview of recent advances in efficient reasoning. It categorizes existing works into three key directions: (1) shorter - compressing lengthy CoTs into concise yet effective reasoning chains; (2) smaller - developing compact language models with strong reasoning capabilities through techniques such as knowledge distillation, other model compression techniques, and reinforcement learning; and (3) faster - designing efficient decoding strategies to accelerate inference. A curated collection of papers discussed in this survey is available in our GitHub repository.

Related papers

Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models [49.598776427454176]
Large Reasoning Models (LRMs) have gradually become a research hotspot due to their outstanding performance in handling complex tasks. However, with the widespread application of these models, the problem of overthinking has gradually emerged. Various efficient reasoning methods have been proposed, aiming to reduce the length of reasoning paths without compromising model performance and reasoning capability.
arXiv Detail & Related papers (2025-08-04T06:54:31Z)
Adaptive Deep Reasoning: Triggering Deep Thinking When Needed [28.575411507835973]
Large language models (LLMs) have shown impressive capabilities in handling complex tasks through long-chain reasoning. We propose a novel approach that autonomously switches between short and long reasoning chains based on problem complexity. This advancement enhances the practicality of reasoning in large language models for real-world applications.
arXiv Detail & Related papers (2025-05-26T15:08:51Z)
Fractured Chain-of-Thought Reasoning [61.647243580650446]
We introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling. We show that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget.
arXiv Detail & Related papers (2025-05-19T11:30:41Z)
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models [54.04678363287392]
Large Language Models (LLMs) have demonstrated remarkable capabilities in complex tasks. Recent advancements in OpenAI o1 and DeepSeek-R1 have further improved performance in System-2 reasoning domains.
arXiv Detail & Related papers (2025-03-20T17:59:38Z)
Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation [22.875285119636235]
The R1 distillation scheme has emerged as a promising approach for training cost-effective models with enhanced reasoning abilities. This study examines the universality of distillation data and identifies key components that enable the efficient transfer of long-chain reasoning capabilities. We propose DLCoT (Deconstructing Long Chain-of-Thought), a distillation data enhancement framework.
arXiv Detail & Related papers (2025-03-20T17:46:38Z)
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models [39.781889862599854]
Long chain-of-thought (Long CoT) characteristics enhance reasoning abilities and enable the solution of intricate problems. We first distinguish Long CoT from Short CoT and introduce a novel taxonomy to categorize current reasoning paradigms. We then investigate key phenomena associated with these characteristics, such as the emergence of Long CoT, overthinking, and test-time scaling.
arXiv Detail & Related papers (2025-03-12T17:35:03Z)
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching [60.04718679054704]
Chain-of-Thought prompting elicits step-by-step problem solving, but often at the cost of excessive verbosity in intermediate outputs. We propose Sketch-of-Thought (SoT), a prompting framework that integrates cognitively inspired reasoning paradigms with linguistic constraints. SoT achieves token reductions of up to 78% with minimal accuracy loss across 15 reasoning datasets.
arXiv Detail & Related papers (2025-03-07T06:57:17Z)
How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach [4.055489363682199]
We conduct the first systematic study of the relationship between reasoning length and model performance. We show that this tradeoff persists across even very distinct reasoning chains. We show that prompt-based compression strategies operate far from theoretical limits.
arXiv Detail & Related papers (2025-03-03T03:48:20Z)
Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models [56.37421741507468]
Chain-of-Thought (CoT) reasoning has significantly enhanced the performance of large language models (LLMs). We propose a method to identify critical reasoning steps using perplexity as a measure of their importance.
arXiv Detail & Related papers (2025-02-18T20:04:51Z)
Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems [92.89673285398521]
o1-like reasoning systems have demonstrated remarkable capabilities in solving complex reasoning tasks. We introduce an "imitate, explore, and self-improve" framework to train the reasoning model. Our approach achieves competitive performance compared to industry-level reasoning systems.
arXiv Detail & Related papers (2024-12-12T16:20:36Z)
Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding [1.3479499607624648]
Speculative decoding addresses the inference bottleneck by introducing a two-stage framework: drafting and verification. A smaller, efficient model generates a preliminary draft, which is then refined by a larger, more sophisticated model. This paper provides a comprehensive survey of speculative decoding methods, categorizing them into draft-centric and model-centric approaches.
arXiv Detail & Related papers (2024-11-20T09:46:30Z)
Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought [61.588465852846646]
Chain-of-Thought (CoT) reasoning has emerged as a promising approach for enhancing the performance of large language models (LLMs). In this work, we introduce a novel reasoning boundary framework (RBF) to address these challenges.
arXiv Detail & Related papers (2024-10-08T05:26:28Z)
ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs). Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts. We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z)
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding [46.485363806259265]
Speculative Decoding has emerged as a novel decoding paradigm for Large Language Model (LLM) inference. In each decoding step, this method first drafts several future tokens efficiently and then verifies them in parallel. This paper presents a comprehensive overview and analysis of this promising decoding paradigm.
arXiv Detail & Related papers (2024-01-15T17:26:50Z)
Complexity-Based Prompting for Multi-Step Reasoning [72.0057198610614]
We study the task of prompting large-scale language models to perform multi-step reasoning. A central question is which reasoning examples make the most effective prompts. We propose complexity-based prompting, a simple and effective example selection scheme for multi-step reasoning.
arXiv Detail & Related papers (2022-10-03T05:33:27Z)
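Note on the two speculative decoding surveys listed above: the draft-then-verify loop they describe can be sketched roughly as follows. This is a toy illustration under strong assumptions, not the method of any listed paper: both "models" are hypothetical deterministic functions mapping a token list to the next token, decoding is greedy, verification is written as a sequential loop (a real system would score all drafted positions in one batched forward pass), and the usual bonus token after a fully accepted draft is omitted. The names `speculative_decode`, `greedy_decode`, `target`, and `draft` are made up for this sketch.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # hypothetical model interface


def greedy_decode(model: NextToken, prompt: List[Token], max_new_tokens: int) -> List[Token]:
    """Baseline: call the model once per generated token."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(model(out))
    return out


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: List[Token],
    max_new_tokens: int,
    k: int = 4,
) -> List[Token]:
    """Greedy draft-then-verify decoding with a cheap draft model."""
    out = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(out) < limit:
        # Draft phase: the cheap model proposes up to k future tokens.
        proposal: List[Token] = []
        for _ in range(min(k, limit - len(out))):
            proposal.append(draft(out + proposal))

        # Verify phase: keep drafted tokens while they match the target model;
        # at the first mismatch, take the target's token and discard the rest.
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)
                break
        out.extend(accepted)
    return out


# Toy models (assumptions for illustration): the target counts modulo 10,
# and the draft agrees except that it wrongly predicts 0 after a 4.
def target(tokens: List[Token]) -> Token:
    return (tokens[-1] + 1) % 10


def draft(tokens: List[Token]) -> Token:
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10


result = speculative_decode(target, draft, prompt=[0], max_new_tokens=12)
```

Under greedy verification the output matches what the target model alone would produce; any speed-up comes from how many drafted tokens are accepted per verification round.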

This list is automatically generated from the titles and abstracts of the papers on this site.