Efficient Reasoning Models: A Survey
- URL: http://arxiv.org/abs/2504.10903v1
- Date: Tue, 15 Apr 2025 06:28:00 GMT
- Title: Efficient Reasoning Models: A Survey
- Authors: Sicheng Feng, Gongfan Fang, Xinyin Ma, Xinchao Wang,
- Abstract summary: This survey aims to provide a comprehensive overview of recent advances in efficient reasoning.<n>It categorizes existing works into three key directions: (1) shorter - compressing lengthy CoTs into concise yet effective reasoning chains; (2) smaller - developing compact language models with strong reasoning capabilities; and (3) faster.
- Score: 52.96232442322824
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reasoning models have demonstrated remarkable progress in solving complex and logic-intensive tasks by generating extended Chain-of-Thoughts (CoTs) prior to arriving at a final answer. Yet, the emergence of this "slow-thinking" paradigm, with numerous tokens generated in sequence, inevitably introduces substantial computational overhead. To this end, it highlights an urgent need for effective acceleration. This survey aims to provide a comprehensive overview of recent advances in efficient reasoning. It categorizes existing works into three key directions: (1) shorter - compressing lengthy CoTs into concise yet effective reasoning chains; (2) smaller - developing compact language models with strong reasoning capabilities through techniques such as knowledge distillation, other model compression techniques, and reinforcement learning; and (3) faster - designing efficient decoding strategies to accelerate inference. A curated collection of papers discussed in this survey is available in our GitHub repository.
Related papers
- Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models [54.04678363287392]
Large Language Models (LLMs) have demonstrated remarkable capabilities in complex tasks.<n>Recent advancements in OpenAI o1 and DeepSeek-R1 have further improved performance in System-2 reasoning domains.
arXiv Detail & Related papers (2025-03-20T17:59:38Z) - Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation [22.875285119636235]
R1 distillation scheme has emerged as a promising approach for training cost-effective models with enhanced reasoning abilities.
This study examines the universality of distillation data and identifies key components that enable the efficient transfer of long-chain reasoning capabilities.
We propose DLCoT (Deconstructing Long Chain-of-Thought), a distillation data enhancement framework.
arXiv Detail & Related papers (2025-03-20T17:46:38Z) - Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models [39.781889862599854]
Long chain-of-thought (Long CoT) characteristics enhance reasoning abilities and enable the solution of intricate problems.<n>We first distinguish Long CoT from Short CoT and introduce a novel taxonomy to categorize current reasoning paradigms.<n>We then investigate key phenomena such as the emergence of Long CoT with these characteristics, including overthinking, and test-time scaling.
arXiv Detail & Related papers (2025-03-12T17:35:03Z) - How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach [4.055489363682199]
We conduct the first systematic study of the relationship between reasoning length and model performance.<n>We show that this tradeoff persists across even very distinct reasoning chains.<n>We show that prompt-based compression strategies operate far from theoretical limits.
arXiv Detail & Related papers (2025-03-03T03:48:20Z) - Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models [56.37421741507468]
Chain-of-Thought (CoT) reasoning has significantly enhanced the performance of large language models (LLMs)
We propose a method to identify critical reasoning steps using perplexity as a measure of their importance.
arXiv Detail & Related papers (2025-02-18T20:04:51Z) - Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems [92.89673285398521]
o1-like reasoning systems have demonstrated remarkable capabilities in solving complex reasoning tasks.
We introduce an imitate, explore, and self-improve'' framework to train the reasoning model.
Our approach achieves competitive performance compared to industry-level reasoning systems.
arXiv Detail & Related papers (2024-12-12T16:20:36Z) - Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding [1.3479499607624648]
Speculative decoding addresses bottleneck by introducing a two-stage framework: drafting and verification.
A smaller, efficient model generates a preliminary draft, which is then refined by a larger, more sophisticated model.
This paper provides a comprehensive survey of speculative decoding methods, categorizing them into draft-centric and model-centric approaches.
arXiv Detail & Related papers (2024-11-20T09:46:30Z) - Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought [61.588465852846646]
Chain-of-Thought (CoT) reasoning has emerged as a promising approach for enhancing the performance of large language models (LLMs)
In this work, we introduce a novel reasoning boundary framework (RBF) to address these challenges.
arXiv Detail & Related papers (2024-10-08T05:26:28Z) - ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs)
Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts.
We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z) - Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding [46.485363806259265]
Speculative Decoding has emerged as a novel decoding paradigm for Large Language Models (LLMs) inference.
In each decoding step, this method first drafts several future tokens efficiently and then verifies them in parallel.
This paper presents a comprehensive overview and analysis of this promising decoding paradigm.
arXiv Detail & Related papers (2024-01-15T17:26:50Z) - Complexity-Based Prompting for Multi-Step Reasoning [72.0057198610614]
We study the task of prompting large-scale language models to perform multi-step reasoning.
A central question is which reasoning examples make the most effective prompts.
We propose complexity-based prompting, a simple and effective example selection scheme for multi-step reasoning.
arXiv Detail & Related papers (2022-10-03T05:33:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.