A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law
- URL: http://arxiv.org/abs/2505.02665v2
- Date: Thu, 08 May 2025 05:27:18 GMT
- Title: A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law
- Authors: Qianjun Pan, Wenkai Ji, Yuyang Ding, Junsong Li, Shilian Chen, Junyi Wang, Jie Zhou, Qin Chen, Min Zhang, Yulan Wu, Liang He,
- Abstract summary: This survey explores recent advancements in reasoning large language models (LLMs) designed to mimic "slow thinking". These LLMs focus on scaling computational resources dynamically during complex tasks, such as math reasoning, visual reasoning, medical diagnosis, and multi-agent debates.
- Score: 29.763080554625216
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This survey explores recent advancements in reasoning large language models (LLMs) designed to mimic "slow thinking" - a reasoning process inspired by human cognition, as described in Kahneman's Thinking, Fast and Slow. These models, like OpenAI's o1, focus on scaling computational resources dynamically during complex tasks, such as math reasoning, visual reasoning, medical diagnosis, and multi-agent debates. We present the development of reasoning LLMs and list their key technologies. By synthesizing over 100 studies, the survey charts a path toward LLMs that combine human-like deep thinking with scalable efficiency for reasoning. The review breaks down methods into three categories: (1) test-time scaling, which dynamically adjusts computation based on task complexity via search, sampling, and dynamic verification; (2) reinforced learning, which refines decision-making through iterative improvement leveraging policy networks, reward models, and self-evolution strategies; and (3) slow-thinking frameworks (e.g., long CoT, hierarchical processes), which structure problem-solving into manageable steps. The survey highlights the challenges and future directions of this domain. Understanding and advancing the reasoning abilities of LLMs is crucial for unlocking their full potential in real-world applications, from scientific discovery to decision support systems.
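The first category, test-time scaling via search and sampling with verification, can be illustrated with a minimal best-of-N sketch. This is not the survey's method, only a toy illustration: `generate_candidates` and `verify` are hypothetical stand-ins for an LLM sampler and a reward model, with a numeric answer standing in for a full reasoning chain.

```python
import random

def generate_candidates(prompt, n, seed=0):
    """Stand-in for sampling n candidate reasoning chains from an LLM.
    Here each 'chain' is reduced to its numeric final answer."""
    rng = random.Random(seed)
    return [rng.randint(50, 60) for _ in range(n)]

def verify(answer, target=56):
    """Stand-in reward model: higher score means closer to the target."""
    return -abs(answer - target)

def best_of_n(prompt, n):
    """Test-time scaling in miniature: spend more compute (larger n) to
    sample more candidates, then keep the one the verifier scores highest."""
    candidates = generate_candidates(prompt, n)
    return max(candidates, key=verify)
```

Increasing `n` trades extra inference-time compute for a better chance that some sampled candidate scores well under the verifier, which is the basic intuition behind the inference-time scaling methods the survey catalogs.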
Related papers
- Chain of Methodologies: Scaling Test Time Computation without Training [77.85633949575046]
Large Language Models (LLMs) often struggle with complex reasoning tasks due to insufficient in-depth insights in their training data. This paper introduces the Chain of Methodologies (CoM) framework, which enhances structured thinking by integrating human methodological insights.
arXiv Detail & Related papers (2025-06-08T03:46:50Z)
- Computational Thinking Reasoning in Large Language Models [69.28428524878885]
The Computational Thinking Model (CTM) is a novel framework that incorporates computational thinking paradigms into large language models (LLMs). Live code execution is seamlessly integrated into the reasoning process, allowing CTM to think by computing. CTM outperforms conventional reasoning models and tool-augmented baselines in terms of accuracy, interpretability, and generalizability.
arXiv Detail & Related papers (2025-06-03T09:11:15Z)
- Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges [4.668749313973097]
This paper systematically evaluates Large Language Models (LLMs) and Large Reasoning Models (LRMs) across three levels of reasoning complexity. We curate 26 challenges in which models answer either directly or via a Python Code Interpreter. LRMs show robust performance across tasks of varying difficulty, often matching or surpassing traditional first-principle-based methods.
arXiv Detail & Related papers (2025-05-16T18:32:35Z)
- A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems [93.8285345915925]
Reasoning is a fundamental cognitive process that enables logical inference, problem-solving, and decision-making. With the rapid advancement of large language models (LLMs), reasoning has emerged as a key capability that distinguishes advanced AI systems. We categorize existing methods along two dimensions: (1) Regimes, which define the stage at which reasoning is achieved; and (2) Architectures, which determine the components involved in the reasoning process.
arXiv Detail & Related papers (2025-04-12T01:27:49Z)
- A Survey of Scaling in Large Language Model Reasoning [62.92861523305361]
We provide a comprehensive examination of scaling in large language model (LLM) reasoning. We analyze scaling in reasoning steps, which improves multi-step inference and logical consistency, and we discuss scaling in training-enabled reasoning, focusing on optimization through iterative model improvement.
arXiv Detail & Related papers (2025-04-02T23:51:27Z)
- Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models [31.556646366268286]
Large Language Models increasingly rely on prolonged reasoning chains to solve complex tasks. This trial-and-error approach often leads to high computational overhead and error propagation. We introduce Meta-Reasoner, a framework that dynamically optimizes inference-time reasoning.
arXiv Detail & Related papers (2025-02-27T09:40:13Z)
- From System 1 to System 2: A Survey of Reasoning Large Language Models [72.99519859756602]
Foundational Large Language Models excel at fast decision-making but lack depth for complex reasoning. OpenAI's o1/o3 and DeepSeek's R1 have demonstrated expert-level performance in fields such as mathematics and coding.
arXiv Detail & Related papers (2025-02-24T18:50:52Z)
- Advancing Reasoning in Large Language Models: Promising Methods and Approaches [0.0]
Large Language Models (LLMs) have achieved remarkable success in various natural language processing (NLP) tasks. However, their ability to perform complex reasoning (spanning logical deduction, mathematical problem-solving, commonsense inference, and multi-step reasoning) often falls short of human expectations. This survey provides a comprehensive review of emerging techniques for enhancing reasoning in LLMs.
arXiv Detail & Related papers (2025-02-05T23:31:39Z)
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models [33.13238566815798]
Large Language Models (LLMs) have sparked significant research interest in leveraging them to tackle complex reasoning tasks. Recent studies demonstrate that encouraging LLMs to "think" with more tokens during test-time inference can significantly boost reasoning accuracy. The introduction of OpenAI's o1 series marks a significant milestone in this research direction.
arXiv Detail & Related papers (2025-01-16T17:37:58Z)
- Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning [40.069109287947875]
We propose a novel reasoning framework called Forest-of-Thought (FoT). FoT integrates multiple reasoning trees to leverage collective decision-making for solving complex logical problems, and employs sparse activation strategies to select the most relevant reasoning paths, improving both efficiency and accuracy.
arXiv Detail & Related papers (2024-12-12T09:01:18Z)
- Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance and improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.