Limited Reasoning Space: The cage of long-horizon reasoning in LLMs
- URL: http://arxiv.org/abs/2602.19281v1
- Date: Sun, 22 Feb 2026 17:28:27 GMT
- Title: Limited Reasoning Space: The cage of long-horizon reasoning in LLMs
- Authors: Zhenyu Li, Guanlin Wu, Cheems Wang, Yongqiang Zhao
- Abstract summary: This work hypothesizes that reasoning failures with larger compute budgets stem from static planning methods. We propose Halo, a model predictive control framework for planning.
- Score: 13.848126962400878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Test-time compute strategies, such as Chain-of-Thought (CoT), have significantly enhanced the ability of large language models to solve complex tasks like logical reasoning. However, empirical studies indicate that simply increasing the compute budget can sometimes lead to a collapse in test-time performance when employing typical task decomposition strategies such as CoT. This work hypothesizes that reasoning failures with larger compute budgets stem from static planning methods, which can hardly perceive the intrinsic boundaries of LLM reasoning. We term this the Limited Reasoning Space hypothesis and analyze it theoretically through the lens of a non-autonomous stochastic dynamical system. This insight suggests that there is an optimal range for compute budgets; over-planning can lead to redundant feedback and may even impair reasoning capabilities. To exploit the compute-scaling benefits and suppress over-planning, this work proposes Halo, a model predictive control framework for LLM planning. Halo is designed for long-horizon tasks with reason-based planning and crafts an entropy-driven dual controller, which adopts a Measure-then-Plan strategy to achieve controllable reasoning. Experimental results demonstrate that Halo outperforms static baselines on complex long-horizon tasks by dynamically regulating planning at the reasoning boundary.
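The abstract names an entropy-driven, Measure-then-Plan controller but gives no implementation details. The Python sketch below is only an illustration of that general idea, not the authors' Halo algorithm: it measures the Shannon entropy of a distribution over candidate next steps and then chooses a planning action from it. The function names, the three action labels, and the thresholds `low` and `high` are all hypothetical.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def measure_then_plan(
    probs: Sequence[float], low: float = 0.3, high: float = 1.0
) -> str:
    """Hypothetical controller: measure uncertainty first, then pick an action.

    `low` and `high` are illustrative thresholds, not values from the paper.
    """
    h = shannon_entropy(probs)  # "measure" step
    if h < low:
        return "continue"  # confident: spend no extra planning compute
    if h > high:
        return "reset"     # very uncertain: discard the plan and re-plan
    return "refine"        # moderately uncertain: plan one more step


if __name__ == "__main__":
    for dist in ([1.0], [0.5, 0.5], [0.25, 0.25, 0.25, 0.25]):
        print(dist, "->", measure_then_plan(dist))
```

The point of the sketch is that planning effort depends on a measured signal rather than on a fixed budget, which is the contrast with static planning that the abstract draws.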
Related papers
- A State-Transition Framework for Efficient LLM Reasoning [58.18141262230392]
Long Chain-of-Thought (CoT) reasoning significantly improves Large Language Models (LLMs) performance on complex reasoning tasks. Existing studies usually enhance the reasoning efficiency of LLMs by compressing CoT sequences. We propose an efficient reasoning framework that models the reasoning process of LLMs as a state-transition process.
(2026-02-01T12:40:40Z)
- Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization [9.193078163792427]
Chain-of-Thought (CoT) empowers Large Language Models (LLMs) to tackle complex problems. Recent latent reasoning approaches attempt to optimize efficiency by performing reasoning within continuous hidden states. We introduce PLaT, a framework that reformulates latent reasoning as planning by fundamentally decoupling reasoning from verbalization.
(2026-01-29T07:38:18Z)
- PILOT: Planning via Internalized Latent Optimization Trajectories for Large Language Models [51.43746425777865]
Large Language Models (LLMs) often lack the capacity to formulate global strategies, leading to error propagation in long-horizon tasks. We propose PILOT, a framework designed to internalize the strategic oversight of large models into intrinsic Latent Guidance.
(2026-01-07T12:38:56Z)
- Geometrically-Constrained Agent for Spatial Reasoning [53.93718394870856]
Vision Language Models exhibit a fundamental semantic-to-geometric gap in spatial reasoning. Current paradigms fail to bridge this gap. We propose a training-free agentic paradigm that resolves this gap by introducing a formal task constraint.
(2025-11-27T17:50:37Z)
- Enhancing Long Chain-of-Thought Reasoning through Multi-Path Plan Aggregation [32.86351316550696]
We analyze raw long CoTs and uncover a reasoning hierarchy consisting of planning and execution steps. Motivated by this observation, we propose Multi-Path Plan Aggregation (MPPA), a framework that augments single-pass reasoning with plan exploration and aggregation. To overcome this, we introduce online Step-DPO, a process-level preference optimization scheme that leverages Twisted Sequential Monte Carlo (TSMC) to provide scalable stepwise supervision.
(2025-10-13T17:02:41Z)
- Constraints-of-Thought: A Framework for Constrained Reasoning in Language-Model-Guided Search [3.0130126601831235]
Constraints-of-Thought (Const-o-T) is a framework that enables Monte Carlo Tree Search (MCTS) to focus search on semantically meaningful paths. We demonstrate that Const-o-T offers a generalizable foundation for constraint-guided reasoning, enabling more efficient, constraint-aligned, and domain-adaptable planning.
(2025-10-10T04:21:18Z)
- Stop Spinning Wheels: Mitigating LLM Overthinking via Mining Patterns for Early Reasoning Exit [114.83867400179354]
Overthinking can degrade overall performance of large language models. We categorize reasoning into three stages: insufficient exploration stage, compensatory reasoning stage, and reasoning convergence stage. We develop a lightweight thresholding strategy based on rules to improve reasoning accuracy.
(2025-08-25T03:17:17Z)
- Reasoning on a Budget: A Survey of Adaptive and Controllable Test-Time Compute in LLMs [45.83245433138508]
Large language models (LLMs) have rapidly progressed into general-purpose agents capable of solving a broad spectrum of tasks. They apply fixed inference-time compute regardless of task complexity, often overthinking simple problems while underthinking hard ones. This survey presents a comprehensive review of efficient test-time compute strategies, which aim to improve the computational efficiency of LLM reasoning.
(2025-07-02T18:27:42Z)
- Computational Thinking Reasoning in Large Language Models [69.28428524878885]
Computational Thinking Model (CTM) is a novel framework that incorporates computational thinking paradigms into large language models (LLMs). Live code execution is seamlessly integrated into the reasoning process, allowing CTM to think by computing. CTM outperforms conventional reasoning models and tool-augmented baselines in terms of accuracy, interpretability, and generalizability.
(2025-06-03T09:11:15Z)
- Scalable Chain of Thoughts via Elastic Reasoning [61.75753924952059]
Elastic Reasoning is a novel framework for scalable chain of thoughts. It separates reasoning into two phases--thinking and solution--with independently allocated budgets. Our approach produces more concise and efficient reasoning even in unconstrained settings.
(2025-05-08T15:01:06Z)
- Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up [9.42385235462794]
Large language models (LLMs) have shown remarkable performance in reasoning tasks but face limitations in mathematical and complex logical reasoning. We propose Reversal of Thought (RoT) to enhance the logical reasoning abilities of LLMs during the warm-up phase prior to batch inference. RoT utilizes a Preference-Guided Reverse Reasoning warm-up strategy, which integrates logical symbols for pseudocode planning.
(2024-10-16T07:44:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.