Scalable Chain of Thoughts via Elastic Reasoning
- URL: http://arxiv.org/abs/2505.05315v1
- Date: Thu, 08 May 2025 15:01:06 GMT
- Title: Scalable Chain of Thoughts via Elastic Reasoning
- Authors: Yuhui Xu, Hanze Dong, Lei Wang, Doyen Sahoo, Junnan Li, Caiming Xiong,
- Abstract summary: Elastic Reasoning is a novel framework for scalable chain of thoughts.<n>It separates reasoning into two phases--thinking and solution--with independently allocated budgets.<n>It produces more concise and efficient reasoning even in unconstrained settings.
- Score: 61.75753924952059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large reasoning models (LRMs) have achieved remarkable progress on complex tasks by generating extended chains of thought (CoT). However, their uncontrolled output lengths pose significant challenges for real-world deployment, where inference-time budgets on tokens, latency, or compute are strictly constrained. We propose Elastic Reasoning, a novel framework for scalable chain of thoughts that explicitly separates reasoning into two phases--thinking and solution--with independently allocated budgets. At test time, Elastic Reasoning prioritize that completeness of solution segments, significantly improving reliability under tight resource constraints. To train models that are robust to truncated thinking, we introduce a lightweight budget-constrained rollout strategy, integrated into GRPO, which teaches the model to reason adaptively when the thinking process is cut short and generalizes effectively to unseen budget constraints without additional training. Empirical results on mathematical (AIME, MATH500) and programming (LiveCodeBench, Codeforces) benchmarks demonstrate that Elastic Reasoning performs robustly under strict budget constraints, while incurring significantly lower training cost than baseline methods. Remarkably, our approach also produces more concise and efficient reasoning even in unconstrained settings. Elastic Reasoning offers a principled and practical solution to the pressing challenge of controllable reasoning at scale.
Related papers
- Steering LLM Thinking with Budget Guidance [48.65894557568655]
Budget guidance is a method for steering the reasoning process of LLMs toward a target budget without requiring any fine-tuning.<n>Our approach introduces a lightweight predictor that models a Gamma distribution over the remaining thinking length.<n>This signal is then used to guide generation in a soft, token-level manner, ensuring that the overall reasoning trace adheres to the specified thinking budget.
arXiv Detail & Related papers (2025-06-16T17:57:05Z) - Collaborative LLM Inference via Planning for Efficient Reasoning [50.04696654679751]
We propose a test-time collaboration framework in which a planner model first generates a plan, defined as a distilled and high-level abstraction of the problem.<n>Small and large models take turns acting as planner and reasoner, exchanging plans in a multi-round cascade to collaboratively solve complex tasks.<n>Our method achieves accuracy comparable to strong proprietary models alone, while significantly reducing reliance on paid inference.
arXiv Detail & Related papers (2025-06-13T08:35:50Z) - PixelThink: Towards Efficient Chain-of-Pixel Reasoning [70.32510083790069]
PixelThink is a simple yet effective scheme that integrates externally estimated task difficulty and internally measured model uncertainty.<n>It learns to compress reasoning length in accordance with scene complexity and predictive confidence.<n> Experimental results demonstrate that the proposed approach improves both reasoning efficiency and overall segmentation performance.
arXiv Detail & Related papers (2025-05-29T17:55:49Z) - Reinforced Latent Reasoning for LLM-based Recommendation [83.18146814163308]
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities in complex problem-solving tasks.<n>Existing methods typically rely on fine-tuning with explicit chain-of-thought (CoT) data.<n>In this work, we explore an alternative approach that shifts from explicit CoT reasoning to compact, information-dense latent reasoning.
arXiv Detail & Related papers (2025-05-25T11:03:45Z) - Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning [19.258292534503887]
Plan-and-Budget is a model-agnostic, test-time framework that decomposes complex queries into sub-questions and allocates token budgets based on estimated complexity using adaptive scheduling.<n>Plan-and-Budget improves reasoning efficiency across a range of tasks and models, achieving up to +70% accuracy gains, tangential -39% token reduction, and +187.5% improvement in $E3$.
arXiv Detail & Related papers (2025-05-22T01:56:29Z) - Let LLMs Break Free from Overthinking via Self-Braking Tuning [60.08396797526657]
Large reasoning models (LRMs) have significantly enhanced their reasoning capabilities by generating longer chains of thought.<n>This performance gain comes at the cost of a substantial increase in redundant reasoning during the generation process.<n>We propose a novel framework, Self-Braking Tuning (SBT), which tackles overthinking from the perspective of allowing the model to regulate its own reasoning process.
arXiv Detail & Related papers (2025-05-20T16:53:40Z) - Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models [32.49420948390984]
Large reasoning models (LRMs) often suffer from an overthinking'' problem, where the model generates excessively redundant reasoning steps with limited performance gains.<n>We propose a simple yet efficient pipeline, Method, to enable LRMs to bypass unnecessary intermediate steps, thereby significantly reducing computational costs.
arXiv Detail & Related papers (2025-04-18T11:07:19Z) - Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilities [101.77467538102924]
Recent advancements in Large Reasoning Models (LRMs) have demonstrated remarkable performance in specialized reasoning tasks.<n>We show that acquiring deliberative reasoning capabilities significantly reduces the foundational capabilities of LRMs.<n>We demonstrate that adaptive reasoning -- employing modes like Zero-Thinking, Less-Thinking, and Summary-Thinking -- can effectively alleviate these drawbacks.
arXiv Detail & Related papers (2025-03-23T08:18:51Z) - Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models [54.04678363287392]
Large Language Models (LLMs) have demonstrated remarkable capabilities in complex tasks.<n>Recent advancements in OpenAI o1 and DeepSeek-R1 have further improved performance in System-2 reasoning domains.
arXiv Detail & Related papers (2025-03-20T17:59:38Z) - DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models [30.184895117009457]
This paper introduces Difficulty-Adaptive Slow Thinking (DAST), a novel framework that enables models to autonomously adjust the length of Chain-of-Thought (CoT) based on problem difficulty.<n>Experiments on diverse datasets and model scales demonstrate that DAST effectively mitigates overthinking while preserving reasoning accuracy on complex problems.
arXiv Detail & Related papers (2025-03-06T14:23:06Z) - Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning [40.069109287947875]
We propose a novel reasoning framework called Forest-of-Thought (FoT)<n>FoT integrates multiple reasoning trees to leverage collective decision-making for solving complex logical problems.<n>FoT employs sparse activation strategies to select the most relevant reasoning paths, improving both efficiency and accuracy.
arXiv Detail & Related papers (2024-12-12T09:01:18Z) - Synergy-of-Thoughts: Eliciting Efficient Reasoning in Hybrid Language Models [19.466985579720507]
Large language models (LLMs) have shown impressive emergent abilities in a wide range of tasks, but the associated expensive API cost greatly limits the real application.
We propose "Synergy of Thoughts"(SoT) to unleash the synergistic potential of hybrid LLMs with different scales for efficient reasoning.
We show that SoT substantially reduces the API cost by 38.3%-75.1%, and simultaneously achieves state-of-the-art reasoning accuracy and solution diversity.
arXiv Detail & Related papers (2024-02-04T16:45:01Z) - Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with $textitexternal uncertainty$ in reinforcement learning.
arXiv Detail & Related papers (2020-03-05T19:56:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.