Adaptive Overclocking: Dynamic Control of Thinking Path Length via Real-Time Reasoning Signals
- URL: http://arxiv.org/abs/2509.17000v1
- Date: Sun, 21 Sep 2025 09:40:27 GMT
- Title: Adaptive Overclocking: Dynamic Control of Thinking Path Length via Real-Time Reasoning Signals
- Authors: Shuhao Jiang, Songbo Wang, Yang Qiao, Chun Xu, Chaoyang Zheng, Shengyi Zhou, Huanjun Wang, Fangming Li, Cong Zhang, Jiyu Wang
- Abstract summary: We propose Adaptive Overclocking, a method that makes the hyperparameter $\alpha$ dynamic and context-aware. Our method adjusts reasoning speed in real time through two complementary signals. Experiments on GSM8K, MATH, and SVAMP show that HAC achieves superior accuracy-latency trade-offs.
- Score: 8.264189366042675
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Reasoning Models (LRMs) often suffer from computational inefficiency due to overthinking, where a fixed reasoning budget fails to match the varying complexity of tasks. To address this issue, we propose Adaptive Overclocking, a method that makes the overclocking hyperparameter $\alpha$ dynamic and context-aware. Our method adjusts reasoning speed in real time through two complementary signals: (1) token-level model uncertainty for fine-grained step-wise control, and (2) input complexity estimation for informed initialization. We implement this approach with three strategies: Uncertainty-Aware Alpha Scheduling (UA-$\alpha$S), Complexity-Guided Alpha Initialization (CG-$\alpha$I), and a Hybrid Adaptive Control (HAC) that combines both. Experiments on GSM8K, MATH, and SVAMP show that HAC achieves superior accuracy-latency trade-offs, reducing unnecessary computation on simple problems while allocating more resources to challenging ones. By mitigating overthinking, Adaptive Overclocking enhances both efficiency and overall reasoning performance.
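The paper's first signal, token-level uncertainty driving a dynamic $\alpha$, can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the entropy normalization, the $\alpha$ bounds, and all function names are assumptions.

```python
import math

def token_entropy(probs):
    """Shannon entropy (natural log) of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def schedule_alpha(probs, alpha_min=0.5, alpha_max=2.0, h_ref=2.0):
    """Uncertainty-aware alpha scheduling (sketch): map token-level
    uncertainty to an overclocking factor. A confident (low-entropy) step
    gets a large alpha (speed up); an uncertain (high-entropy) step gets
    a small alpha (slow down). h_ref and the bounds are illustrative."""
    h = token_entropy(probs)
    t = min(h / h_ref, 1.0)  # normalized uncertainty in [0, 1]
    return alpha_max - t * (alpha_max - alpha_min)
```

In this sketch, a complexity-guided initialization (CG-$\alpha$I) would simply pick the starting alpha from an input-difficulty estimate before the per-token schedule takes over.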
Related papers
- Quasi-Periodic Gaussian Process Predictive Iterative Learning Control [7.0206339525686055]
Iterative learning control (ILC) improves performance by using information from previous iterations to compensate for expected errors in future iterations. This work incorporates Quasi-Periodic Gaussian Processes (QPGPs) into a predictive ILC framework. We benchmark the method against both standard ILC and conventional GP-based predictive ILC on three tasks.
arXiv Detail & Related papers (2026-02-20T06:10:10Z) - Logit-Entropy Adaptive Stopping Heuristic for Efficient Chain-of-Thought Reasoning [0.0]
Chain-of-Thought (CoT) prompting is a key technique for enabling complex reasoning in large language models. We introduce LEASH: Logit-Entropy Adaptive Stopping Heuristic, a training-free decoding algorithm that adaptively halts rationale generation.
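A training-free stopping rule in the spirit of LEASH can be sketched as a moving-average entropy check. The window size and threshold below are illustrative assumptions, not the paper's values.

```python
def should_stop(entropies, window=8, threshold=0.3):
    """Halt rationale generation once the mean token entropy over the
    last `window` steps falls below `threshold`, signalling the model
    has largely converged on an answer (sketch; parameters assumed)."""
    if len(entropies) < window:
        return False
    recent = entropies[-window:]
    return sum(recent) / window < threshold
```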
arXiv Detail & Related papers (2025-11-06T18:43:16Z) - DART: Difficulty-Adaptive Reasoning Truncation for Efficient Large Language Models [36.962276192354174]
DART adjusts thinking length according to problem difficulty. Its truncation framework learns when to stop thinking.
arXiv Detail & Related papers (2025-11-03T02:41:20Z) - e1: Learning Adaptive Control of Reasoning Effort [88.51897900019485]
Increasing the thinking budget of AI models can significantly improve accuracy, but not all questions warrant the same amount of reasoning. Users may prefer to allocate different amounts of reasoning effort depending on how they value output quality versus latency and cost. We propose Adaptive Effort Control, a self-adaptive reinforcement learning method that trains models to use a user-specified fraction of tokens.
arXiv Detail & Related papers (2025-10-30T23:12:21Z) - A1: Asynchronous Test-Time Scaling via Conformal Prediction [112.54016379556073]
Large language models (LLMs) benefit from test-time scaling, but existing methods face significant challenges. We introduce A1 (Asynchronous Test-Time Scaling), a statistically guaranteed adaptive inference framework that addresses these challenges. A1 achieves a 56.7x speedup in test-time scaling and a 4.14x improvement in throughput.
arXiv Detail & Related papers (2025-09-18T16:55:09Z) - Controlling Thinking Speed in Reasoning Models [41.72496532709135]
Human cognition operates in two modes: fast, intuitive System 1 thinking and slow, deliberate System 2 thinking. In this work, we enable LRMs to approximate human intelligence through dynamic thinking speed adjustment. Our approach addresses two key questions: (1) how to control thinking speed in LRMs, and (2) when to adjust it for optimal performance.
arXiv Detail & Related papers (2025-07-04T16:41:06Z) - Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute [57.16286134405821]
We propose Fractional Reasoning, a framework that enables continuous control over reasoning intensity at inference time. Our method operates by extracting the latent steering vector associated with deeper reasoning and reapplying it with a tunable scaling factor. Experiments on GSM8K, MATH500, and GPQA demonstrate that Fractional Reasoning consistently improves performance across diverse reasoning tasks and models.
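The steering-vector mechanics described above can be sketched with a common mean-difference recipe. This is a simplified illustration under assumed interfaces, not the paper's extraction procedure.

```python
import numpy as np

def extract_steering_vector(deep_states, shallow_states):
    """Estimate a latent 'deeper reasoning' direction as the mean
    difference between hidden states gathered from deliberate vs.
    shallow reasoning runs (sketch; details assumed)."""
    return np.mean(deep_states, axis=0) - np.mean(shallow_states, axis=0)

def apply_steering(hidden, steer_vec, scale):
    """Reapply the vector with a tunable scaling factor: scale=0 leaves
    the hidden state unchanged; larger values push toward more
    deliberate reasoning."""
    return hidden + scale * steer_vec
```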
arXiv Detail & Related papers (2025-06-18T21:15:59Z) - Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition [95.54406667705999]
Pangu Embedded is an efficient Large Language Model (LLM) reasoner developed on Ascend Neural Processing Units (NPUs). It addresses the significant computational costs and inference latency challenges prevalent in existing reasoning-optimized LLMs. It delivers rapid responses and state-of-the-art reasoning quality within a single, unified model architecture.
arXiv Detail & Related papers (2025-05-28T14:03:02Z) - ARM: Adaptive Reasoning Model [36.53965139929349]
We propose Adaptive Reasoning Model (ARM), a reasoning model capable of adaptively selecting appropriate formats based on the task at hand. Ada-GRPO enables ARM to achieve high token efficiency, reducing tokens by an average of 30%, and up to 70%, while maintaining performance comparable to the model that relies solely on Long CoT.
arXiv Detail & Related papers (2025-05-26T17:38:50Z) - PATS: Process-Level Adaptive Thinking Mode Switching [53.53401063490537]
Current large language models (LLMs) typically adopt a fixed reasoning strategy, either simple or complex, for all questions, regardless of their difficulty. This neglect of variation in task and reasoning process complexity leads to an imbalance between performance and efficiency. Existing methods attempt to implement training-free fast-slow thinking system switching to handle problems of varying difficulty, but are limited by coarse-grained solution-level strategy adjustments. We propose a novel reasoning paradigm: Process-Level Adaptive Thinking Mode Switching (PATS), which enables LLMs to dynamically adjust their reasoning strategy based on the difficulty of each step, optimizing the balance between performance and efficiency.
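The per-step switching idea can be sketched as a simple mode assignment over estimated step difficulties. The threshold, mode names, and interface below are illustrative assumptions, not PATS's actual mechanism.

```python
def plan_modes(difficulties, threshold=0.5):
    """Process-level adaptive switching (sketch): assign a fast or slow
    thinking mode to each reasoning step from an estimated per-step
    difficulty in [0, 1]. Threshold and labels are assumed."""
    return ["slow" if d >= threshold else "fast" for d in difficulties]
```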
arXiv Detail & Related papers (2025-05-25T17:58:50Z) - Non-Asymptotic Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes [4.169915659794567]
This work presents the first finite-time analysis for the last-iterate convergence of average-reward Q-learning with an asynchronous implementation. A key feature of the algorithm we study is the use of adaptive stepsizes, which serve as local clocks for each state-action pair.
arXiv Detail & Related papers (2025-04-25T23:41:14Z) - Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDPs.
DR-Off-PAC adopts a single-timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $\epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.