More with Less: An Empirical Study of Turn-Control Strategies for Efficient Coding Agents
- URL: http://arxiv.org/abs/2510.16786v1
- Date: Sun, 19 Oct 2025 10:32:18 GMT
- Title: More with Less: An Empirical Study of Turn-Control Strategies for Efficient Coding Agents
- Authors: Pengfei Gao, Chao Peng,
- Abstract summary: Coding agents operate in iterative loops (turns) to solve software engineering tasks. They are becoming increasingly powerful, but their practical deployment is hindered by significant and unpredictable costs. We show that a fixed-turn limit, specifically at the 75th percentile of the baseline, serves as a "sweet spot". We then show that a dynamic-turn strategy consistently outperforms fixed-limit approaches, achieving comparable or better solve rates while reducing costs by a further 12%-24% by intelligently allocating resources only to tasks that need them.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: LLM-powered coding agents, which operate in iterative loops (turns) to solve software engineering tasks, are becoming increasingly powerful. However, their practical deployment is hindered by significant and unpredictable costs. This challenge arises from a combination of factors: quadratically growing token counts with each turn, the high price of models, the large number of turns required for real-world tasks, and the tendency of agents to take inefficient or unnecessary actions. While existing research focuses on optimizing individual turns, the strategic control of the total number of turns remains an underexplored area for managing agent performance and cost. To address this gap, we conduct a comprehensive empirical study on SWE-bench using three state-of-the-art models and evaluate the impact of three distinct turn-control strategies: an unrestricted baseline, a fixed-turn limit with reminders, and a novel dynamic-turn strategy that grants extensions on-demand. Our findings first reveal a fundamental trade-off in the unrestricted setting, where no single model excels across performance, cost, and turn efficiency. We then show that a fixed-turn limit, specifically at the 75th percentile of the baseline, serves as a "sweet spot", substantially reducing costs (by 24%-68%) with minimal impact on solve rates. Most significantly, the dynamic-turn strategy consistently outperforms fixed-limit approaches, achieving comparable or better solve rates while further reducing costs by an additional 12%-24% by intelligently allocating resources only to tasks that need them. This work provides the first systematic analysis of turn-control strategies, offering simple yet effective guidelines for developers to balance cost and efficacy. We demonstrate that dynamic resource allocation is a superior, easy-to-implement approach for deploying powerful yet economically viable coding agents.
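The dynamic-turn strategy described in the abstract can be sketched as a simple budget controller: the agent starts with a small turn budget and is granted a fixed-size extension only when it explicitly asks for one at exhaustion, up to a hard cap. This is an illustrative reconstruction; the function names, the `step` callable's interface, and the extension policy parameters are assumptions, not the paper's actual implementation.

```python
def run_agent(task, step, initial_budget=10, extension=5, hard_cap=30):
    """Run an agent loop under a dynamic turn budget (illustrative sketch).

    `step(task, turn)` is a caller-supplied callable that performs one agent
    turn and returns a dict like {"done": bool, "wants_more_turns": bool}.
    """
    budget = initial_budget
    turn = 0
    while turn < budget:
        result = step(task, turn)
        turn += 1
        if result["done"]:
            return {"solved": True, "turns": turn}
        # Grant an extension only when the agent explicitly requests one
        # at the moment the budget runs out, and never beyond the hard cap.
        if turn == budget and result.get("wants_more_turns") and budget < hard_cap:
            budget = min(budget + extension, hard_cap)
    return {"solved": False, "turns": turn}
```

Tasks that finish early never consume the extension, which is what yields the cost savings over a single fixed limit sized for the hardest tasks.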
Related papers
- ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference [60.958331943869126]
ODAR-Expert is an adaptive routing framework that optimizes the accuracy-efficiency trade-off via principled resource allocation. We show strong and consistent gains, including 98.2% accuracy on MATH and 54.8% on Humanity's Last Exam.
arXiv Detail & Related papers (2026-02-27T05:22:01Z) - DEPO: Dual-Efficiency Preference Optimization for LLM Agents [75.6723341304463]
We propose DEPO, a dual-efficiency preference optimization method that jointly rewards succinct responses and fewer action steps. Experiments on WebShop and BabyAI show that DEPO cuts token usage by up to 60.9% and steps by up to 26.9%, while achieving up to a 29.3% improvement in performance.
arXiv Detail & Related papers (2025-11-19T12:38:43Z) - Dynamic Speculative Agent Planning [57.630218933994534]
Large language-model-based agents face critical deployment challenges due to prohibitive latency and inference costs. We introduce Dynamic Speculative Planning (DSP), an online reinforcement learning framework that provides lossless acceleration with substantially reduced costs. Experiments on two standard agent benchmarks demonstrate that DSP achieves comparable efficiency to the fastest acceleration method while reducing total cost by 30% and unnecessary cost by up to 60%.
arXiv Detail & Related papers (2025-09-02T03:34:36Z) - Efficient Agents: Building Effective Agents While Reducing Cost [48.65558640786415]
Large Language Model (LLM)-driven agents have enabled sophisticated systems to tackle complex, multi-step tasks. This work presents the first systematic study of the efficiency-effectiveness trade-off in modern agent systems.
arXiv Detail & Related papers (2025-07-24T17:56:51Z) - How Far Are We from Optimal Reasoning Efficiency? [23.593914897406943]
Large Reasoning Models (LRMs) demonstrate remarkable problem-solving capabilities through extended Chain-of-Thought (CoT) reasoning. However, LRMs often produce excessively verbose and redundant reasoning traces. Existing fine-tuning methods aim to improve reasoning efficiency, but assessing their efficiency gains remains challenging.
arXiv Detail & Related papers (2025-06-08T12:18:50Z) - Speculative Reward Model Boosts Decision Making Ability of LLMs Cost-Effectively [13.40488551654639]
We introduce the 3E Criteria to assess the cost-effectiveness of search strategies. We propose the Speculative Reward Model (SRM), a plug-and-play framework that integrates seamlessly with existing search strategies. Experimental results show that SRM reduces costs to 1/10 of the original search framework on average while maintaining effectiveness.
arXiv Detail & Related papers (2025-05-31T05:32:12Z) - COSMOS: Predictable and Cost-Effective Adaptation of LLMs [21.91455944905485]
Large language models (LLMs) achieve remarkable performance across numerous tasks by using a diverse array of adaptation strategies. We introduce COSMOS, a unified prediction framework that efficiently estimates adaptation outcomes at minimal cost.
arXiv Detail & Related papers (2025-04-30T02:06:26Z) - Self-Regulation and Requesting Interventions [63.5863047447313]
We propose an offline framework that trains a "helper" policy to request interventions. We score optimal intervention timing with PRMs and train the helper model on these labeled trajectories. This offline approach significantly reduces costly intervention calls during training.
arXiv Detail & Related papers (2025-02-07T00:06:17Z) - Bridging the Reasoning Gap: Small LLMs Can Plan with Generalised Strategies [0.9093413254392775]
We propose two approaches to enhance the reasoning ability of less resource-intensive models. One is to provide them with a generalised strategy for solving tasks within a given domain, generated by a more resource-intensive model. The other is to exploit their cost-effectiveness by iteratively prompting these models to correct errors in their proposed solutions.
arXiv Detail & Related papers (2025-01-31T00:28:29Z) - Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection [80.63946798650653]
The decision centers on whether to use a large LLM with better performance or a smaller one with reduced costs.
We propose a simpler solution; we use only the uncertainty of the generations of the small LLM as the decision criterion.
Our experiments reveal this simple solution optimally balances cost and performance, outperforming existing methods on 25 out of 27 experimental setups.
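The two-tier selection idea above can be sketched as a small routing function: answer with the small LLM, and fall back to the large one only when the small model's confidence is below a threshold. This is a minimal illustration; `small_model`, `large_model`, the threshold value, and the confidence score's interface are hypothetical stand-ins, not the paper's actual setup.

```python
def route(prompt, small_model, large_model, threshold=0.8):
    """Two-tier selection: use the small model's own uncertainty to decide
    whether its answer is trusted or the query escalates to the large model.

    Each model is a callable returning (answer_text, confidence in [0, 1]).
    """
    answer, confidence = small_model(prompt)
    if confidence >= threshold:
        return answer, "small"  # confident enough: avoid the expensive call
    answer, _ = large_model(prompt)
    return answer, "large"
```

The appeal of this scheme is that it needs no auxiliary router model: the only signal is a quantity the small model already produces during generation.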
arXiv Detail & Related papers (2024-05-03T14:38:59Z) - Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning [42.303733194571905]
We seek to find and automate an optimal credit card limit adjustment policy by employing reinforcement learning techniques.
Our research establishes a conceptual structure for applying a reinforcement learning framework to credit limit adjustment.
arXiv Detail & Related papers (2023-06-27T16:10:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.