Fugu-MT Paper Translation (Abstract): Budget-Aware Tool-Use Enables Effective Agent Scaling

Paper summary: Budget-Aware Tool-Use Enables Effective Agent Scaling

arxiv url: http://arxiv.org/abs/2511.17006v1
Date: Fri, 21 Nov 2025 07:18:55 GMT
Status: Translation complete
Last updated in system: 2025-11-24 18:08:18.919581
Title: Budget-Aware Tool-Use Enables Effective Agent Scaling
Title (reference translation): Budget-Aware Tool-Use Enables Effective Agent Scaling
Authors: Tengxiao Liu, Zifeng Wang, Jin Miao, I-Hung Hsu, Jun Yan, Jiefeng Chen, Rujun Han, Fangyuan Xu, Yanfei Chen, Ke Jiang, Samira Daruki, Yi Liang, William Yang Wang, Tomas Pfister, Chen-Yu Lee
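Abstract summary: Scaling test-time computation improves performance across tasks on large language models (LLMs). This work studies how to scale tool-augmented agents effectively under explicit tool-call budgets, focusing on web search agents. It introduces the Budget Tracker, a lightweight plug-in that provides the agent with continuous budget awareness.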
Reference score (independently computed attention score): 82.6942342482552
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Scaling test-time computation improves performance across different tasks on large language models (LLMs), which has also been extended to tool-augmented agents. For these agents, scaling involves not only "thinking" in tokens but also "acting" via tool calls. The number of tool calls directly bounds the agent's interaction with the external environment. However, we find that simply granting agents a larger tool-call budget fails to improve performance, as they lack "budget awareness" and quickly hit a performance ceiling. To address this, we study how to scale such agents effectively under explicit tool-call budgets, focusing on web search agents. We first introduce the Budget Tracker, a lightweight plug-in that provides the agent with continuous budget awareness, enabling simple yet effective scaling. We further develop BATS (Budget Aware Test-time Scaling), an advanced framework that leverages this awareness to dynamically adapt its planning and verification strategy, deciding whether to "dig deeper" on a promising lead or "pivot" to new paths based on remaining resources. To analyze cost-performance scaling in a controlled manner, we formalize a unified cost metric that jointly accounts for token and tool consumption. We provide the first systematic study on budget-constrained agents, showing that budget-aware methods produce more favorable scaling curves and push the cost-performance Pareto frontier. Our work offers empirical insights toward a more transparent and principled understanding of scaling in tool-augmented agents.
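Abstract (reference translation): Scaling test-time computation improves performance on a variety of tasks for large language models (LLMs), and this has also been extended to tool-augmented agents. For these agents, scaling involves not only "thinking" in tokens but also "acting" through tool calls. The number of tool calls directly bounds the agent's interaction with the external environment. However, simply giving agents a larger tool-call budget does not improve performance, because they lack "budget awareness" and quickly reach a performance ceiling. This work therefore studies how to scale such agents effectively under explicit tool-call budgets, focusing on web search agents. It first introduces the Budget Tracker, a lightweight plug-in that provides the agent with continuous budget awareness and enables simple yet effective scaling. It then develops BATS (Budget Aware Test-time Scaling), an advanced framework that uses this awareness to dynamically adapt its planning and verification strategy, deciding whether to "dig deeper" on a promising lead or "pivot" to new paths based on the remaining resources. To analyze cost-performance scaling in a controlled manner, the authors formalize a unified cost metric that jointly accounts for token and tool consumption. They present the first systematic study of budget-constrained agents, showing that budget-aware methods produce more favorable scaling curves and push the cost-performance Pareto frontier. The work offers empirical insights toward a more transparent and principled understanding of scaling in tool-augmented agents.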
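The abstract describes the Budget Tracker and the unified cost metric only at a high level, so the following is a minimal Python sketch of the idea and not the authors' implementation. The class name `BudgetTracker`, the wording of the status message, the function `unified_cost`, and the linear pricing of tokens and tool calls are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BudgetTracker:
    """Hypothetical tracker of tool-call usage (assumed design, not the paper's)."""
    tool_budget: int          # maximum number of tool calls allowed
    tool_calls_used: int = 0  # tool calls consumed so far
    tokens_used: int = 0      # tokens consumed so far

    @property
    def remaining(self) -> int:
        """Tool calls still available, never negative."""
        return max(self.tool_budget - self.tool_calls_used, 0)

    def record_tool_call(self) -> bool:
        """Consume one tool call; return False if the budget is exhausted."""
        if self.remaining == 0:
            return False
        self.tool_calls_used += 1
        return True

    def record_tokens(self, n: int) -> None:
        """Add n tokens to the running token count."""
        self.tokens_used += n

    def status_message(self) -> str:
        """Text that could be appended to the agent's context (assumed wording)."""
        return (f"Tool calls used: {self.tool_calls_used}/{self.tool_budget}; "
                f"remaining: {self.remaining}.")


def unified_cost(tracker: BudgetTracker,
                 price_per_token: float,
                 price_per_tool_call: float) -> float:
    """Assumed linear cost combining token and tool consumption."""
    return (tracker.tokens_used * price_per_token
            + tracker.tool_calls_used * price_per_tool_call)
```

In an agent loop under these assumptions, `record_tool_call()` would be called before each search or browse action, and `status_message()` would be added to the prompt so the model can see how much budget remains when choosing between digging deeper and pivoting.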

Related papers