Fugu-MT Paper Translation (Summary): To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

Paper summary: To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

arxiv url: http://arxiv.org/abs/2605.00737v1
Date: Fri, 01 May 2026 15:38:13 GMT
Status: Translation complete
Last updated in system: 2026-05-04 17:43:28.99969
Title: To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling
Title (reference translation): Whether to Call or Not: A Framework for Evaluating and Optimizing LLM Tool Calling
Authors: Qinyuan Wu, Soumi Das, Mahsa Amani, Arijit Nag, Seungeon Lee, Krishna P. Gummadi, Abhilasha Ravichander, Muhammad Bilal Zafar
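Abstract summary: This paper proposes a principled framework, inspired by decision-making theory, for evaluating web search tool-use decisions. Models' perceived need for and utility of tool calls are often misaligned with their true need and utility. The proposed estimators enable simple controllers that improve decision quality and task performance.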
Reference score (independently computed attention metric): 13.42769424615184
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Agentic AI architectures augment LLMs with external tools, unlocking strong capabilities. However, tool use is not always beneficial; some calls may be redundant or even harmful. Effective tool use, therefore, hinges on a core LLM decision: whether or not to call a tool when performing a task. This decision is particularly challenging for web search tools, where the benefits of external information depend on the model's internal knowledge and its ability to integrate potentially noisy tool responses. We introduce a principled framework inspired by decision-making theory to evaluate web search tool-use decisions along three key factors: necessity, utility, and affordability. Our analysis combines two complementary lenses: a normative perspective that infers true need and utility from an optimal allocation of tool calls, and a descriptive perspective that infers the model's self-perceived need and utility from its observed behavior. We find that models' perceived need for and utility of tool calls are often misaligned with their true need and utility. Building on this framework, we train lightweight estimators of need and utility based on models' hidden states. Our estimators enable simple controllers that can improve decision quality and lead to stronger task performance than the self-perceived setup across three tasks and six models.
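Abstract (reference translation): Agentic AI architectures augment LLMs with external tools, unlocking strong capabilities. However, tool use is not always beneficial. Effective tool use therefore depends on a core LLM decision: whether or not to call a tool when performing a task. This decision is particularly difficult for web search tools, where the benefit of external information depends on the model's internal knowledge and its ability to integrate potentially noisy tool responses. This paper proposes a principled framework, inspired by decision-making theory, for web search tool-use decisions. It combines two complementary lenses: a normative perspective that infers true need and utility from an optimal allocation of tool calls, and a descriptive perspective that infers the model's self-perceived need and utility from its observed behavior. Models' perceived need for and utility of tool calls are often misaligned with their true need and utility. Building on this framework, the authors train lightweight estimators of need and utility based on the models' hidden states. The estimators enable simple controllers that improve decision quality and task performance over the self-perceived setup across three tasks and six models.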
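The abstract does not specify the estimator architecture or the controller rule, so the following is only a minimal sketch of the general idea, under stated assumptions: hidden states are stood in for by small synthetic vectors, the estimators are plain logistic-regression probes, and the controller calls the tool when the estimated need multiplied by the estimated utility exceeds a fixed call cost (a stand-in for affordability). All names (`train_probe`, `should_call_tool`, `cost`) and the labeling rule for the synthetic data are hypothetical and not taken from the paper.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_probe(features, labels, lr=0.5, epochs=300):
    """Fit a logistic-regression probe with batch gradient descent."""
    dim = len(features[0])
    w, b, n = [0.0] * dim, 0.0, len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(probe, x):
    """Probability assigned by a trained probe to the positive class."""
    w, b = probe
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def should_call_tool(need_probe, utility_probe, hidden_state, cost=0.25):
    """Assumed rule: call when P(need) * P(tool helps) exceeds the call cost."""
    gain = predict(need_probe, hidden_state) * predict(utility_probe, hidden_state)
    return gain > cost


# Synthetic stand-in for hidden states (assumption: 4-dimensional vectors).
rng = random.Random(0)
states = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
# Hypothetical ground truth: dimension 0 encodes need, dimension 1 encodes utility.
need_labels = [1 if s[0] > 0 else 0 for s in states]
utility_labels = [1 if s[1] > 0 else 0 for s in states]

need_probe = train_probe(states, need_labels)
utility_probe = train_probe(states, utility_labels)

call_when_needed = should_call_tool(need_probe, utility_probe, [3.0, 3.0, 0.0, 0.0])
call_when_not_needed = should_call_tool(need_probe, utility_probe, [-3.0, -3.0, 0.0, 0.0])
```

In this toy setup, raising `cost` makes the controller more conservative about calling the tool, which is one simple way to express an affordability constraint; the paper's actual controllers may differ.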


Related papers