Fugu-MT 論文翻訳(概要): ToolTweak: An Attack on Tool Selection in LLM-based Agents

論文の概要: ToolTweak: An Attack on Tool Selection in LLM-based Agents

arxiv url: http://arxiv.org/abs/2510.02554v1
Date: Thu, 02 Oct 2025 20:44:44 GMT
ステータス: 翻訳完了
システム内更新日: 2025-10-06 16:35:52.169649
Title: ToolTweak: An Attack on Tool Selection in LLM-based Agents
Title（参考訳）: ToolTweak: LLMエージェントにおけるツール選択攻撃
Authors: Jonathan Sneh, Ruomei Yan, Jialin Yu, Philip Torr, Yarin Gal, Sunando Sengupta, Eric Sommerlade, Alasdair Paren, Adel Bibi,
Abstract要約: 対戦相手は,特定のツールの選択に対して,エージェントを体系的にバイアスし,等しく有能な代替手段に対して不公平な優位性を得ることができることを示す。提案するToolTweakは,ベースラインの20%程度から最大81%までの選択率を向上する,軽量自動攻撃である。これらのリスクを軽減するために、パラフレージングとパープレキシティ・フィルタリングという2つの防御効果を評価し、バイアスを低減し、エージェントが機能的に類似したツールをより平等に選択できるようにする。
参考スコア（独自算出の注目度）: 52.17181489286236
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As LLMs increasingly power agents that interact with external tools, tool use has become an essential mechanism for extending their capabilities. These agents typically select tools from growing databases or marketplaces to solve user tasks, creating implicit competition among tool providers and developers for visibility and usage. In this paper, we show that this selection process harbors a critical vulnerability: by iteratively manipulating tool names and descriptions, adversaries can systematically bias agents toward selecting specific tools, gaining unfair advantage over equally capable alternatives. We present ToolTweak, a lightweight automatic attack that increases selection rates from a baseline of around 20% to as high as 81%, with strong transferability between open-source and closed-source models. Beyond individual tools, we show that such attacks cause distributional shifts in tool usage, revealing risks to fairness, competition, and security in emerging tool ecosystems. To mitigate these risks, we evaluate two defenses: paraphrasing and perplexity filtering, which reduce bias and lead agents to select functionally similar tools more equally. All code will be open-sourced upon acceptance.
Abstract（参考訳）: LLMが外部ツールと対話するパワーエージェントが増えるにつれて、ツールの使用は、その機能を拡張する上で不可欠なメカニズムになっている。これらのエージェントは一般的に、ユーザタスクを解決するために、データベースやマーケットプレースからツールを選択する。本稿では,ツール名や記述を反復的に操作することで,特定のツールの選択に対してエージェントを体系的にバイアスし,同等に有能な代替手段に対して不公平な優位性を得ることができることを示す。 ToolTweakは,オープンソースモデルとクローズドソースモデル間の強力な転送性を備えた,ベースラインの20%前後から最大81%までの選択率を向上する,軽量自動攻撃である。個々のツール以外にも、このような攻撃はツール利用の分散シフトを引き起こし、新興ツールエコシステムにおける公正性、競争、セキュリティに対するリスクを明らかにします。これらのリスクを軽減するために、パラフレージングとパープレキシティ・フィルタリングという2つの防御効果を評価し、バイアスを低減し、エージェントが機能的に類似したツールをより平等に選択できるようにする。すべてのコードは、受け入れ次第オープンソースになる。

論文の概要: ToolTweak: An Attack on Tool Selection in LLM-based Agents

関連論文リスト