Fugu-MT Paper Translation (Summary): BiasBusters: Uncovering and Mitigating Tool Selection Bias in Large Language Models

Paper summary: BiasBusters: Uncovering and Mitigating Tool Selection Bias in Large Language Models

arxiv url: http://arxiv.org/abs/2510.00307v1
Date: Tue, 30 Sep 2025 22:02:13 GMT
Status: Translation complete
Last updated in system: 2025-10-03 16:59:20.268895
Title: BiasBusters: Uncovering and Mitigating Tool Selection Bias in Large Language Models
Title (reference translation): BiasBusters: Discovering and Mitigating Tool Selection Bias in Large Language Models
Authors: Thierry Blankenstein, Jialin Yu, Zixuan Li, Vassilis Plachouras, Sunando Sengupta, Philip Torr, Yarin Gal, Alasdair Paren, Adel Bibi
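Abstract summary: Agents backed by large language models (LLMs) often rely on external tools drawn from marketplaces where multiple providers offer functionally equivalent options. If selection is systematically biased, it can degrade user experience and distort competition. To evaluate tool-selection bias, the authors introduce a benchmark of diverse tool categories, each containing multiple functionally equivalent tools.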
Reference score (independently computed attention score): 55.119657444627855
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Agents backed by large language models (LLMs) often rely on external tools drawn from marketplaces where multiple providers offer functionally equivalent options. This raises a critical point concerning fairness: if selection is systematically biased, it can degrade user experience and distort competition by privileging some providers over others. We introduce a benchmark of diverse tool categories, each containing multiple functionally equivalent tools, to evaluate tool-selection bias. Using this benchmark, we test seven models and show that unfairness exists with models either fixating on a single provider or disproportionately preferring earlier-listed tools in context. To investigate the origins of this bias, we conduct controlled experiments examining tool features, metadata (name, description, parameters), and pre-training exposure. We find that: (1) semantic alignment between queries and metadata is the strongest predictor of choice; (2) perturbing descriptions significantly shifts selections; and (3) repeated pre-training exposure to a single endpoint amplifies bias. Finally, we propose a lightweight mitigation that first filters the candidate tools to a relevant subset and then samples uniformly, reducing bias while preserving good task coverage. Our findings highlight tool-selection bias as a key obstacle for the fair deployment of tool-augmented LLMs.
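Abstract (reference translation): Agents backed by large language models (LLMs) often rely on external tools drawn from marketplaces where multiple providers offer functionally equivalent options. If selection is systematically biased, it can degrade user experience and distort competition by privileging some providers over others. To evaluate tool-selection bias, we introduce a benchmark of diverse tool categories, each containing multiple functionally equivalent tools. Using this benchmark, we test seven models and show that unfairness exists, with models either fixating on a single provider or disproportionately preferring earlier-listed tools in context. To investigate the origins of this bias, we conduct controlled experiments on tool features, metadata (name, description, parameters), and pre-training exposure. We find that (1) semantic alignment between queries and metadata is the strongest predictor of choice; (2) perturbing descriptions significantly shifts selections; and (3) repeated pre-training exposure to a single endpoint amplifies bias. Finally, we propose a lightweight mitigation that first filters the candidate tools to a relevant subset and then samples uniformly, reducing bias while preserving good task coverage. Tool-selection bias is a key obstacle to the fair deployment of tool-augmented LLMs.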
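
The mitigation described in the abstract (filter the candidate tools to a relevant subset, then sample uniformly) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how relevance is judged, so `select_tool`, the toy `keyword_overlap` relevance check, and the example tool list with its provider names are all hypothetical.

```python
import random
from typing import Callable, Optional, Sequence


def select_tool(
    query: str,
    tools: Sequence[dict],
    is_relevant: Callable[[str, dict], bool],
    rng: Optional[random.Random] = None,
) -> Optional[dict]:
    """Filter the candidates to a relevant subset, then sample uniformly from it."""
    rng = rng or random.Random()
    # Step 1: keep only the tools judged relevant to the query.
    relevant = [tool for tool in tools if is_relevant(query, tool)]
    if not relevant:
        return None  # no suitable tool; the caller decides how to fall back
    # Step 2: uniform choice, so list order and provider name carry no weight.
    return rng.choice(relevant)


def keyword_overlap(query: str, tool: dict) -> bool:
    """Toy relevance check (an assumption): query and description share a word."""
    query_words = set(query.lower().split())
    description_words = set(tool["description"].lower().split())
    return bool(query_words & description_words)


# Hypothetical marketplace: three equivalent weather tools and one unrelated tool.
TOOLS = [
    {"name": "weather_a", "description": "Get the current weather for a city"},
    {"name": "weather_b", "description": "Return current weather conditions by city name"},
    {"name": "weather_c", "description": "Look up the weather forecast for a location"},
    {"name": "currency_x", "description": "Convert an amount between two currencies"},
]

chosen = select_tool("current weather in Paris", TOOLS, keyword_overlap)
```

Because the final choice is uniform over the filtered subset, each of the three weather tools is picked with equal probability regardless of its position in `TOOLS`, while the unrelated currency tool is never selected. In practice the relevance check would likely be something stronger than word overlap; that choice is left open here.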

Related papers