Fugu-MT Paper Translation (Abstract): GAMBIT: A Three-Mode Benchmark for Adversarial Robustness in Multi-Agent LLM Collectives

Paper overview: GAMBIT: A Three-Mode Benchmark for Adversarial Robustness in Multi-Agent LLM Collectives

arxiv url: http://arxiv.org/abs/2605.09027v2
Date: Wed, 13 May 2026 07:49:42 GMT
Status: Translation complete
Last updated in system: 2026-05-14 17:13:58.826048
Title: GAMBIT: A Three-Mode Benchmark for Adversarial Robustness in Multi-Agent LLM Collectives
Title (reference translation): GAMBIT: A three-mode benchmark for adversarial robustness in multi-agent LLM collectives
Authors: Alexandre Le Mercier, Chris Develder, Thomas Demeester
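Abstract summary: GAMBIT is a benchmark with three evaluation modes and two independent scores for evaluating imposter detectors. The benchmark comes with a dataset of 27,804 labeled instances spanning 240 co-evolved imposter strategies.
Reference score (independently computed attention metric): 48.545980031973556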
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In multi-agent systems (MAS), a single deceptive agent can nullify all gains of an agentic AI collective and evade deployed defenses. However, existing adversarial studies on MAS target only shallow tasks and do not consider adaptive adversaries, which evolve their strategies to evade the very detectors trained to catch them. To address that gap, we introduce GAMBIT, a benchmark with three evaluation modes and two independent scores for evaluating imposter detectors: the first two modes measure zero-shot detection under increasing distribution shift, and a third recalibration mode measures how quickly a detector adapts to novel attacks from just 20 labeled examples. The benchmark comes with a dataset of 27,804 labeled instances spanning 240 co-evolved imposter strategies. Our contributions are threefold: (1) Using chess as a substrate deep reasoning problem and Gemini 3.1 Pro for agents, we release GAMBIT and its dataset to evaluate imposter detectors under realistic constraints against a stealthy adaptive imposter; (2) We introduce an adaptive imposter agent based on an efficient evolutionary framework, generalizable beyond chess, that collapses collective task performance while remaining essentially undetectable (50.5% F1-score with a Gemini-based detector); (3) We show that zero-shot evaluation can be highly misleading for adaptive adversaries: two detectors with near-identical zero-shot scores differ by 8x on few-shot adaptation, while the meta-learned variant converges 20x faster, a gap only visible in the recalibration mode. Altogether, GAMBIT provides the first multi-agent benchmark where adversarial attacks and defenses co-evolve, with an imposter framework generalizable beyond our use case, and promising techniques for fast recalibration in a rapidly evolving adversarial system. Code and data: https://anonymous.4open.science/r/gambit.
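Abstract (reference translation): In multi-agent systems (MAS), a single deceptive agent can nullify all gains of an agentic AI collective and evade deployed defenses. Existing adversarial studies on MAS, however, target only shallow tasks and do not consider adaptive adversaries. To address this gap, we introduce GAMBIT, a benchmark for evaluating imposter detectors with three evaluation modes and two independent scores: the first two modes measure zero-shot detection under increasing distribution shift, and the third, a recalibration mode, measures how quickly a detector adapts to novel attacks from only 20 labeled examples. The benchmark comes with a dataset of 27,804 labeled instances spanning 240 co-evolved imposter strategies. Our contributions are threefold: (1) using chess as a substrate deep reasoning problem and Gemini 3.1 Pro for the agents, we release GAMBIT and its dataset to evaluate imposter detectors under realistic constraints; (2) we introduce an adaptive imposter agent, built on an efficient evolutionary framework, that collapses collective task performance while remaining essentially undetectable (50.5% F1-score with a Gemini-based detector); (3) we show that zero-shot evaluation can be highly misleading for adaptive adversaries: two detectors with near-identical zero-shot scores differ by 8x on few-shot adaptation, and the meta-learned variant converges 20x faster, a gap visible only in the recalibration mode. GAMBIT is the first multi-agent benchmark in which adversarial attacks and defenses co-evolve, and it offers an imposter framework generalizable beyond our use case together with promising techniques for fast recalibration in a rapidly evolving adversarial system. Code and data: https://anonymous.4open.science/r/gambit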
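
The 50.5% F1-score quoted in the abstract is easier to interpret with the metric's definition in hand. The following is a minimal Python sketch, assuming a binary labeling (1 = imposter, 0 = honest agent). The function name `f1_score` and the toy labels are illustrative assumptions and are not taken from the GAMBIT code or dataset.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (assumed here: 1 = imposter, 0 = honest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # imposters caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # honest agents flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # imposters missed
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for illustration only (not GAMBIT data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]  # correct on half of each class

print(f1_score(y_true, y_pred))  # 0.5
```

On this balanced toy set, a detector that is right only half the time in each class scores an F1 of 0.5. Whether the GAMBIT evaluation data is balanced in the same way is not stated in the abstract, so this comparison is indicative only.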

Related papers