Fugu-MT Paper Translation (Abstract): AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks

Paper overview: AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks

arxiv url: http://arxiv.org/abs/2602.16901v1
Date: Wed, 18 Feb 2026 21:30:20 GMT
Status: Translation complete
Last updated in system: 2026-02-20 15:21:28.416624
Title: AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks
Authors: Tanqiu Jiang, Yuhui Wang, Jiacheng Liang, Ting Wang
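Abstract summary: We present AgentLAB as the first benchmark for evaluating agent susceptibility to adaptive, long-horizon attacks. AgentLAB supports five novel attack types: intent hijacking, tool chaining, task injection, objective drifting, and memory poisoning. Representative LLM agents remain susceptible to long-horizon attacks.
Reference score (site-computed attention measure): 10.74152341304056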
License: http://creativecommons.org/licenses/by/4.0/
Abstract: LLM agents are increasingly deployed in long-horizon, complex environments to solve challenging problems, but this expansion exposes them to long-horizon attacks that exploit multi-turn user-agent-environment interactions to achieve objectives infeasible in single-turn settings. To measure agent vulnerabilities to such risks, we present AgentLAB, the first benchmark dedicated to evaluating LLM agent susceptibility to adaptive, long-horizon attacks. Currently, AgentLAB supports five novel attack types including intent hijacking, tool chaining, task injection, objective drifting, and memory poisoning, spanning 28 realistic agentic environments, and 644 security test cases. Leveraging AgentLAB, we evaluate representative LLM agents and find that they remain highly susceptible to long-horizon attacks; moreover, defenses designed for single-turn interactions fail to reliably mitigate long-horizon threats. We anticipate that AgentLAB will serve as a valuable benchmark for tracking progress on securing LLM agents in practical settings. The benchmark is publicly available at https://tanqiujiang.github.io/AgentLAB_main.

Related papers