Fugu-MT 論文翻訳(概要): AgentSys: Secure and Dynamic LLM Agents Through Explicit Hierarchical Memory Management

論文の概要: AgentSys: Secure and Dynamic LLM Agents Through Explicit Hierarchical Memory Management

arxiv url: http://arxiv.org/abs/2602.07398v1
Date: Sat, 07 Feb 2026 06:28:51 GMT
ステータス: 翻訳完了
システム内更新日: 2026-02-10 20:26:24.602487
Title: AgentSys: Secure and Dynamic LLM Agents Through Explicit Hierarchical Memory Management
Title（参考訳）: AgentSys: 階層型メモリ管理によるセキュアかつ動的LLMエージェント
Authors: Ruoyao Wen, Hao Li, Chaowei Xiao, Ning Zhang,
Abstract要約: 既存の防御は肥大した記憶を与えられたまま扱い、回復力を維持することに集中する。我々は、明示的なメモリ管理を通じて間接的なインジェクションを防御するフレームワークであるAgentSysを紹介する。
参考スコア（独自算出の注目度）: 47.49917373646469
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Indirect prompt injection threatens LLM agents by embedding malicious instructions in external content, enabling unauthorized actions and data theft. LLM agents maintain working memory through their context window, which stores interaction history for decision-making. Conventional agents indiscriminately accumulate all tool outputs and reasoning traces in this memory, creating two critical vulnerabilities: (1) injected instructions persist throughout the workflow, granting attackers multiple opportunities to manipulate behavior, and (2) verbose, non-essential content degrades decision-making capabilities. Existing defenses treat bloated memory as given and focus on remaining resilient, rather than reducing unnecessary accumulation to prevent the attack. We present AgentSys, a framework that defends against indirect prompt injection through explicit memory management. Inspired by process memory isolation in operating systems, AgentSys organizes agents hierarchically: a main agent spawns worker agents for tool calls, each running in an isolated context and able to spawn nested workers for subtasks. External data and subtask traces never enter the main agent's memory; only schema-validated return values can cross boundaries through deterministic JSON parsing. Ablations show isolation alone cuts attack success to 2.19%, and adding a validator/sanitizer further improves defense with event-triggered checks whose overhead scales with operations rather than context length. On AgentDojo and ASB, AgentSys achieves 0.78% and 4.25% attack success while slightly improving benign utility over undefended baselines. It remains robust to adaptive attackers and across multiple foundation models, showing that explicit memory management enables secure, dynamic LLM agent architectures. Our code is available at: https://github.com/ruoyaow/agentsys-memory.
Abstract（参考訳）: 間接的なプロンプトインジェクションは、悪意のある命令を外部コンテンツに埋め込むことでLLMエージェントを脅かす。 LLMエージェントは、コンテキストウィンドウを通じて作業メモリを保持し、決定のためのインタラクション履歴を格納する。従来のエージェントは、このメモリにすべてのツール出力とトレースを無差別に蓄積し、2つの重大な脆弱性を生成します。既存の防御は、攻撃を防ぐために不要な蓄積を減らすのではなく、肥大した記憶を与えられたまま扱い、回復力を維持することに集中する。我々は、明示的なメモリ管理を通じて間接的なインジェクションを防御するフレームワークであるAgentSysを紹介する。メインエージェントはツールコールのためにワーカーエージェントを発生させ、それぞれが独立したコンテキストで実行され、サブタスクのためにネストされたワーカーを発生させることができる。外部データとサブタスクトレースはメインエージェントのメモリに決して入らない。決定論的JSON解析を通じてスキーマ検証された戻り値だけが境界を越えることができる。アブレーションは、アイソレーションだけで攻撃の成功を2.19%に削減し、バリデータ/サニタイザを追加することで、コンテキスト長ではなく操作によってオーバーヘッドがスケールするイベントトリガー付きチェックによる防御をさらに改善することを示している。 AgentDojoとASBでは、AgentSysは0.78%と4.25%の攻撃成功を達成すると同時に、修正されていないベースラインよりも良質なユーティリティをわずかに改善している。適応攻撃者や複数の基盤モデルに対して堅牢であり、明示的なメモリ管理によってセキュアで動的LLMエージェントアーキテクチャが実現可能であることを示している。私たちのコードは、https://github.com/ruoyaow/agentsys-Memoryで利用可能です。

論文の概要: AgentSys: Secure and Dynamic LLM Agents Through Explicit Hierarchical Memory Management

関連論文リスト