Fugu-MT Paper Translation (Abstract): The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements

Paper overview: The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements

arxiv url: http://arxiv.org/abs/2606.12797v1
Date: Thu, 11 Jun 2026 01:46:26 GMT
Status: Translation complete
Last updated in system: 2026-06-12 15:55:27.528878
Title: The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements
Title (reference translation): The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements
Authors: Md Jafrin Hossain, Mohammad Arif Hossain, Weiqi Liu, Nirwan Ansari
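Abstract summary: Agentic large language model systems are increasingly deployed in public-facing domains. We ask whether the frameworks used to build these systems provide architectural-level structural safety guarantees.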
Reference score (independently computed attention score): 4.431419229831417
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Agentic large language model systems that autonomously invoke tools, maintain persistent memory, and execute multi-step plans are increasingly deployed in public-facing domains, including government services, healthcare triage, and financial advising. We ask whether the frameworks used to build these systems provide architectural-level structural safety guarantees. Applying six containment principles derived from a compositional model of agentic architectures, we audit three dominant frameworks (LangChain, AutoGPT, and OpenAI Agents SDK) and find no native compliance in any of them. Memory integrity, a defense against one of the most prevalent vulnerability classes, is not observed in any of the three evaluated frameworks. We validate these findings empirically: in a simulated government benefits agent built on LangChain, a single memory-poisoning write induces persistent targeted corruption across all tested seeds and backends, increasing the wrongful denial rate for targeted applicants to 88.9%. Under a complex five-factor policy, the same attack preserves aggregate accuracy while increasing targeted wrongful denials by 3.5x, rendering the corruption difficult to detect through standard monitoring. We then introduce two lightweight containment mechanisms: a memory integrity validator and a policy gate, which eliminate both attack vectors with sub-millisecond overhead (<0.2ms per call). We conclude that the current agentic framework ecosystem may not yet meet secure-by-default expectations for public-facing deployments and outline priority architectural interventions to enable trustworthy deployment in high-stakes, socially impactful applications.
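Abstract (reference translation): Agentic large language model systems that autonomously invoke tools, maintain persistent memory, and execute multi-step plans are increasingly deployed in public-facing domains such as government services, healthcare triage, and financial advising. We ask whether the frameworks used to build these systems provide structural safety guarantees at the architectural level. Applying six containment principles derived from a compositional model of agentic architectures, we audit three dominant frameworks (LangChain, AutoGPT, and OpenAI Agents SDK) and find native compliance in none of them. Memory integrity, a defense against one of the most prevalent vulnerability classes, is not observed in any of the three evaluated frameworks. In a simulated government benefits agent built on LangChain, a single memory-poisoning write induces persistent targeted corruption across all tested seeds and backends, raising the wrongful denial rate for targeted applicants to 88.9%. Under a complex five-factor policy, the same attack preserves aggregate accuracy while increasing targeted wrongful denials by 3.5x, making the corruption difficult to detect through standard monitoring. We then introduce two lightweight containment mechanisms, a memory integrity validator and a policy gate, which eliminate both attack vectors with sub-millisecond overhead (<0.2ms per call). We conclude that the current agentic framework ecosystem may not yet meet secure-by-default expectations for public-facing deployments, and we outline priority architectural interventions to enable trustworthy deployment in high-stakes, socially impactful applications.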
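Illustrative sketch (not from the paper): the abstract names a memory integrity validator and a policy gate but does not describe how they are built. The Python below is one possible reading under stated assumptions: memory entries carry an HMAC tag that is checked on read, and a gate re-checks a proposed denial against a deterministic rule. Every name here (`ValidatedMemory`, `policy_gate`, `SECRET_KEY`, the income rule and its 30,000 limit) is hypothetical and is not drawn from the paper or from any of the audited frameworks.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical key for the demo; a real system would load it from a secret store.
SECRET_KEY = b"demo-key-not-for-production"


@dataclass(frozen=True)
class MemoryEntry:
    key: str
    value: str
    tag: str  # hex HMAC-SHA256 over key and value


def _sign(key: str, value: str) -> str:
    # Length-prefix the key so ("ab", "c") and ("a", "bc") produce different tags.
    message = f"{len(key)}:{key}{value}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


class ValidatedMemory:
    """Toy key-value memory that ignores entries whose tag does not verify."""

    def __init__(self) -> None:
        self._store: Dict[str, MemoryEntry] = {}

    def trusted_write(self, key: str, value: str) -> None:
        # Writes from the trusted path are signed before being stored.
        self._store[key] = MemoryEntry(key, value, _sign(key, value))

    def raw_write(self, key: str, value: str, tag: str = "") -> None:
        # Models an attacker writing straight into the backing store.
        self._store[key] = MemoryEntry(key, value, tag)

    def read(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expected = _sign(entry.key, entry.value)
        if not hmac.compare_digest(expected, entry.tag):
            return None  # unverifiable memory is treated as absent
        return entry.value


def policy_gate(proposed: str, income: int, income_limit: int = 30_000) -> str:
    """Escalate a denial that the (assumed) eligibility rule does not support."""
    eligible = income <= income_limit
    if proposed == "deny" and eligible:
        return "escalate"
    return proposed


if __name__ == "__main__":
    memory = ValidatedMemory()
    memory.trusted_write("rule", "approve if income <= 30000")
    memory.raw_write("rule", "deny applicants from district 7")  # poisoned write
    print(memory.read("rule"))         # the poisoned entry fails verification
    print(policy_gate("deny", 20_000)) # denial of an eligible applicant is escalated
```

In this sketch the validator blocks a poisoned entry from ever reaching the agent, and the gate independently catches a denial that contradicts the rule. Whether the paper's mechanisms work this way is not stated in the abstract.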

