Fugu-MT Paper Translation (Abstract): LAAF: Logic-layer Automated Attack Framework A Systematic Red-Teaming Methodology for LPCI Vulnerabilities in Agentic Large Language Model Systems

Paper overview: LAAF: Logic-layer Automated Attack Framework A Systematic Red-Teaming Methodology for LPCI Vulnerabilities in Agentic Large Language Model Systems

arxiv url: http://arxiv.org/abs/2603.17239v1
Date: Wed, 18 Mar 2026 00:51:36 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.460278
Title: LAAF: Logic-layer Automated Attack Framework A Systematic Red-Teaming Methodology for LPCI Vulnerabilities in Agentic Large Language Model Systems
Authors: Hammad Atta, Ken Huang, Kyriakos Rock Lambros, Yasir Mehmood, Zeeshan Baig, Mohamed Abdur Rahman, Manish Bhatt, M. Aziz Ul Haq, Muhammad Aatif, Nadeem Shahzad, Kamal Noor, Vineeth Sai Narajala, Hazem Ali, Jamel Abed
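Abstract summary: LAAF is the first automated red-teaming framework to combine an LPCI-specific technique taxonomy with stage-sequential seed escalation. The evaluation shows that LAAF achieves higher stage-breakthrough efficiency than single-technique random testing.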
Reference score (independently calculated attention level): 0.39875976220956705
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Agentic LLM systems equipped with persistent memory, RAG pipelines, and external tool connectors face a class of attacks - Logic-layer Prompt Control Injection (LPCI) - for which no automated red-teaming instrument existed. We present LAAF (Logic-layer Automated Attack Framework), the first automated red-teaming framework to combine an LPCI-specific technique taxonomy with stage-sequential seed escalation - two capabilities absent from existing tools: Garak lacks memory-persistence and cross-session triggering; PyRIT supports multi-turn testing but treats turns independently, without seeding each stage from the prior breakthrough. LAAF provides: (i) a 49-technique taxonomy spanning six attack categories (Encoding 11, Structural 8, Semantic 8, Layered 5, Trigger 12, Exfiltration 5; see Table 1), combinable across 5 variants per technique and 6 lifecycle stages, yielding a theoretical maximum of 2,822,400 unique payloads (49 × 5 × 1,920 × 6; SHA-256 deduplicated at generation time); and (ii) a Persistent Stage Breaker (PSB) that drives payload mutation stage-by-stage: on each breakthrough, the PSB seeds the next stage with a mutated form of the winning payload, mirroring real adversarial escalation. Evaluation on five production LLM platforms across three independent runs demonstrates that LAAF achieves higher stage-breakthrough efficiency than single-technique random testing, with a mean aggregate breakthrough rate of 84% (range 83-86%) and platform-level rates stable within 17 percentage points across runs. Layered combinations and semantic reframing are the highest-effectiveness technique categories, with layered payloads outperforming encoding on well-defended platforms.
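The payload-space figure in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration, not code from the paper: it recomputes the stated product, confirms that the six category counts sum to 49, and shows one plausible way to deduplicate payload strings by SHA-256 digest. The abstract does not say what the 1,920 factor enumerates, so it is kept as an opaque constant, and the names `theoretical_max` and `deduplicate` are hypothetical.

```python
import hashlib

# Factors exactly as stated in the abstract.
TECHNIQUES = 49
VARIANTS_PER_TECHNIQUE = 5
UNEXPLAINED_FACTOR = 1_920  # the abstract does not define this factor
LIFECYCLE_STAGES = 6

# Per-category technique counts from the abstract (Table 1 of the paper).
CATEGORY_COUNTS = {
    "Encoding": 11,
    "Structural": 8,
    "Semantic": 8,
    "Layered": 5,
    "Trigger": 12,
    "Exfiltration": 5,
}


def theoretical_max() -> int:
    """Product of the four factors quoted in the abstract."""
    return (TECHNIQUES * VARIANTS_PER_TECHNIQUE
            * UNEXPLAINED_FACTOR * LIFECYCLE_STAGES)


def deduplicate(payloads):
    """Keep the first occurrence of each string, keyed by SHA-256 digest.

    Assumption: payloads are plain strings hashed as UTF-8 bytes; the
    paper may normalise or hash them differently.
    """
    seen = set()
    unique = []
    for payload in payloads:
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(payload)
    return unique


if __name__ == "__main__":
    print(sum(CATEGORY_COUNTS.values()))  # category counts add up to 49
    print(theoretical_max())              # 2822400
```

Because deduplication happens at generation time, the number of payloads actually produced can only be less than or equal to this theoretical maximum.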
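The stage-sequential seeding attributed to the Persistent Stage Breaker can be pictured as a simple control loop. The following is a toy sketch under stated assumptions, not the authors' implementation: payloads are opaque placeholder strings, `mutate` just appends a random tag, `judge` is a caller-supplied stub standing in for whatever breakthrough check the framework uses, and stopping at the first unbroken stage is an assumption the abstract does not confirm. All function names are hypothetical.

```python
import random

STAGES = 6  # lifecycle stages, per the abstract


def mutate(payload: str, rng: random.Random) -> str:
    """Hypothetical mutation: append a random tag to a placeholder string."""
    return f"{payload}+m{rng.randrange(1000)}"


def run_psb(seed: str, judge, max_attempts: int = 50, rng=None):
    """Toy stage-by-stage loop.

    judge(stage, payload) -> bool is a stub supplied by the caller.
    Returns the winning placeholder payload for each stage that was broken.
    """
    rng = rng or random.Random(0)
    current_seed = seed
    winners = []
    for stage in range(STAGES):
        candidate = current_seed
        for _ in range(max_attempts):
            if judge(stage, candidate):
                winners.append(candidate)
                # Key idea from the abstract: the next stage starts from a
                # mutated form of the payload that just succeeded.
                current_seed = mutate(candidate, rng)
                break
            candidate = mutate(current_seed, rng)
        else:
            # Assumption: stop escalating if a stage is not broken in budget.
            break
    return winners
```

With a stub such as `lambda stage, payload: True`, every stage succeeds on its first candidate and each winner is an extension of the previous one, which makes the seeding chain easy to see. A tool that treats turns independently would instead restart every stage from the original seed.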
