Fugu-MT 論文翻訳(概要): Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

論文の概要: Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

arxiv url: http://arxiv.org/abs/2605.28017v2
Date: Thu, 28 May 2026 01:52:56 GMT
ステータス: 翻訳完了
システム内更新日: 2026-05-30 00:00:30.926275
Title: Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings
Title（参考訳）: 発電機は可能か? : 現実的なRAG設定におけるプロンプト注入攻撃の生存について
Authors: Yu Yin, Shuai Wang, Bevan Koopman, Guido Zuccon,
Abstract要約: 7つのGEO攻撃を現実的な3段階パイプラインで再評価する。従来のプロトコルでは,攻撃の有効性が著しく過大評価されていた。小さなアタックデータセットに微調整された軽量のプロンプトインジェクションガードは、すでにすべてのアタックを検出している。
参考スコア（独自算出の注目度）: 40.03039307576983
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent generative engine optimisation (GEO) research has shown that prompt-injection attacks can push a target product to the top of an LLM's recommendation list, with the strongest attacks reporting around $80\%$ success and raising serious security concerns about RAG-based recommendation. However, these results assume the attacked document is always fed directly to the generator, bypassing the retriever and reranker. This is unrealistic: in deployed RAG systems, the attack modifies the document content, which can in turn change whether the document is retrieved and reranked highly enough to reach the generator at all. In this paper, we re-evaluate seven GEO attacks under a realistic three-stage pipeline (retriever\,$\to$\,LLM reranker\,$\to$\,LLM generator). We find that prior protocols substantially overstate attack effectiveness: gradient-based and instruction override attacks largely collapse before reaching the generator, and only LLM-driven prompt injections remain effective end-to-end. Our analysis further reveals that current GEO attacks are easily detectable: a lightweight prompt-injection guard finetuned on a small attack dataset already detects every attack. Our code and data are available at https://github.com/ielab/geo_injection_rag_survival.
Abstract（参考訳）: 最近のジェネレーティブエンジン最適化(GEO)研究は、迅速なインジェクション攻撃は、目標製品をLLMレコメンデーションリストのトップに押し上げることができ、最も強力な攻撃は80$%の成功を報告し、RAGベースのレコメンデーションに関する深刻なセキュリティ上の懸念を提起している。しかし、これらの結果は攻撃されたドキュメントが常にジェネレータに直接供給され、レトリバーとリランカをバイパスすると仮定する。これは非現実的であり、デプロイされたRAGシステムでは、攻撃は文書の内容を変更する。本稿では,現実的な3段階パイプライン(retriever\,$\to$\,LLM reranker\,$\to$\,LLM generator)の下でGEO攻撃7件を再評価する。グラデーションベースおよび命令オーバーライド攻撃は、ジェネレータに到達する前に大きく崩壊し、LSM駆動のプロンプトインジェクションのみが効果的にエンド・ツー・エンドのままである。我々の分析は、現在のGEO攻撃が容易に検出できることを明らかにしている。私たちのコードとデータはhttps://github.com/ielab/geo_injection_rag_survival.comで公開されています。

論文の概要: Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

関連論文リスト