Fugu-MT Paper Translation (Abstract): ObliInjection: Order-Oblivious Prompt Injection Attack to LLM Agents with Multi-source Data

Paper overview: ObliInjection: Order-Oblivious Prompt Injection Attack to LLM Agents with Multi-source Data

arxiv url: http://arxiv.org/abs/2512.09321v3
Date: Mon, 15 Dec 2025 04:21:36 GMT
Status: Translation complete
Last updated in system: 2025-12-16 15:10:29.154314
Title: ObliInjection: Order-Oblivious Prompt Injection Attack to LLM Agents with Multi-source Data
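Title (reference translation): ObliInjection: Order-Oblivious Prompt Injection Attack on LLM Agents Using Multi-source Data
Authors: Reachal Wang, Yuqi Jia, Neil Zhenqiang Gong
Abstract summary: ObliInjection contaminates the input data of an LLM so that it completes an attacker-chosen task instead of the intended task. Existing prompt injection attacks either assume that the entire input data comes from a single source under the attacker's control or ignore the uncertainty in the ordering of segments from different sources. ObliInjection introduces two key technical innovations.
Reference score (independently computed attention level): 37.94746388564456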
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Prompt injection attacks aim to contaminate the input data of an LLM to mislead it into completing an attacker-chosen task instead of the intended task. In many applications and agents, the input data originates from multiple sources, with each source contributing a segment of the overall input. In these multi-source scenarios, an attacker may control only a subset of the sources and contaminate the corresponding segments, but typically does not know the order in which the segments are arranged within the input. Existing prompt injection attacks either assume that the entire input data comes from a single source under the attacker's control or ignore the uncertainty in the ordering of segments from different sources. As a result, their success is limited in domains involving multi-source data. In this work, we propose ObliInjection, the first prompt injection attack targeting LLM applications and agents with multi-source input data. ObliInjection introduces two key technical innovations: the order-oblivious loss, which quantifies the likelihood that the LLM will complete the attacker-chosen task regardless of how the clean and contaminated segments are ordered; and the orderGCG algorithm, which is tailored to minimize the order-oblivious loss and optimize the contaminated segments. Comprehensive experiments across three datasets spanning diverse application domains and twelve LLMs demonstrate that ObliInjection is highly effective, even when only one out of 6-100 segments in the input data is contaminated. Our code and data are available at: https://github.com/ReachalWang/ObliInjection.
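Abstract (reference translation): Prompt injection attacks contaminate the input data of an LLM so that it completes an attacker-chosen task instead of the intended task. In many applications and agents, the input data originates from multiple sources, with each source contributing a segment of the overall input. In these multi-source scenarios, an attacker controls only a subset of the sources and contaminates the corresponding segments, but typically does not know the order in which the segments are arranged within the input. Existing prompt injection attacks either assume that the entire input data comes from a single source under the attacker's control or ignore the uncertainty in the ordering of segments from different sources. As a result, their success is limited in domains involving multi-source data. In this work, we propose ObliInjection, the first prompt injection attack targeting LLM applications and agents with multi-source input data. ObliInjection introduces two key technical innovations: the order-oblivious loss, which quantifies the likelihood that the LLM will complete the attacker-chosen task regardless of how the clean and contaminated segments are ordered; and the orderGCG algorithm, which is tailored to minimize the order-oblivious loss and optimize the contaminated segments. Comprehensive experiments across three datasets spanning diverse application domains and twelve LLMs show that ObliInjection is highly effective, even when only one out of 6-100 segments in the input data is contaminated. Our code and data are available at https://github.com/ReachalWang/ObliInjection.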
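
The abstract describes the order-oblivious loss only in words. As a rough illustration of the underlying idea (a loss that does not depend on any one segment ordering), the sketch below averages a loss over randomly sampled orderings. This is not the authors' implementation: the names `order_oblivious_loss` and `toy_loss`, the Monte Carlo sampling, and the toy loss itself are assumptions made for illustration, and the paper's actual loss and the orderGCG optimizer are not reproduced here.

```python
import random
from typing import Callable, List, Sequence


def order_oblivious_loss(
    clean_segments: Sequence[str],
    contaminated_segment: str,
    loss_fn: Callable[[List[str]], float],
    num_samples: int = 32,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of a loss averaged over random segment orderings.

    `loss_fn` is a hypothetical stand-in: given one ordered list of segments,
    it returns a scalar loss for that particular ordering.
    """
    rng = random.Random(seed)
    segments = list(clean_segments) + [contaminated_segment]
    total = 0.0
    for _ in range(num_samples):
        order = segments[:]   # copy so the caller's data is left untouched
        rng.shuffle(order)    # sample one ordering uniformly at random
        total += loss_fn(order)
    return total / num_samples


def toy_loss(ordered_segments: List[str]) -> float:
    """Toy loss (assumption): position of the marked segment, scaled to [0, 1]."""
    idx = ordered_segments.index("<marked>")
    return idx / (len(ordered_segments) - 1)


if __name__ == "__main__":
    clean = [f"segment {i}" for i in range(5)]
    estimate = order_oblivious_loss(clean, "<marked>", toy_loss, num_samples=1000)
    print(f"estimated order-averaged loss: {estimate:.3f}")
```

Averaging over sampled orderings, rather than enumerating all of them, is a design choice of this sketch: with up to 100 segments, as in the experiments the abstract mentions, the number of possible orderings is far too large to enumerate.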

Related papers