Fugu-MT Paper Translation (Abstract): DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

Paper overview: DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

arxiv url: http://arxiv.org/abs/2606.01212v1
Date: Sun, 31 May 2026 13:03:47 GMT
Status: Translation complete
Last updated in system: 2026-06-02 21:34:29.387116
Title: DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation
Title (reference translation): DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation
Authors: Yuyang Gong, Miaokun Chen, Jiawei Liu, Zhuo Chen, Guoxiu He, Wei Lu, XiaoFeng Wang, Xiaozhong Liu
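Abstract summary: Existing RAG attacks mainly focus on individual queries or narrow topic-local query sets. The paper introduces discourse-level opinion manipulation, a new threat model in which coordinated influence across a semantic query network induces opinion shifts. Experiments show that DiscourseFlip consistently induces targeted opinion shifts across the contextualized query network.
Reference score (independently computed attention score): 29.953161235840188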
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Retrieval-Augmented Generation (RAG) systems are widely deployed and increasingly influential, but their reliance on external corpora exposes new security risks from poisoned retrieval content. Existing RAG attacks largely focus on individual queries or narrow topic-local query sets, which limits their practical reach and offers limited camouflage in real-world settings. In this paper, we introduce discourse-level opinion manipulation, a new threat model in which coordinated influence across a semantic query network induces opinion shifts over a holistic, multi-topic query space. We formalize this threat in a black-box setting and propose DiscourseFlip, an agentic, graph-guided attack that dynamically allocates a limited poisoning budget to maximize discourse-level opinion deviation. Extensive experiments demonstrate that DiscourseFlip consistently induces targeted opinion shifts across the contextualized query network and significantly outperforms existing baselines in terms of coverage and effectiveness. User studies further confirm that DiscourseFlip is effective while remaining well camouflaged from user detection. Moreover, systematic analyses show that existing mitigation strategies are ineffective against discourse-level manipulation, underscoring the urgent need for more robust and adaptive defenses to address discourse-level vulnerabilities.
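Abstract (reference translation): Retrieval-Augmented Generation (RAG) systems are widely deployed and increasingly influential, but their reliance on external corpora exposes new security risks from poisoned retrieval content. Existing RAG attacks focus on individual queries or narrow topic-local query sets. In this paper, we introduce discourse-level opinion manipulation, a new threat model in which coordinated influence across a semantic query network induces opinion shifts over a holistic, multi-topic query space. We formalize this threat in a black-box setting and propose DiscourseFlip, an agentic, graph-guided attack that dynamically allocates a limited poisoning budget to maximize discourse-level opinion deviation. Extensive experiments show that DiscourseFlip consistently induces targeted opinion shifts across the contextualized query network and significantly outperforms existing baselines in coverage and effectiveness. User studies confirm that DiscourseFlip is effective while remaining well camouflaged from user detection. Furthermore, existing mitigation strategies are shown to be ineffective against discourse-level manipulation, highlighting the need for more robust and adaptive defenses against discourse-level vulnerabilities.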


Related papers