Fugu-MT Paper Translation (Abstract): Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

Paper overview: Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

arxiv url: http://arxiv.org/abs/2509.23041v1
Date: Sat, 27 Sep 2025 01:39:41 GMT
Status: Translation complete
Last updated in system: 2025-09-30 22:32:19.000239
Title: Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
Title (reference translation): Virus Infection of LLMs: Poisoning That Can Spread "VIA" Synthetic Data
Authors: Zi Liang, Qingqing Ye, Xuan Liu, Yanyun Wang, Jianliang Xu, Haibo Hu
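Abstract summary: This paper systematically evaluates the resilience of synthetic-data-integrated training for large language models. It introduces a novel and universal attack framework, the Virus Infection Attack (VIA).
Reference score (independently computed attention score): 24.21219815496624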
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Synthetic data refers to artificial samples generated by models. While it has been validated to significantly enhance the performance of large language models (LLMs) during training and has been widely adopted in LLM development, the potential security risks it may introduce remain uninvestigated. This paper systematically evaluates the resilience of the synthetic-data-integrated training paradigm for LLMs against mainstream poisoning and backdoor attacks. We reveal that such a paradigm exhibits strong resistance to existing attacks, primarily thanks to the different distribution patterns between poisoning data and queries used to generate synthetic samples. To enhance the effectiveness of these attacks and further investigate the security risks introduced by synthetic data, we introduce a novel and universal attack framework, namely, Virus Infection Attack (VIA), which enables the propagation of current attacks through synthetic data even under purely clean queries. Inspired by the principles of virus design in cybersecurity, VIA conceals the poisoning payload within a protective "shell" and strategically searches for optimal hijacking points in benign samples to maximize the likelihood of generating malicious content. Extensive experiments on both data poisoning and backdoor attacks show that VIA significantly increases the presence of poisoning content in synthetic data and correspondingly raises the attack success rate (ASR) on downstream models to levels comparable to those observed in the poisoned upstream models.
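Abstract (reference translation): Synthetic data refers to artificial samples generated by models. Although it has been validated to significantly improve the performance of large language models (LLMs) during training and is widely adopted in LLM development, the potential security risks it introduces remain uninvestigated. This paper systematically evaluates the resilience of synthetic-data-integrated training for LLMs against mainstream poisoning and backdoor attacks. The paradigm proves strongly resistant to existing attacks, owing to the difference in distribution patterns between the poisoning data and the queries used to generate synthetic samples. To enhance the effectiveness of these attacks and further investigate the security risks introduced by synthetic data, the paper introduces the Virus Infection Attack (VIA), a novel and universal attack framework that enables current attacks to propagate even under clean queries. Inspired by the principles of virus design in cybersecurity, VIA conceals the poisoning payload within a protective "shell" and strategically searches for optimal hijacking points in benign samples to maximize the likelihood of generating malicious content. Extensive experiments on both data poisoning and backdoor attacks show that VIA significantly increases the presence of poisoning content in synthetic data and correspondingly raises the attack success rate (ASR) on downstream models to levels comparable to those of the poisoned upstream models.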

Related papers