Fugu-MT Paper Translation (Abstract): CtrlAttack: A Unified Attack on World-Model Control in Diffusion Models

Paper abstract: CtrlAttack: A Unified Attack on World-Model Control in Diffusion Models

arxiv url: http://arxiv.org/abs/2603.13435v1
Date: Fri, 13 Mar 2026 08:05:50 GMT
Status: Translation complete
Last updated in system: 2026-03-17 16:19:35.203566
Title: CtrlAttack: A Unified Attack on World-Model Control in Diffusion Models
Title (reference translation): CtrlAttack: A Unified Attack on World-Model Control in Diffusion Models
Authors: Shuhan Xu, Siyuan Liang, Hongling Zheng, Yong Luo, Han Hu, Lefei Zhang, Dacheng Tao
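Abstract summary: We analyze the vulnerability of I2V models, find that temporal control mechanisms constitute a new attack surface, and show how difficult it is to model them uniformly. We propose a trajectory-control attack called CtrlAttack that interferes with state evolution during the generation process. Experimental results indicate that, even under low-dimensional and strongly regularized perturbation constraints, the attack can significantly disrupt temporal consistency.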
Reference score (independently computed attention score): 92.04182855442254
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Diffusion-based image-to-video (I2V) models increasingly exhibit world-model-like properties by implicitly capturing temporal dynamics. However, existing studies have mainly focused on visual quality and controllability, and the robustness of the state transition learned by the model remains understudied. To fill this gap, we are the first to analyze the vulnerability of I2V models, find that temporal control mechanisms constitute a new attack surface, and reveal the challenge of modeling them uniformly under different attack settings. Based on this, we propose a trajectory-control attack, called CtrlAttack, to interfere with state evolution during the generation process. Specifically, we represent the perturbation as a low-dimensional velocity field and construct a continuous displacement field via temporal integration, thereby affecting the model's state transitions while maintaining temporal consistency; meanwhile, we map the perturbation to the observation space, making the method applicable to both white-box and black-box attack settings. Experimental results show that even under low-dimensional and strongly regularized perturbation constraints, our method can still significantly disrupt temporal consistency by increasing the attack success rate (ASR) to over 90% in the white-box setting and over 80% in the black-box setting, while keeping the variation of the FID and FVD within 6 and 130, respectively, thus revealing the potential security risk of I2V models at the level of state dynamics.
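Abstract (reference translation): Diffusion-based image-to-video (I2V) models increasingly exhibit world-model-like properties by implicitly capturing temporal dynamics. However, existing work has focused mainly on visual quality and controllability, and the robustness of the state transitions learned by the model has not yet been examined. To fill this gap, we are the first to analyze the vulnerability of I2V models; we find that temporal control mechanisms constitute a new attack surface and show how difficult it is to model them uniformly under different attack settings. We therefore propose a trajectory-control attack called CtrlAttack. Specifically, we represent the perturbation as a low-dimensional velocity field and build a continuous displacement field by temporal integration, which affects the model's state transitions while maintaining temporal consistency. Experiments show that, even under low-dimensional and strongly regularized perturbation constraints, the method significantly disrupts temporal consistency, raising the attack success rate (ASR) to over 90% in the white-box setting and over 80% in the black-box setting while keeping the variation in FID and FVD within 6 and 130, respectively, revealing a potential security risk of I2V models at the level of state dynamics.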
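
The abstract describes the perturbation as a low-dimensional velocity field that is integrated over time into a continuous displacement field. The sketch below illustrates only that integration step, under assumptions that are not taken from the paper: the velocity field is stored as per-frame 2D velocities for a handful of control points, integration is a simple forward-Euler cumulative sum, and the "strong regularization" is stood in for by a hard cap on displacement magnitude. The function names `integrate_velocity` and `perturb_trajectory` and the parameters `dt` and `max_disp` are hypothetical.

```python
import numpy as np


def integrate_velocity(velocity, dt=1.0, max_disp=None):
    """Integrate a velocity field of shape (T, K, 2) into displacements.

    T frames, K control points, 2D velocities. This layout is an
    assumption for illustration, not the paper's parameterization.
    """
    velocity = np.asarray(velocity, dtype=np.float64)
    # Forward-Euler temporal integration: d_t = sum over s <= t of v_s * dt.
    displacement = np.cumsum(velocity * dt, axis=0)
    if max_disp is not None:
        # Assumed regularizer: rescale any displacement longer than max_disp.
        norms = np.linalg.norm(displacement, axis=-1, keepdims=True)
        scale = np.minimum(1.0, max_disp / np.maximum(norms, 1e-12))
        displacement = displacement * scale
    return displacement


def perturb_trajectory(points, velocity, dt=1.0, max_disp=None):
    """Add the integrated displacement to a clean trajectory of shape (T, K, 2)."""
    points = np.asarray(points, dtype=np.float64)
    return points + integrate_velocity(velocity, dt=dt, max_disp=max_disp)


# Usage: one control point drifting right at unit speed for four frames.
velocity = np.tile(np.array([1.0, 0.0]), (4, 1, 1))
clean = np.zeros((4, 1, 2))
perturbed = perturb_trajectory(clean, velocity, max_disp=2.5)
```

Because each frame's displacement is the running sum of small velocities, neighbouring frames differ by at most one velocity step, which is one way to read the abstract's claim that the perturbation stays temporally consistent.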
