Fugu-MT Paper Translation (Summary): Activation Steering of Video Generation Models via Reduced-Order Linear Optimal Control

Paper summary: Activation Steering of Video Generation Models via Reduced-Order Linear Optimal Control

arxiv url: http://arxiv.org/abs/2606.04775v1
Date: Wed, 03 Jun 2026 11:58:56 GMT
Status: Translation complete
System update date: 2026-06-04 20:44:18.726105
Title: Activation Steering of Video Generation Models via Reduced-Order Linear Optimal Control
Title (reference translation): Activation Steering of Video Generation Models via Low-Order Linear Optimal Control
Authors: Jihoon Hong, Alice Chan, Qiyue Dai, Julian Skifstad, Glen Chou
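Abstract summary: Activation steering offers an attractive mechanistic alternative to finetuning and prompt filtering. Latent Activation Linear-Quadratic Regulator (LA-LQR) is a reduced-order optimal control framework for minimally invasive T2V steering. LA-LQR formulates T2V inference as a dynamical system and computes closed-loop feedback interventions.
Reference score (independently computed attention measure): 3.3394856680250284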
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Text-to-video (T2V) models trained on large-scale web data can generate undesired content, motivating interventions that reduce harmful outputs without sacrificing visual quality. Activation steering offers an attractive mechanistic alternative to finetuning and prompt filtering, but existing T2V steering methods remain limited, typically applying coarse, non-anticipative interventions that can lead to oversteering and content degradation. To close this gap, we propose Latent Activation Linear-Quadratic Regulator (LA-LQR), a reduced-order optimal control framework for minimally invasive T2V steering. LA-LQR formulates T2V inference as a dynamical system and computes closed-loop feedback interventions that steer activations toward desired feature setpoints while penalizing unnecessary perturbations. To make optimal control feasible for high-dimensional video activations, we project activations onto a low-dimensional, task-relevant subspace derived from contrastive prompt pairs, estimate local linear dynamics in this latent space, and solve a latent LQR problem to obtain timestep- and layer-specific steering signals. We provide theoretical bounds relating latent setpoint tracking to raw activation-space feature control, and empirically validate the fidelity of the reduced latent dynamics. On concept steering and video safety benchmarks, LA-LQR reduces unsafe generations relative to baselines, while preserving prompt fidelity and visual quality.
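Abstract (reference translation): Text-to-video (T2V) models trained on large-scale web data can generate undesired content, which motivates interventions that reduce harmful outputs without sacrificing visual quality. Activation steering offers an attractive mechanistic alternative to finetuning and prompt filtering, but existing T2V steering methods are limited: they typically apply coarse, non-anticipative interventions that cause oversteering and content degradation. To close this gap, we propose LA-LQR, a reduced-order optimal control framework for minimally invasive T2V steering. LA-LQR formulates T2V inference as a dynamical system and computes closed-loop feedback interventions that steer activations toward desired feature setpoints while penalizing unnecessary perturbations. To make optimal control feasible for high-dimensional video activations, we project activations onto a low-dimensional, task-relevant subspace derived from contrastive prompt pairs, estimate local linear dynamics in this latent space, and solve a latent LQR problem to obtain timestep- and layer-specific steering signals. We provide theoretical bounds relating latent setpoint tracking to feature control in the raw activation space, and empirically validate the fidelity of the reduced latent dynamics. On concept steering and video safety benchmarks, LA-LQR reduces unsafe generations relative to baselines while preserving prompt fidelity and visual quality.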
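The pipeline the abstract describes (project onto a contrastive subspace, fit local linear latent dynamics, solve a latent LQR problem) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of an SVD of paired activation differences, the least-squares dynamics fit, the choice B = I for the control matrix, and the feedforward term used for setpoint tracking are all assumptions made here for illustration.

```python
import numpy as np

def contrastive_subspace(pos, neg, k):
    """Hypothetical helper: top-k principal directions of paired
    activation differences. pos, neg have shape (n_pairs, d).
    Returns W with shape (d, k) and orthonormal columns."""
    diffs = pos - neg
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k].T

def fit_latent_dynamics(z, z_next):
    """Least-squares fit of z_{t+1} ~ A z_t from row-stacked samples
    z and z_next, each of shape (n_samples, k)."""
    a_transposed, *_ = np.linalg.lstsq(z, z_next, rcond=None)
    return a_transposed.T

def lqr_gains(A, B, Q, R, horizon):
    """Finite-horizon discrete-time LQR via the backward Riccati
    recursion. Returns the feedback gains [K_0, ..., K_{T-1}]."""
    P = Q.copy()  # terminal cost-to-go matrix
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # computed backward in time, so reverse

def simulate_tracking(A, gains, z0, z_star):
    """Closed-loop rollout toward setpoint z_star, assuming the control
    enters additively (B = I). The feedforward term (I - A) z_star
    cancels the drift, so the error obeys e_{t+1} = (A - K_t) e_t."""
    eye = np.eye(len(z0))
    traj = [np.asarray(z0, dtype=float)]
    for K in gains:
        z = traj[-1]
        u = -K @ (z - z_star) + (eye - A) @ z_star
        traj.append(A @ z + u)
    return np.stack(traj)
```

In this sketch, the weight `R` plays the role of the penalty on unnecessary perturbations: a larger `R` yields smaller gains and gentler interventions. Mapping a latent control `u` back to activation space (for example as `W @ u`) is likewise an assumption, not something the abstract specifies.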


Related papers