Fugu-MT Paper Translation (Abstract): FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance

Paper overview: FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance

arxiv url: http://arxiv.org/abs/2603.12146v1
Date: Thu, 12 Mar 2026 16:45:53 GMT
Status: Translation complete
Last updated in system: 2026-03-13 14:46:26.226763
Title: FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance
Title (reference translation): FlashMotion: Video Generation with Trajectory Guidance
Authors: Quanhao Li, Zhen Xing, Rui Wang, Haidong Cao, Qi Dai, Daoguo Dong, Zuxuan Wu
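Abstract summary: FlashMotion is a training framework designed for few-step trajectory-controllable video generation. First, a trajectory adapter is trained on a multi-step video generator for precise trajectory control. Next, the generator is distilled into a few-step version to accelerate video generation. Finally, the adapter is fine-tuned using a hybrid strategy that combines diffusion and adversarial objectives.
Reference score (independently calculated attention score): 53.696862625103144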
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in trajectory-controllable video generation have achieved remarkable progress. Previous methods mainly use adapter-based architectures for precise motion control along predefined trajectories. However, all these methods rely on a multi-step denoising process, leading to substantial time redundancy and computational overhead. While existing video distillation methods successfully distill multi-step generators into few-step, directly applying these approaches to trajectory-controllable video generation results in noticeable degradation in both video quality and trajectory accuracy. To bridge this gap, we introduce FlashMotion, a novel training framework designed for few-step trajectory-controllable video generation. We first train a trajectory adapter on a multi-step video generator for precise trajectory control. Then, we distill the generator into a few-step version to accelerate video generation. Finally, we finetune the adapter using a hybrid strategy that combines diffusion and adversarial objectives, aligning it with the few-step generator to produce high-quality, trajectory-accurate videos. For evaluation, we introduce FlashBench, a benchmark for long-sequence trajectory-controllable video generation that measures both video quality and trajectory accuracy across varying numbers of foreground objects. Experiments on two adapter architectures show that FlashMotion surpasses existing video distillation methods and previous multi-step models in both visual quality and trajectory consistency.
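Abstract (reference translation): Trajectory-controllable video generation has made remarkable progress in recent years. Previous methods have mainly used adapter-based architectures to achieve precise motion control along predefined trajectories. However, all of these methods depend on a multi-step denoising process, which incurs substantial time redundancy and computational overhead. Existing video distillation methods have succeeded in distilling multi-step generators into few-step ones, but applying them directly to trajectory-controllable video generation noticeably degrades both video quality and trajectory accuracy. To bridge this gap, we introduce FlashMotion, a new training framework designed for few-step trajectory-controllable video generation. First, a trajectory adapter is trained on a multi-step video generator for precise trajectory control. Next, the generator is distilled into a few-step version to accelerate video generation. Finally, the adapter is fine-tuned using a hybrid strategy that combines diffusion and adversarial objectives, aligning it with the few-step generator to produce high-quality, trajectory-accurate videos. For evaluation, we introduce FlashBench, a benchmark for long-sequence trajectory-controllable video generation that measures both video quality and trajectory accuracy across varying numbers of foreground objects. Experiments on two adapter architectures show that FlashMotion surpasses existing video distillation methods and previous multi-step models in both visual quality and trajectory consistency.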
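Illustrative sketch (not from the paper): the abstract says the adapter is fine-tuned with a hybrid strategy combining diffusion and adversarial objectives, but it does not give the loss formulas. The Python snippet below shows one common way such a combination could look, assuming a mean-squared-error diffusion term and a non-saturating adversarial term weighted by a scalar. The names `hybrid_adapter_loss`, `fake_logits`, and `adv_weight`, and the default weight of 0.1, are hypothetical and chosen only for illustration.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def hybrid_adapter_loss(pred, target, fake_logits, adv_weight=0.1):
    """Hypothetical hybrid objective (an assumption, not the paper's formula).

    diffusion term   : MSE between the model prediction and its target
    adversarial term : non-saturating generator loss, softplus(-logit),
                       averaged over discriminator logits for generated samples
    """
    diffusion_term = mse(pred, target)
    adversarial_term = sum(softplus(-z) for z in fake_logits) / len(fake_logits)
    return diffusion_term + adv_weight * adversarial_term


if __name__ == "__main__":
    # Toy numbers only; a real setup would use video latents and a discriminator.
    print(hybrid_adapter_loss([0.2, 0.9], [0.0, 1.0], [0.5, -1.0]))
```

In a setup like this, the scalar weight trades off fidelity to the diffusion target against how realistic the discriminator judges the few-step outputs to be; how the paper actually balances the two terms is not stated in the abstract.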


Related papers