Fugu-MT paper translation (abstract): CoDMD: Copula-aware Distribution Matching Distillation for Fast Video Generation

Paper overview: CoDMD: Copula-aware Distribution Matching Distillation for Fast Video Generation

arxiv url: http://arxiv.org/abs/2606.21982v1
Date: Sat, 20 Jun 2026 10:33:14 GMT
Status: Translation complete
System update date: 2026-06-25 23:26:01.118906
Title: CoDMD: Copula-aware Distribution Matching Distillation for Fast Video Generation
Title (reference translation): CoDMD: Copula-aware Distribution Matching Distillation for Fast Video Generation
Authors: Wenhu Zhang, Kun Cheng, Changyuan Wang, Shiyao Li, Yuechen Zhang, Wenbo Li, Jiajun Zha, Jingyi Zhang, Kang Zhao, Jiaya Jia
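Abstract summary: Few-step distillation for video diffusion models has attracted significant attention, driven by the urgent demand for efficient deployment in real-world scenarios. We propose Copula-aware DMD (CoDMD), a lightweight relational regularizer that reuses score estimates already produced by the frozen teacher and the online fake model to construct pairwise relation matrices. On the Wan-2.1-T2V model series at 1.3B and 14B scales, CoDMD distills 50-step teachers into 4-step students, achieving an approximate 25$\times$ speed-up while attaining VBench scores of 84.46 and 84.87.
Reference score (independently computed attention score): 50.353919095724315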
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Few-step distillation for video diffusion models has attracted significant attention, driven by the urgent demand for efficient deployment in real-world scenarios. However, Distribution Matching Distillation (DMD), a leading paradigm, tends to degrade under limited NFE budgets, manifesting in video generation as layout instability, oversaturation, and broken motion dynamics. We trace this failure to a structural limitation: standard DMD is an intra-sample distribution-matching objective with coordinate-wise gradients, and thus imposes no explicit constraint on the relational geometry across batch elements or temporal frames, leaving the underlying copula largely unregulated. Combined with the mode-seeking tendency of its reverse-KL objective, this absence of relational guidance makes DMD prone to collapsing into local optima in the few-step regime. Motivated by this insight, we propose Copula-aware DMD (CoDMD), a lightweight relational regularizer that reuses score estimates already produced by the frozen teacher and the online fake model to construct pairwise relation matrices across samples and frames. These are matched through a supplementary distributional objective that requires no additional networks, datasets, or sampling trajectories. On the Wan-2.1-T2V model series at 1.3B & 14B scales, CoDMD distills 50-step teachers into 4-step students, achieving an approximate 25$\times$ speed-up while attaining VBench scores of 84.46 & 84.87, outperforming prior trajectory-based (rCM 82.81 & 84.05) and distribution-based (DMD 83.38 & 83.81) methods.
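Abstract (reference translation): Few-step distillation for video diffusion models has attracted attention, driven by the urgent demand for efficient deployment in real-world scenarios. However, Distribution Matching Distillation (DMD), a leading paradigm, tends to degrade under limited NFE budgets, which shows up in video generation as layout instability, oversaturation, and broken motion dynamics. Standard DMD is an intra-sample distribution-matching objective with coordinate-wise gradients, so it imposes no explicit constraint on the relational geometry across batch elements or temporal frames and leaves the underlying copula largely unregulated. Combined with the mode-seeking tendency of its reverse-KL objective, this absence of relational guidance makes DMD prone to collapsing into local optima in the few-step regime. Motivated by this insight, we propose Copula-aware DMD (CoDMD), a lightweight relational regularizer that reuses score estimates already produced by the frozen teacher and the online fake model to construct pairwise relation matrices across samples and frames. These are matched through a supplementary distributional objective that requires no additional networks, datasets, or sampling trajectories. On the Wan-2.1-T2V model series at 1.3B and 14B scales, CoDMD distills 50-step teachers into 4-step students, achieving a 25$\times$ speed-up while attaining VBench scores of 84.46 and 84.87.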
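The idea of matching pairwise relation matrices built from score estimates can be illustrated with a small sketch. The abstract does not say how the relation matrices or the matching objective are defined, so everything here is an assumption made for illustration: cosine similarity as the pairwise relation, a mean squared difference as the matching loss, and the hypothetical helper names `relation_matrix` and `relation_matching_loss`. It is not the authors' implementation.

```python
import math

def relation_matrix(scores):
    """Pairwise cosine-similarity matrix over flattened score vectors.

    Assumption: cosine similarity stands in for the paper's (unspecified)
    pairwise relation between batch elements or frames.
    """
    # Guard against zero-norm vectors by falling back to a norm of 1.0.
    norms = [math.sqrt(sum(x * x for x in v)) or 1.0 for v in scores]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv) for v, nv in zip(scores, norms)]
        for u, nu in zip(scores, norms)
    ]

def relation_matching_loss(teacher_scores, fake_scores):
    """Mean squared difference between the two relation matrices.

    Assumption: a simple MSE stands in for the paper's (unspecified)
    supplementary distributional objective.
    """
    r_t = relation_matrix(teacher_scores)
    r_f = relation_matrix(fake_scores)
    n = len(r_t)
    return sum((r_t[i][j] - r_f[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

# Toy score estimates for three batch elements (or frames), two coordinates each.
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fake_same_geometry = [[2.0, 0.0], [0.0, 3.0], [4.0, 4.0]]  # rescaled, same angles
fake_collapsed = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]      # every sample identical

loss_same = relation_matching_loss(teacher, fake_same_geometry)
loss_collapsed = relation_matching_loss(teacher, fake_collapsed)
print(loss_same, loss_collapsed)
```

In this toy setting the rescaled scores keep the same pairwise geometry and give a loss near zero, while the collapsed scores, where all samples point the same way, give a clearly positive loss. A penalty of this kind acts across samples, which is the type of constraint the abstract says a coordinate-wise objective does not impose.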

Related papers