Fugu-MT Paper Translation (Abstract): Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals

Paper overview: Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals

arxiv url: http://arxiv.org/abs/2510.27684v1
Date: Fri, 31 Oct 2025 17:55:10 GMT
Status: Translation complete
System update date: 2025-11-03 17:52:16.196562
Title: Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals
Title (reference translation): Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals
Authors: Xiangyu Fan, Zesong Qiu, Zhuguanyu Wu, Fanzhou Wang, Zhiqian Lin, Tianxiang Ren, Dahua Lin, Ruihao Gong, Lei Yang
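Abstract summary: Phased DMD is a multi-step distillation framework that bridges the idea of phase-wise distillation with Mixture-of-Experts. Phased DMD is built upon two key ideas: progressive distribution matching and score matching within subintervals. Experimental results show that Phased DMD preserves output diversity better than DMD while retaining key generative capabilities.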
Reference score (independently computed attention level): 48.14879329270912
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Distribution Matching Distillation (DMD) distills score-based generative models into efficient one-step generators, without requiring a one-to-one correspondence with the sampling trajectories of their teachers. However, limited model capacity causes one-step distilled models to underperform on complex generative tasks, e.g., synthesizing intricate object motions in text-to-video generation. Directly extending DMD to multi-step distillation increases memory usage and computational depth, leading to instability and reduced efficiency. While prior works propose stochastic gradient truncation as a potential solution, we observe that it substantially reduces the generation diversity of multi-step distilled models, bringing it down to the level of their one-step counterparts. To address these limitations, we propose Phased DMD, a multi-step distillation framework that bridges the idea of phase-wise distillation with Mixture-of-Experts (MoE), reducing learning difficulty while enhancing model capacity. Phased DMD is built upon two key ideas: progressive distribution matching and score matching within subintervals. First, our model divides the SNR range into subintervals, progressively refining the model to higher SNR levels, to better capture complex distributions. Next, to ensure the training objective within each subinterval is accurate, we have conducted rigorous mathematical derivations. We validate Phased DMD by distilling state-of-the-art image and video generation models, including Qwen-Image (20B parameters) and Wan2.2 (28B parameters). Experimental results demonstrate that Phased DMD preserves output diversity better than DMD while retaining key generative capabilities. We will release our code and models.
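Abstract (reference translation): Distribution Matching Distillation (DMD) distills score-based generative models into efficient one-step generators without requiring a one-to-one correspondence with the teacher's sampling trajectories. However, limited model capacity degrades the performance of one-step distilled models on complex generative tasks, such as synthesizing intricate object motions in text-to-video generation. Extending DMD directly to multi-step distillation increases memory usage and computational depth, which leads to instability and reduced efficiency. Prior work proposed stochastic gradient truncation as a potential solution, but it is observed to substantially reduce the generation diversity of multi-step distilled models, lowering it to the level of one-step distilled models. To address these limitations, the authors propose Phased DMD, a multi-step distillation framework that uses Mixture-of-Experts (MoE). Phased DMD is built upon two key ideas: progressive distribution matching and score matching within subintervals. First, the SNR range is divided into subintervals and the model is progressively refined toward higher SNR levels to better capture complex distributions. Next, rigorous mathematical derivations are carried out to ensure that the training objective within each subinterval is accurate. Phased DMD is validated by distilling state-of-the-art image and video generation models, including Qwen-Image (20B parameters) and Wan2.2 (28B parameters). Experimental results show that Phased DMD preserves output diversity better than DMD while retaining key generative capabilities. The code and models will be released.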


Related papers