Fugu-MT Paper Translation (Abstract): Score Distillation of Flow Matching Models

Paper overview: Score Distillation of Flow Matching Models

arxiv url: http://arxiv.org/abs/2509.25127v1
Date: Mon, 29 Sep 2025 17:45:48 GMT
Status: Translation complete
Last updated in system: 2025-09-30 22:32:20.179669
Title: Score Distillation of Flow Matching Models
Title (reference translation): Score Distillation of Flow Matching Models
Authors: Mingyuan Zhou, Yi Gu, Huangjie Zheng, Liangchen Song, Guande He, Yizhe Zhang, Wenze Hu, Yinfei Yang
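Abstract summary: We extend Score identity Distillation (SiD) to pretrained text-to-image flow-matching models. SiD works out of the box across these models in both data-free and data-aided settings. This provides the first systematic evidence that score distillation applies broadly to text-to-image flow matching models.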
Reference score (independently computed attention measure): 67.86066177182046
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion models achieve high-quality image generation but are limited by slow iterative sampling. Distillation methods alleviate this by enabling one- or few-step generation. Flow matching, originally introduced as a distinct framework, has since been shown to be theoretically equivalent to diffusion under Gaussian assumptions, raising the question of whether distillation techniques such as score distillation transfer directly. We provide a simple derivation -- based on Bayes' rule and conditional expectations -- that unifies Gaussian diffusion and flow matching without relying on ODE/SDE formulations. Building on this view, we extend Score identity Distillation (SiD) to pretrained text-to-image flow-matching models, including SANA, SD3-Medium, SD3.5-Medium/Large, and FLUX.1-dev, all with DiT backbones. Experiments show that, with only modest flow-matching- and DiT-specific adjustments, SiD works out of the box across these models, in both data-free and data-aided settings, without requiring teacher finetuning or architectural changes. This provides the first systematic evidence that score distillation applies broadly to text-to-image flow matching models, resolving prior concerns about stability and soundness and unifying acceleration techniques across diffusion- and flow-based generators. We will make the PyTorch implementation publicly available.
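Abstract (reference translation): Diffusion models achieve high-quality image generation but are limited by slow iterative sampling. Distillation methods mitigate this by enabling generation in one or a few steps. Flow matching, first introduced as a separate framework, has been shown to be theoretically equivalent to diffusion under Gaussian assumptions, which raises the question of whether distillation techniques such as score distillation carry over directly. We provide a simple derivation, based on Bayes' rule and conditional expectations, that unifies Gaussian diffusion and flow matching without relying on ODE/SDE formulations. Building on this view, we extend SiD (Score identity Distillation) to pretrained text-to-image flow-matching models such as SANA, SD3-Medium, SD3.5-Medium/Large, and FLUX.1-dev. Experiments show that, with only flow-matching- and DiT-specific adjustments, SiD works across these models in both data-free and data-aided settings, without requiring teacher finetuning or architectural changes. This is the first systematic evidence that score distillation applies broadly to text-to-image flow matching models, resolving prior concerns about stability and soundness and unifying acceleration techniques across diffusion- and flow-based generators. The PyTorch implementation will be made publicly available.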
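The equivalence mentioned in the abstract can be illustrated with a small sketch. It is not the paper's derivation or its PyTorch implementation. It assumes the common rectified-flow convention x_t = (1 - t) * x_0 + t * eps with eps ~ N(0, 1) and velocity target v = eps - x_0; the function names and the 1-D Gaussian toy data are hypothetical, chosen only so the result can be checked in closed form. Under these assumptions, a velocity prediction converts linearly into a clean-data estimate, a noise estimate, and a score.

```python
# Illustrative sketch only: assumes x_t = (1 - t) * x_0 + t * eps, eps ~ N(0, 1),
# and velocity v = E[eps - x_0 | x_t]. All names here are hypothetical.

def velocity_to_x0(x_t, v, t):
    """Clean-data estimate E[x_0 | x_t] implied by the velocity."""
    return x_t - t * v

def velocity_to_eps(x_t, v, t):
    """Noise estimate E[eps | x_t] implied by the velocity."""
    return x_t + (1.0 - t) * v

def velocity_to_score(x_t, v, t):
    """Score of the marginal of x_t: -E[eps | x_t] / t (requires t > 0)."""
    return -velocity_to_eps(x_t, v, t) / t

def toy_velocity(x_t, t, mu, s):
    """Exact conditional velocity when x_0 ~ N(mu, s^2) (toy assumption)."""
    var_t = (1.0 - t) ** 2 * s ** 2 + t ** 2      # variance of x_t
    centered = x_t - (1.0 - t) * mu               # x_t minus its mean
    x0_hat = mu + (1.0 - t) * s ** 2 / var_t * centered
    eps_hat = t / var_t * centered
    return eps_hat - x0_hat

def toy_score(x_t, t, mu, s):
    """Exact score of x_t ~ N((1 - t) * mu, (1 - t)^2 * s^2 + t^2)."""
    var_t = (1.0 - t) ** 2 * s ** 2 + t ** 2
    return -(x_t - (1.0 - t) * mu) / var_t

if __name__ == "__main__":
    x_t, t, mu, s = 0.3, 0.4, 1.5, 2.0
    v = toy_velocity(x_t, t, mu, s)
    print(velocity_to_score(x_t, v, t), toy_score(x_t, t, mu, s))
```

In this toy case the score recovered from the velocity agrees with the analytic Gaussian score, which is the kind of interchangeability that lets score-based distillation objectives be applied to a velocity-predicting teacher. Other time or noise-schedule conventions would change the conversion coefficients.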

Related papers