Fugu-MT paper translation (abstract): Score-based Idempotent Distillation of Diffusion Models

Paper overview: Score-based Idempotent Distillation of Diffusion Models

arxiv url: http://arxiv.org/abs/2509.21470v1
Date: Thu, 25 Sep 2025 19:36:10 GMT
Status: translation complete
Last updated in system: 2025-09-29 20:57:53.946376
Title: Score-based Idempotent Distillation of Diffusion Models
Title (reference translation): Score-based idempotent distillation of diffusion models
Authors: Shehtab Zaman, Chengyan Liu, Kenneth Chiu
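Abstract summary: Idempotent generative networks (IGNs) are a new class of generative models based on an idempotent mapping onto a target manifold. This work unites diffusion and IGNs by distilling idempotent models from diffusion model scores, an approach called SIGN. The proposed method is highly stable and does not require adversarial losses. The authors provide a theoretical analysis of the method and empirically show that IGNs can be effectively distilled from a pre-trained diffusion model.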
Reference score (independently computed attention score): 0.9367224590861915
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Idempotent generative networks (IGNs) are a new line of generative models based on idempotent mapping to a target manifold. IGNs support both single- and multi-step generation, allowing for a flexible trade-off between computational cost and sample quality. But similar to Generative Adversarial Networks (GANs), conventional IGNs require adversarial training and are prone to training instabilities and mode collapse. Diffusion and score-based models are popular approaches to generative modeling that iteratively transport samples from one distribution, usually a Gaussian, to a target data distribution. These models have gained popularity due to their stable training dynamics and high-fidelity generation quality. However, this stability and quality come at a high computational cost, as the data must be transported incrementally along the entire trajectory. New sampling methods, model distillation, and consistency models have been developed to reduce the sampling cost and even perform one-shot sampling from diffusion models. In this work, we unite diffusion and IGNs by distilling idempotent models from diffusion model scores, an approach we call SIGN. Our proposed method is highly stable and does not require adversarial losses. We provide a theoretical analysis of our proposed score-based training methods and empirically show that IGNs can be effectively distilled from a pre-trained diffusion model, enabling faster inference than iterative score-based models. SIGNs can perform multi-step sampling, allowing users to trade off quality for efficiency. These models operate directly on the source domain; they can project corrupted or alternate distributions back onto the target manifold, enabling zero-shot editing of inputs. We validate our models on multiple image datasets, achieving state-of-the-art results for idempotent models on the CIFAR and CelebA datasets.
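Abstract (reference translation): Idempotent generative networks (IGNs) are a new class of generative models based on an idempotent mapping onto a target manifold. IGNs support both single-step and multi-step generation, which allows a flexible trade-off between computational cost and sample quality. However, like generative adversarial networks (GANs), conventional IGNs require adversarial training and are prone to training instability and mode collapse. Diffusion models and score-based models are common approaches to generative modeling that iteratively transport samples from one distribution (usually a Gaussian) to the target data distribution. These models have become popular because of their stable training dynamics and high-fidelity generation quality. This stability and quality come at a high computational cost, however, because the data must be transported incrementally along the entire trajectory. New sampling methods, model distillation, and consistency models have been developed to reduce the sampling cost, and even to perform one-shot sampling from diffusion models. This work unites diffusion and IGNs by distilling idempotent models from diffusion model scores, an approach called SIGN. The proposed method is highly stable and does not require adversarial losses. The authors provide a theoretical analysis of the proposed score-based training methods and empirically show that IGNs can be effectively distilled from a pre-trained diffusion model, enabling faster inference than iterative score-based models. SIGNs can perform multi-step sampling, letting users trade quality for efficiency. Because these models operate directly on the source domain, they can project corrupted or alternate distributions back onto the target manifold, which enables zero-shot editing of inputs. The models are validated on multiple image datasets and achieve state-of-the-art results for idempotent models on the CIFAR and CelebA datasets.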
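
The idempotence property that the abstract relies on, f(f(x)) = f(x), can be illustrated with a toy example. The sketch below is not the SIGN method or any code from the paper: it assumes a hand-written projection onto the unit circle as a stand-in for a learned network, and the names `project_to_circle` and `idempotence_gap` are hypothetical. It only shows what it means for a map to send Gaussian source samples onto a target manifold in one step, to leave on-manifold points unchanged, and to pull a corrupted point back onto the manifold.

```python
import math
import random


def project_to_circle(point):
    """Toy idempotent map: send a 2-D point to the nearest point on the unit circle.

    Assumption for illustration only: this stands in for a learned network f
    and is NOT the SIGN model.
    """
    x, y = point
    norm = math.hypot(x, y)
    if norm == 0.0:
        # The origin has no unique nearest point; pick one by convention.
        return (1.0, 0.0)
    return (x / norm, y / norm)


def idempotence_gap(f, point):
    """Distance between f(f(x)) and f(x); zero for an exactly idempotent map."""
    once = f(point)
    twice = f(once)
    return math.dist(once, twice)


random.seed(0)

# Source samples drawn from a Gaussian, mapped to the manifold in one step.
noise = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(5)]
samples = [project_to_circle(p) for p in noise]

# Applying the map a second time should change (almost) nothing.
max_gap = max(idempotence_gap(project_to_circle, p) for p in noise)

# A corrupted on-manifold point is pulled back onto the manifold.
corrupted = (samples[0][0] + 0.3, samples[0][1] - 0.2)
repaired = project_to_circle(corrupted)
```

In a learned setting the gap between f(f(x)) and f(x) would typically be nonzero and would serve as a training or evaluation signal; here it vanishes up to floating-point error because the projection is idempotent by construction.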

Related papers