One-step Diffusion Models with $f$-Divergence Distribution Matching
- URL: http://arxiv.org/abs/2502.15681v2
- Date: Sun, 09 Mar 2025 22:53:27 GMT
- Title: One-step Diffusion Models with $f$-Divergence Distribution Matching
- Authors: Yilun Xu, Weili Nie, Arash Vahdat
- Abstract summary: Recent approaches distill a multi-step diffusion model into a single-step student generator via variational score distillation. These approaches use the reverse Kullback-Leibler (KL) divergence for distribution matching, which is known to be mode-seeking. In this paper, we generalize the distribution matching approach using a novel $f$-divergence minimization framework.
- Score: 41.21390253053562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sampling from diffusion models involves a slow iterative process that hinders their practical deployment, especially for interactive applications. To accelerate generation speed, recent approaches distill a multi-step diffusion model into a single-step student generator via variational score distillation, which matches the distribution of samples generated by the student to the teacher's distribution. However, these approaches use the reverse Kullback-Leibler (KL) divergence for distribution matching which is known to be mode seeking. In this paper, we generalize the distribution matching approach using a novel $f$-divergence minimization framework, termed $f$-distill, that covers different divergences with different trade-offs in terms of mode coverage and training variance. We derive the gradient of the $f$-divergence between the teacher and student distributions and show that it is expressed as the product of their score differences and a weighting function determined by their density ratio. This weighting function naturally emphasizes samples with higher density in the teacher distribution, when using a less mode-seeking divergence. We observe that the popular variational score distillation approach using the reverse-KL divergence is a special case within our framework. Empirically, we demonstrate that alternative $f$-divergences, such as forward-KL and Jensen-Shannon divergences, outperform the current best variational score distillation methods across image generation tasks. In particular, when using Jensen-Shannon divergence, $f$-distill achieves current state-of-the-art one-step generation performance on ImageNet64 and zero-shot text-to-image generation on MS-COCO. Project page: https://research.nvidia.com/labs/genair/f-distill
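The gradient structure described in the abstract can be made concrete with a small sketch. Writing the divergence as $D_f(p \| q_\theta) = \mathbb{E}_{q_\theta}[f(p/q_\theta)]$ with teacher $p$ and student $q_\theta$, the per-sample weighting implied by the abstract takes the form $h(r) = r^2 f''(r)$ with density ratio $r = p/q$: reverse-KL gives a constant weight (recovering variational score distillation / DMD as the special case noted in the abstract), while forward-KL and Jensen-Shannon upweight samples where the teacher density is high. The snippet below is a minimal, illustrative sketch under these assumptions; the function names, the surrogate loss construction, and the use of a learned log density-ratio estimate are illustrative choices, not the authors' released implementation.

```python
import torch

# Sketch of the f-distill-style gradient described in the abstract (not the authors' code).
# For D_f(p || q_theta) = E_q[ f(p/q) ], the generator gradient has the form
#   E_z[ h(r) * (score_student(x) - score_teacher(x)) * dG_theta(z)/dtheta ],  x = G_theta(z),
# where r = p(x)/q(x) and the weighting h(r) = r^2 f''(r) depends only on the density ratio.

def grad_weight(r: torch.Tensor, divergence: str) -> torch.Tensor:
    """Per-sample weighting h(r) = r^2 f''(r) for a few choices of f."""
    if divergence == "reverse-kl":   # f(r) = -log r   ->  h(r) = 1 (constant; VSD/DMD special case)
        return torch.ones_like(r)
    if divergence == "forward-kl":   # f(r) = r log r  ->  h(r) = r (upweights high teacher density)
        return r
    if divergence == "js":           # Jensen-Shannon  ->  h(r) = r / (2(r + 1)) (bounded weight)
        return 0.5 * r / (r + 1.0)
    raise ValueError(f"unknown divergence: {divergence}")

def distillation_loss(x, score_teacher, score_student, log_ratio, divergence="js"):
    """Surrogate loss whose gradient w.r.t. the generator is the weighted score difference.

    x:             generated samples G_theta(z), differentiable w.r.t. the generator
    score_teacher: frozen teacher score, grad_x log p(x)
    score_student: estimated student score, grad_x log q(x) (e.g. from an auxiliary score network)
    log_ratio:     estimate of log p(x) - log q(x) (assumed to come from a small discriminator)
    """
    h = grad_weight(log_ratio.exp(), divergence).detach()            # weight per sample, no gradient
    direction = (score_student - score_teacher).detach()             # stop-grad: a constant direction
    return (h * (direction * x).flatten(1).sum(dim=1)).mean()        # d(loss)/dx = h * direction
```

Minimizing this surrogate pushes each generated sample in the direction of the teacher score relative to the student score, scaled by the divergence-specific weight; setting `divergence="reverse-kl"` reduces it to the standard distribution-matching (DMD-style) update.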
Related papers
- Non-Normal Diffusion Models [3.5534933448684134]
Diffusion models generate samples by incrementally reversing a process that turns data into noise.
We show that when the step size goes to zero, the reversed process is invariant to the distribution of the noise increments used in the forward process.
We demonstrate the effectiveness of these models on density estimation and generative modeling tasks on standard image datasets.
arXiv Detail & Related papers (2024-12-10T21:31:12Z)
- Training Neural Samplers with Reverse Diffusive KL Divergence [36.549460449020906]
Training generative models to sample from unnormalized density functions is an important and challenging task in machine learning.
Traditional training methods often rely on the reverse Kullback-Leibler (KL) divergence due to its tractability.
We propose to minimize the reverse KL along diffusion trajectories of both model and target densities.
We demonstrate that our method enhances sampling performance across various Boltzmann distributions.
arXiv Detail & Related papers (2024-10-16T11:08:02Z)
- DDIL: Improved Diffusion Distillation With Imitation Learning [57.3467234269487]
Diffusion models excel at generative modeling (e.g., text-to-image) but sampling requires multiple denoising network passes.
Progressive distillation and consistency distillation have shown promise by reducing the number of passes.
We show that DDIL consistently improves on the baseline algorithms of progressive distillation (PD), Latent Consistency Models (LCM), and Distribution Matching Distillation (DMD2).
arXiv Detail & Related papers (2024-10-15T18:21:47Z)
- Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator [81.81748032199813]
Diffusion models have demonstrated excellent performance for real-world image super-resolution (Real-ISR).
We propose a new One-Step Diffusion model with a larger-scale Discriminator for SR.
Our discriminator is able to distill noisy features from any time step of diffusion models in the latent space.
arXiv Detail & Related papers (2024-10-05T16:41:36Z)
- One-step Diffusion with Distribution Matching Distillation [54.723565605974294]
We introduce Distribution Matching Distillation (DMD), a procedure to transform a diffusion model into a one-step image generator.
We enforce that the one-step image generator matches the diffusion model at the distribution level by minimizing an approximate KL divergence.
Our method outperforms all published few-step diffusion approaches, reaching 2.62 FID on ImageNet 64x64 and 11.49 FID on zero-shot COCO-30k.
arXiv Detail & Related papers (2023-11-30T18:59:20Z)
- Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called GMS for diffusion models.
Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z)
- Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)