Reducing Spatial Fitting Error in Distillation of Denoising Diffusion
Models
- URL: http://arxiv.org/abs/2311.03830v2
- Date: Thu, 21 Dec 2023 15:18:34 GMT
- Title: Reducing Spatial Fitting Error in Distillation of Denoising Diffusion
Models
- Authors: Shengzhe Zhou, Zejian Lee, Shengyuan Zhang, Lefan Hou, Changyuan Yang,
Guang Yang, Zhiyuan Yang, Lingyun Sun
- Abstract summary: Knowledge distillation for diffusion models shortens the sampling process, which otherwise requires a large number of iterations, but it degrades generative quality.
We attribute the degradation to the spatial fitting error occurring in the training of both the teacher and student model.
SFERD utilizes attention guidance from the teacher model and a designed semantic gradient predictor to reduce the student's fitting error.
We achieve an FID of 5.31 on CIFAR-10 and 9.39 on ImageNet 64$\times$64 with only one step, outperforming existing diffusion methods.
- Score: 13.364271265023953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Denoising diffusion models have exhibited remarkable capabilities in image
generation. However, generating high-quality samples requires a large number of
iterations. Knowledge distillation for diffusion models is an effective method
to address this limitation with a shortened sampling process but causes
degraded generative quality. Based on our analysis with bias-variance
decomposition and experimental observations, we attribute the degradation to
the spatial fitting error occurring in the training of both the teacher and
student model. Accordingly, we propose the Spatial Fitting-Error Reduction
Distillation model (SFERD). SFERD utilizes attention
guidance from the teacher model and a designed semantic gradient predictor to
reduce the student's fitting error. Empirically, our proposed model facilitates
high-quality sample generation in a few function evaluations. We achieve an FID
of 5.31 on CIFAR-10 and 9.39 on ImageNet 64$\times$64 with only one step,
outperforming existing diffusion methods. Our study provides a new perspective
on diffusion distillation by highlighting the intrinsic denoising ability of
models. Project link: \url{https://github.com/Sainzerjj/SFERD}.
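The abstract's analysis rests on bias-variance decomposition. As a generic illustration only (this is the textbook identity, not the paper's spatial fitting-error analysis), the sketch below checks numerically that mean squared error splits into squared bias plus variance. The function name `bias_variance`, the target value, and the simulated "model outputs" are all hypothetical choices made for this example.

```python
import random
import statistics


def bias_variance(estimates, target):
    """Return (mse, bias_sq, variance) of scalar estimates around a fixed target."""
    mean_est = statistics.fmean(estimates)
    mse = statistics.fmean([(e - target) ** 2 for e in estimates])
    bias_sq = (mean_est - target) ** 2
    # Population variance (divide by n), so the identity holds exactly.
    variance = statistics.fmean([(e - mean_est) ** 2 for e in estimates])
    return mse, bias_sq, variance


rng = random.Random(0)
target = 1.0
# Hypothetical predictions: a systematic offset of 0.3 plus Gaussian noise.
estimates = [target + 0.3 + rng.gauss(0.0, 0.5) for _ in range(10_000)]

mse, bias_sq, variance = bias_variance(estimates, target)
print(f"mse={mse:.4f}  bias^2={bias_sq:.4f}  variance={variance:.4f}")
print(f"bias^2 + variance = {bias_sq + variance:.4f}")
```

In this toy setting the systematic offset plays the role of a fitting error that averaging cannot remove, while the noise term contributes only to the variance.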
Related papers
- Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution [81.81748032199813]
We propose a Distillation-Free One-Step Diffusion model.
Specifically, we propose a noise-aware discriminator (NAD) to participate in adversarial training.
We improve the perceptual loss with edge-aware DISTS (EA-DISTS) to enhance the model's ability to generate fine details.
arXiv Detail & Related papers (2024-10-05T16:41:36Z)
- Informed Correctors for Discrete Diffusion Models [32.87362154118195]
We propose a family of informed correctors that more reliably counteracts discretization error by leveraging information learned by the model.
We also propose $k$-Gillespie's, a sampling algorithm that better utilizes each model evaluation, while still enjoying the speed and flexibility of $\tau$-leaping.
Across several real and synthetic datasets, we show that $k$-Gillespie's with informed correctors reliably produces higher quality samples at lower computational cost.
arXiv Detail & Related papers (2024-07-30T23:29:29Z)
- EM Distillation for One-step Diffusion Models [65.57766773137068]
We propose a maximum likelihood-based approach that distills a diffusion model to a one-step generator model with minimal loss of quality.
We develop a reparametrized sampling scheme and a noise cancellation technique that together stabilize the distillation process.
arXiv Detail & Related papers (2024-05-27T05:55:22Z)
- Distilling Diffusion Models into Conditional GANs [90.76040478677609]
We distill a complex multistep diffusion model into a single-step conditional GAN student model.
For efficient regression loss, we propose E-LatentLPIPS, a perceptual loss operating directly in the diffusion model's latent space.
We demonstrate that our one-step generator outperforms cutting-edge one-step diffusion distillation models.
arXiv Detail & Related papers (2024-05-09T17:59:40Z)
- Bridging the Gap: Addressing Discrepancies in Diffusion Model Training for Classifier-Free Guidance [1.6804613362826175]
Diffusion models have emerged as a pivotal advancement in generative models.
In this paper we aim to underscore a discrepancy between conventional training methods and the desired conditional sampling behavior.
We introduce an updated loss function that better aligns training objectives with sampling behaviors.
arXiv Detail & Related papers (2023-11-02T02:03:12Z)
- Soft Mixture Denoising: Beyond the Expressive Bottleneck of Diffusion Models [76.46246743508651]
We show that current diffusion models actually have an expressive bottleneck in backward denoising.
We introduce soft mixture denoising (SMD), an expressive and efficient model for backward denoising.
arXiv Detail & Related papers (2023-09-25T12:03:32Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- Interpreting and Improving Diffusion Models from an Optimization Perspective [4.5993996573872185]
We use this observation to interpret denoising diffusion models as approximate gradient descent applied to the Euclidean distance function.
We propose a new gradient-estimation sampler, generalizing DDIM using insights from our theoretical results.
arXiv Detail & Related papers (2023-06-08T00:56:33Z)
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
- Improved Denoising Diffusion Probabilistic Models [4.919647298882951]
We show that DDPMs can achieve competitive log-likelihoods while maintaining high sample quality.
We also find that learning variances of the reverse diffusion process allows sampling with an order of magnitude fewer forward passes.
We show that the sample quality and likelihood of these models scale smoothly with model capacity and training compute, making them easily scalable.
arXiv Detail & Related papers (2021-02-18T23:44:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.