Accelerating Diffusion Models with One-to-Many Knowledge Distillation
- URL: http://arxiv.org/abs/2410.04191v1
- Date: Sat, 5 Oct 2024 15:10:04 GMT
- Title: Accelerating Diffusion Models with One-to-Many Knowledge Distillation
- Authors: Linfeng Zhang, Kaisheng Ma
- Abstract summary: We introduce one-to-many knowledge distillation (O2MKD), which distills a single teacher diffusion model into multiple student diffusion models.
Experiments on CIFAR10, LSUN Church, CelebA-HQ with DDPM and COCO30K with Stable Diffusion show that O2MKD can be applied to previous knowledge distillation and fast sampling methods to achieve significant acceleration.
- Score: 35.130782477699704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have driven significant advances in image generation. Compared with previous generative models, however, they incur substantial computational overhead, which prevents real-time generation. Recent approaches accelerate diffusion models by reducing the number of sampling steps, through improved sampling techniques or step distillation. Methods that reduce the computational cost of each timestep remain relatively unexplored. Observing that diffusion models exhibit varying input distributions and feature distributions at different timesteps, we introduce one-to-many knowledge distillation (O2MKD), which distills a single teacher diffusion model into multiple student diffusion models, where each student diffusion model is trained to learn the teacher's knowledge for a subset of continuous timesteps. Experiments on CIFAR10, LSUN Church, CelebA-HQ with DDPM and COCO30K with Stable Diffusion show that O2MKD can be applied to previous knowledge distillation and fast sampling methods to achieve significant acceleration. Code will be released on GitHub.
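The routing idea in the abstract, where each student covers its own subset of timesteps, can be illustrated with a minimal sketch. It assumes that the timesteps are split into equally sized contiguous ranges, which the abstract does not specify. The names `partition_timesteps` and `student_index` are hypothetical and do not come from the paper's code.

```python
from typing import List, Sequence


def partition_timesteps(num_timesteps: int, num_students: int) -> List[range]:
    """Split [0, num_timesteps) into contiguous, near-equal ranges.

    Assumption: an even split. The paper may partition differently.
    """
    if not 1 <= num_students <= num_timesteps:
        raise ValueError("need 1 <= num_students <= num_timesteps")
    base, extra = divmod(num_timesteps, num_students)
    ranges: List[range] = []
    start = 0
    for i in range(num_students):
        # The first `extra` students each take one additional timestep.
        size = base + (1 if i < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def student_index(t: int, ranges: Sequence[range]) -> int:
    """Return the index of the student responsible for timestep t."""
    for i, r in enumerate(ranges):
        if t in r:
            return i
    raise ValueError(f"timestep {t} is outside every student's range")


if __name__ == "__main__":
    ranges = partition_timesteps(1000, 4)
    for t in (0, 249, 250, 999):
        print(t, "-> student", student_index(t, ranges))
```

Under this assumption, each student would be trained against the teacher only on timesteps drawn from its own range, and at sampling time the sampler would call `student_index(t, ranges)` at every step to pick the student to evaluate. Only one small student runs per step, so the per-step cost is that of a single student.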
Related papers
- DDIL: Improved Diffusion Distillation With Imitation Learning [57.3467234269487]
Diffusion models excel at generative modeling (e.g., text-to-image) but sampling requires multiple denoising network passes.
Progressive distillation and consistency distillation have shown promise by reducing the number of passes.
We show that DDIL consistently improves on the baseline algorithms of progressive distillation (PD), latent consistency models (LCM), and Distribution Matching Distillation (DMD2).
arXiv Detail & Related papers (2024-10-15T18:21:47Z) - Distillation of Discrete Diffusion through Dimensional Correlations [21.078500510691747]
"Mixture" models in discrete diffusion are capable of treating dimensional correlations while remaining scalable.
We empirically demonstrate that our proposed method for discrete diffusions works in practice, by distilling a continuous-time discrete diffusion model pretrained on the CIFAR-10 dataset.
arXiv Detail & Related papers (2024-10-11T10:53:03Z) - Relational Diffusion Distillation for Efficient Image Generation [27.127061578093674]
The high latency of diffusion models hinders their wide application on edge devices with scarce computing resources.
We propose Relational Diffusion Distillation (RDD), a novel distillation method tailored specifically for distilling diffusion models.
Our proposed RDD leads to a 1.47 FID decrease under 1 sampling step compared to state-of-the-art diffusion distillation methods and achieves a 256x speed-up.
arXiv Detail & Related papers (2024-10-10T07:40:51Z) - Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution [81.81748032199813]
We propose a Distillation-Free One-Step Diffusion model.
Specifically, we propose a noise-aware discriminator (NAD) to participate in adversarial training.
We improve the perceptual loss with edge-aware DISTS (EA-DISTS) to enhance the model's ability to generate fine details.
arXiv Detail & Related papers (2024-10-05T16:41:36Z) - Variational Distillation of Diffusion Policies into Mixture of Experts [26.315682445979302]
This work introduces Variational Diffusion Distillation (VDD), a novel method that distills denoising diffusion policies into Mixtures of Experts (MoE).
Diffusion Models are the current state-of-the-art in generative modeling due to their exceptional ability to accurately learn and represent complex, multi-modal distributions.
VDD is the first method that distills pre-trained diffusion models into MoE models, and hence, combines the expressiveness of Diffusion Models with the benefits of Mixture Models.
arXiv Detail & Related papers (2024-06-18T12:15:05Z) - SFDDM: Single-fold Distillation for Diffusion models [4.688721356965585]
We propose a single-fold distillation algorithm, SFDDM, which can flexibly compress the teacher diffusion model into a student model with any desired number of steps.
Experiments on four datasets demonstrate that SFDDM is able to sample high-quality data with steps reduced to as few as approximately 1%.
arXiv Detail & Related papers (2024-05-23T18:11:14Z) - Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining.
We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z) - BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z) - Structural Pruning for Diffusion Models [65.02607075556742]
We present Diff-Pruning, an efficient compression method tailored for learning lightweight diffusion models from pre-existing ones.
Our empirical assessment, undertaken across several datasets, highlights two primary benefits of our proposed method.
arXiv Detail & Related papers (2023-05-18T12:38:21Z) - On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.