Accelerating Diffusion Sampling with Classifier-based Feature
Distillation
- URL: http://arxiv.org/abs/2211.12039v1
- Date: Tue, 22 Nov 2022 06:21:31 GMT
- Title: Accelerating Diffusion Sampling with Classifier-based Feature
Distillation
- Authors: Wujie Sun, Defang Chen, Can Wang, Deshi Ye, Yan Feng, Chun Chen
- Abstract summary: Progressive distillation is proposed for fast sampling by progressively aligning the output images of an $N$-step teacher sampler with an $N/2$-step student sampler.
We distill the teacher's sharpened feature distribution into the student with a dataset-independent classifier to improve performance.
Experiments on CIFAR-10 show the superiority of our method in achieving high quality and fast sampling.
- Score: 20.704675568555082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although diffusion models have shown great potential for generating
higher-quality images than GANs, their slow sampling speed hinders wide application in
practice. Progressive distillation is thus proposed for fast sampling by
progressively aligning the output images of an $N$-step teacher sampler with
an $N/2$-step student sampler. In this paper, we argue that this
distillation-based accelerating method can be further improved, especially for
few-step samplers, with our proposed \textbf{C}lassifier-based \textbf{F}eature
\textbf{D}istillation (CFD). Instead of aligning output images, we distill
teacher's sharpened feature distribution into the student with a
dataset-independent classifier, making the student focus on those important
features to improve performance. We also introduce a dataset-oriented loss to
further optimize the model. Experiments on CIFAR-10 show the superiority of our
method in achieving high quality and fast sampling. Code will be released soon.
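The core idea, distilling a sharpened teacher feature distribution through a shared classifier, can be sketched as follows. This is a minimal illustration, not the paper's released code: the classifier weights, the temperature `tau`, and the use of a plain KL penalty are all assumptions for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cfd_loss(teacher_feats, student_feats, classifier_w, tau=0.5):
    """Sketch of a classifier-based feature distillation loss:
    map teacher and student features to class logits with one shared
    (dataset-independent) classifier, sharpen the teacher distribution
    with a low temperature tau, and penalize the KL divergence so the
    student focuses on the features the classifier deems important."""
    t_logits = teacher_feats @ classifier_w
    s_logits = student_feats @ classifier_w
    p_teacher = softmax(t_logits / tau)  # sharpened teacher distribution
    p_student = softmax(s_logits)
    eps = 1e-12  # numerical floor for the logs
    kl = np.sum(p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)), axis=-1)
    return float(np.mean(kl))  # averaged over the batch
```

With `tau=1` and identical features the loss vanishes; sharpening (`tau<1`) makes the teacher target peakier, so even a matching student is pushed toward the dominant classes.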
Related papers
- Gradient-Free Classifier Guidance for Diffusion Model Sampling [4.450496470631169]
The Gradient-Free Classifier Guidance (GFCG) method consistently improves class prediction accuracy.
For ImageNet 512$\times$512, we achieve a record $\mathrm{FD}_{\text{DINOv2}}$ of 23.09, while simultaneously attaining a higher classification Precision (94.3%) compared to ATG (90.2%).
arXiv Detail & Related papers (2024-11-23T00:22:21Z)
- Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization [97.35427957922714]
We present an algorithm named pairwise sample optimization (PSO), which enables the direct fine-tuning of an arbitrary timestep-distilled diffusion model.
PSO introduces additional reference images sampled from the current time-step distilled model, and increases the relative likelihood margin between the training images and reference images.
We show that PSO can directly adapt distilled models to human-preferred generation with both offline and online-generated pairwise preference image data.
arXiv Detail & Related papers (2024-10-04T07:05:16Z)
- Simple and Fast Distillation of Diffusion Models [39.79747569096888]
We propose Simple and Fast Distillation (SFD) of diffusion models, which simplifies the paradigm used in existing methods.
SFD achieves 4.53 FID (NFE=2) on CIFAR-10 with only 0.64 hours of fine-tuning on a single NVIDIA A100 GPU.
arXiv Detail & Related papers (2024-09-29T12:13:06Z)
- One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z)
- Diffusion Rejection Sampling [13.945372555871414]
Diffusion Rejection Sampling (DiffRS) is a rejection sampling scheme that aligns the sampling transition kernels with the true ones at each timestep.
The proposed method can be viewed as a mechanism that evaluates the quality of samples at each intermediate timestep and refines them with varying effort depending on the sample.
Empirical results demonstrate the state-of-the-art performance of DiffRS on the benchmark datasets and the effectiveness of DiffRS for fast diffusion samplers and large-scale text-to-image diffusion models.
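A per-timestep rejection step of this kind might look like the sketch below. The acceptance function is a stand-in for the learned quality estimate DiffRS uses (not reproduced here); the sketch only shows the mechanism of keeping or re-drawing intermediate samples.

```python
import numpy as np

def rejection_step(samples, accept_prob_fn, rng):
    """Illustrative rejection step over intermediate diffusion samples
    (not the DiffRS implementation): each sample survives to the next
    timestep with a probability reflecting how well its transition
    matches the true kernel; the rest would be refined or re-drawn."""
    accepted, rejected = [], []
    for x in samples:
        if rng.random() < accept_prob_fn(x):
            accepted.append(x)
        else:
            rejected.append(x)  # candidates for extra refinement effort
    return accepted, rejected
```

The "varying effort" in the summary corresponds to spending additional refinement only on the rejected list, so easy samples pass through cheaply.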
arXiv Detail & Related papers (2024-05-28T07:00:28Z)
- AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation [43.62480338471837]
Blind super-resolution methods based on stable diffusion showcase formidable generative capabilities in reconstructing clear high-resolution images with intricate details from low-resolution inputs.
Their practical applicability is often hampered by poor efficiency, stemming from the requirement of thousands or hundreds of sampling steps.
Inspired by the efficient adversarial diffusion distillation (ADD), we design AddSR to address this issue by incorporating the ideas of both distillation and ControlNet.
arXiv Detail & Related papers (2024-04-02T08:07:38Z)
- DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models [58.450152413700586]
We introduce a soft absorbing state that facilitates the diffusion model in learning to reconstruct discrete mutations based on the underlying Gaussian space.
We employ state-of-the-art ODE solvers within the continuous space to expedite the sampling process.
Our proposed method effectively accelerates the training convergence by 4x and generates samples of similar quality 800x faster.
arXiv Detail & Related papers (2023-10-09T15:29:10Z)
- Boosting Diffusion Models with an Adaptive Momentum Sampler [21.88226514633627]
We present a novel reverse sampler for DPMs inspired by the widely used Adam optimizer.
Our proposed sampler can be readily applied to a pre-trained diffusion model.
By implicitly reusing update directions from early steps, our proposed sampler achieves a better balance between high-level semantics and low-level details.
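Such an Adam-style smoothing of the reverse update direction can be sketched generically. Here `grad` stands in for the per-step update direction (e.g. derived from the predicted noise); the hyperparameters and the exact coupling to the diffusion schedule are assumptions, not the paper's sampler.

```python
import numpy as np

def momentum_update(x, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Generic Adam-style step reused as a reverse-sampling smoother:
    first and second moments accumulate update directions from earlier
    timesteps, so each step implicitly reuses past directions."""
    m = b1 * m + (1 - b1) * grad          # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v
```

Because the moments carry over between steps, early high-level update directions keep influencing later low-level refinement, which is the balance the summary describes.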
arXiv Detail & Related papers (2023-08-23T06:22:02Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech [63.780196620966905]
We propose ProDiff, a progressive fast diffusion model for high-quality text-to-speech.
ProDiff parameterizes the denoising model by directly predicting clean data to avoid distinct quality degradation in accelerating sampling.
Our evaluation demonstrates that ProDiff needs only 2 iterations to synthesize high-fidelity mel-spectrograms.
ProDiff enables a sampling speed of 24x faster than real-time on a single NVIDIA 2080Ti GPU.
arXiv Detail & Related papers (2022-07-13T17:45:43Z)
- Denoising Diffusion Implicit Models [117.03720513930335]
We present denoising diffusion implicit models (DDIMs) for iterative implicit probabilistic models with the same training procedure as DDPMs.
DDIMs can produce high quality samples $10\times$ to $50\times$ faster in terms of wall-clock time compared to DDPMs.
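The speedup comes from DDIM's deterministic update (the $\eta = 0$ case from the DDIM paper): predict $x_0$ from the noise estimate, then re-noise directly to a much earlier timestep, skipping intermediate steps. A minimal sketch, with `eps_pred` standing in for the network's noise prediction:

```python
import numpy as np

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """Deterministic DDIM update (eta = 0): recover the x0 estimate
    implied by the current sample and predicted noise, then jump to the
    target timestep's noise level in one step."""
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred
```

Because the target timestep is a free parameter, a 1000-step training schedule can be sampled in, say, 20 such jumps, which is where the $10\times$ to $50\times$ wall-clock gain comes from.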
arXiv Detail & Related papers (2020-10-06T06:15:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.