AdaDiff: Adaptive Step Selection for Fast Diffusion
- URL: http://arxiv.org/abs/2311.14768v1
- Date: Fri, 24 Nov 2023 11:20:38 GMT
- Title: AdaDiff: Adaptive Step Selection for Fast Diffusion
- Authors: Hui Zhang and Zuxuan Wu and Zhen Xing and Jie Shao and Yu-Gang Jiang
- Abstract summary: We introduce AdaDiff, a framework designed to learn instance-specific step usage policies.
AdaDiff is optimized using a policy gradient method to maximize a carefully designed reward function.
Our approach achieves similar results in terms of visual quality compared to the baseline using a fixed 50 denoising steps.
- Score: 88.8198344514677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models, a type of generative model, have achieved impressive
results in generating images and videos conditioned on text.
However, the generation process of diffusion models involves denoising for
dozens of steps to produce photorealistic images/videos, which is
computationally expensive. Unlike previous methods that design
"one-size-fits-all" approaches for speed-up, we argue that the number of denoising
steps should be sample-specific, conditioned on the richness of the input text. To this end, we
introduce AdaDiff, a lightweight framework designed to learn instance-specific
step usage policies, which are then used by the diffusion model for generation.
AdaDiff is optimized using a policy gradient method to maximize a carefully
designed reward function, balancing inference time and generation quality. We
conduct experiments on three image generation and two video generation
benchmarks and demonstrate that our approach achieves similar results in terms
of visual quality compared to the baseline using a fixed 50 denoising steps
while reducing inference time by at least 33%, going as high as 40%.
Furthermore, our qualitative analysis shows that our method allocates more
steps to more informative text conditions and fewer steps to simpler text
conditions.
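The policy-gradient idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the candidate step budgets, the `toy_quality` function, the `TIME_WEIGHT` trade-off, the scalar "richness" value per prompt, and the tabular softmax policy are all assumptions made for illustration. The sketch uses plain REINFORCE with a running-mean baseline to learn, per prompt, a distribution over step counts that trades a stand-in quality score against a step cost.

```python
import math
import random

# All constants below are hypothetical and chosen only for illustration.
STEP_CHOICES = [10, 20, 30, 40, 50]  # candidate denoising-step budgets
MAX_STEPS = 50
TIME_WEIGHT = 0.5                    # assumed weight on the step cost


def toy_quality(richness, steps):
    """Stand-in quality score: richer prompts need more steps to saturate."""
    needed = 20 + 30 * richness      # richness is assumed to lie in [0, 1]
    return min(1.0, steps / needed)


def reward(richness, steps):
    """Quality minus a cost proportional to the number of steps used."""
    return toy_quality(richness, steps) - TIME_WEIGHT * steps / MAX_STEPS


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(richness_by_prompt, iters=20000, lr=0.1, seed=0):
    """REINFORCE with a running-mean baseline on a tabular softmax policy."""
    rng = random.Random(seed)
    names = list(richness_by_prompt)
    logits = {n: [0.0] * len(STEP_CHOICES) for n in names}
    baseline = {n: 0.0 for n in names}
    for _ in range(iters):
        n = rng.choice(names)
        probs = softmax(logits[n])
        # Sample a step budget from the current policy.
        a = rng.choices(range(len(STEP_CHOICES)), weights=probs)[0]
        r = reward(richness_by_prompt[n], STEP_CHOICES[a])
        advantage = r - baseline[n]
        baseline[n] += 0.05 * (r - baseline[n])
        # Gradient of log softmax w.r.t. logit k is 1[k == a] - probs[k].
        for k in range(len(STEP_CHOICES)):
            grad_log_prob = (1.0 if k == a else 0.0) - probs[k]
            logits[n][k] += lr * advantage * grad_log_prob
    # Report the expected step count under the learned policy.
    return {
        n: sum(p * s for p, s in zip(softmax(logits[n]), STEP_CHOICES))
        for n in names
    }


expected_steps = train({"simple": 0.0, "rich": 1.0})
print(expected_steps)
```

Under these toy assumptions the learned policy should assign more steps to the "rich" prompt than to the "simple" one, mirroring the qualitative behavior the abstract reports. A real system would replace the tabular policy with a network over text features and `toy_quality` with a measured quality score.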
Related papers
- Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
Diffusion models have dominated the field of large, generative image models.
We propose an algorithm for fast-constrained sampling in large pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
- Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy [44.09909260046396]
We propose AdaptiveDiffusion to reduce noise prediction steps during the denoising process.
Our method can significantly speed up the denoising process while generating identical results to the original process, achieving up to an average 25x speedup.
arXiv Detail & Related papers (2024-10-13T15:19:18Z)
- Fast LiDAR Upsampling using Conditional Diffusion Models [1.3709133749179265]
Existing approaches have shown the possibilities for using diffusion models to generate refined LiDAR data with high fidelity.
We introduce a novel approach based on conditional diffusion models for fast and high-quality sparse-to-dense upsampling of 3D scene point clouds.
Our method employs denoising diffusion probabilistic models trained with conditional inpainting masks, which have been shown to give high performance on image completion tasks.
arXiv Detail & Related papers (2024-05-08T08:38:28Z)
- Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z)
- AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration [57.846038404893626]
We propose to search for the optimal time-step sequence and compressed model architecture in a unified framework to achieve effective image generation for diffusion models without any further training.
Experimental results show that our method achieves excellent performance by using only a few time steps, e.g. 17.86 FID score on ImageNet 64 $\times$ 64 with only four steps, compared to 138.66 with DDIM.
arXiv Detail & Related papers (2023-09-19T08:57:24Z)
- HiFA: High-fidelity Text-to-3D Generation with Advanced Diffusion Guidance [19.252300247300145]
This work proposes holistic sampling and smoothing approaches to achieve high-quality text-to-3D generation.
We compute denoising scores in the text-to-image diffusion model's latent and image spaces.
To generate high-quality renderings in a single-stage optimization, we propose regularization for the variance of z-coordinates along NeRF rays.
arXiv Detail & Related papers (2023-05-30T05:56:58Z)
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained on the latent-space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
- Dynamic Dual-Output Diffusion Models [100.32273175423146]
Iterative denoising-based generation has been shown to be comparable in quality to other classes of generative models.
A major drawback of this method is that it requires hundreds of iterations to produce a competitive result.
Recent works have proposed solutions that allow for faster generation with fewer iterations, but the image quality gradually deteriorates.
arXiv Detail & Related papers (2022-03-08T11:20:40Z)