Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
- URL: http://arxiv.org/abs/2404.07724v2
- Date: Wed, 06 Nov 2024 14:29:36 GMT
- Title: Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
- Authors: Tuomas Kynkäänniemi, Miika Aittala, Tero Karras, Samuli Laine, Timo Aila, Jaakko Lehtinen
- Abstract summary: We show that guidance is clearly harmful toward the beginning of the chain.
We thus restrict it to a specific range of noise levels, improving both the inference speed and result quality.
- Score: 35.61297232307485
- Abstract: Guidance is a crucial technique for extracting the best performance out of image-generating diffusion models. Traditionally, a constant guidance weight has been applied throughout the sampling chain of an image. We show that guidance is clearly harmful toward the beginning of the chain (high noise levels), largely unnecessary toward the end (low noise levels), and only beneficial in the middle. We thus restrict it to a specific range of noise levels, improving both the inference speed and result quality. This limited guidance interval improves the record FID in ImageNet-512 significantly, from 1.81 to 1.40. We show that it is quantitatively and qualitatively beneficial across different sampler parameters, network architectures, and datasets, including the large-scale setting of Stable Diffusion XL. We thus suggest exposing the guidance interval as a hyperparameter in all diffusion models that use guidance.
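The core idea of the paper, restricting classifier-free guidance to a middle range of noise levels, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `denoise(x, sigma, cond)` is a hypothetical denoiser interface (with `cond=None` meaning unconditional), and the interval bounds and guidance weight are placeholder values, not the tuned settings reported in the paper.

```python
def guided_denoise(denoise, x, sigma, cond, w=2.0, sigma_lo=0.3, sigma_hi=2.0):
    """Classifier-free guidance applied only inside a noise-level interval.

    Outside [sigma_lo, sigma_hi], the guided extrapolation is skipped and the
    plain conditional denoiser output is returned, which also saves one
    unconditional network evaluation per step.
    """
    d_cond = denoise(x, sigma, cond)
    if sigma_lo <= sigma <= sigma_hi:
        d_uncond = denoise(x, sigma, None)
        # Standard CFG extrapolation away from the unconditional result.
        return d_uncond + w * (d_cond - d_uncond)
    return d_cond  # high or low noise levels: no guidance applied
```

A sampler loop would call this in place of its usual denoiser, making the interval `[sigma_lo, sigma_hi]` an exposed hyperparameter as the abstract suggests.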
Related papers
- Robust Representation Consistency Model via Contrastive Denoising [83.47584074390842]
Randomized smoothing provides theoretical guarantees for certifying robustness against adversarial perturbations.
Diffusion models have been successfully employed for randomized smoothing to purify noise-perturbed samples.
We reformulate the generative modeling task along the diffusion trajectories in pixel space as a discriminative task in the latent space.
arXiv Detail & Related papers (2025-01-22T18:52:06Z) - A Noise is Worth Diffusion Guidance [36.912490607355295]
Current diffusion models struggle to produce reliable images without guidance.
We propose a novel method that replaces guidance methods with a single refinement of the initial noise.
Our noise-refining model leverages efficient noise-space learning, achieving rapid convergence and strong performance with just 50K text-image pairs.
arXiv Detail & Related papers (2024-12-05T06:09:56Z) - Latent Denoising Diffusion GAN: Faster sampling, Higher image quality [0.0]
Latent Denoising Diffusion GAN employs pre-trained autoencoders to compress images into a compact latent space.
Compared to its predecessors, DiffusionGAN and Wavelet Diffusion, our model shows remarkable improvements in all evaluation metrics.
arXiv Detail & Related papers (2024-06-17T16:32:23Z) - Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models [42.28905346604424]
Deep Reward Tuning (DRTune) is an algorithm that supervises the final output image of a text-to-image diffusion model.
DRTune consistently outperforms other algorithms, particularly for low-level control signals.
arXiv Detail & Related papers (2024-05-01T15:26:14Z) - Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z) - CADS: Unleashing the Diversity of Diffusion Models through Condition-Annealed Sampling [27.795088366122297]
Condition-Annealed Diffusion Sampler (CADS) can be used with any pretrained model and sampling algorithm.
We show that it boosts the diversity of diffusion models in various conditional generation tasks.
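Condition annealing of this kind can be sketched as corrupting the conditioning signal with Gaussian noise early in sampling and annealing it back to the clean condition as sampling proceeds. The schedule shape and all parameter values below (`tau1`, `tau2`, `s`) are illustrative assumptions, not the paper's settings.

```python
import math
import random

def annealed_condition(y, t, tau1=0.6, tau2=0.9, s=0.15, rng=random):
    """Condition-annealed sampling sketch.

    y    : condition embedding as a list of floats (hypothetical interface).
    t    : diffusion time in [0, 1], with 1 = pure noise (start of sampling).
    The annealing coefficient gamma is 0 early in sampling (condition fully
    corrupted, encouraging diversity) and 1 late in sampling (clean condition,
    preserving fidelity), interpolating linearly in between.
    """
    if t <= tau1:
        gamma = 1.0                          # late steps: clean condition
    elif t >= tau2:
        gamma = 0.0                          # early steps: fully corrupted
    else:
        gamma = (tau2 - t) / (tau2 - tau1)   # linear anneal in between
    # Mix the embedding with Gaussian noise scaled by the anneal coefficient.
    return [math.sqrt(gamma) * v + s * math.sqrt(1.0 - gamma) * rng.gauss(0, 1)
            for v in y]
```

Because the corruption acts only on the conditioning input, such a scheme plugs into any pretrained model and sampler without retraining, consistent with the summary above.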
arXiv Detail & Related papers (2023-10-26T12:27:56Z) - Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z) - On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z) - Cascaded Diffusion Models for High Fidelity Image Generation [53.57766722279425]
We show that cascaded diffusion models are capable of generating high fidelity images on the class-conditional ImageNet generation challenge.
A cascaded diffusion model comprises a pipeline of multiple diffusion models that generate images of increasing resolution.
We find that the sample quality of a cascading pipeline relies crucially on conditioning augmentation.
arXiv Detail & Related papers (2021-05-30T17:14:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.