Related papers: Common Diffusion Noise Schedules and Sample Steps are Flawed

Common Diffusion Noise Schedules and Sample Steps are Flawed

URL: http://arxiv.org/abs/2305.08891v4
Date: Tue, 23 Jan 2024 03:08:28 GMT
Title: Common Diffusion Noise Schedules and Sample Steps are Flawed
Authors: Shanchuan Lin, Bingchen Liu, Jiashi Li, Xiao Yang
Abstract summary: Common diffusion noise schedules do not enforce the last timestep to have zero signal-to-noise ratio. Some implementations of diffusion samplers do not start from the last timestep. We show that the flawed design causes real problems in existing implementations.
Score: 7.802281665410233
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: We discover that common diffusion noise schedules do not enforce the last timestep to have zero signal-to-noise ratio (SNR), and some implementations of diffusion samplers do not start from the last timestep. Such designs are flawed and do not reflect the fact that the model is given pure Gaussian noise at inference, creating a discrepancy between training and inference. We show that the flawed design causes real problems in existing implementations. In Stable Diffusion, it severely limits the model to only generate images with medium brightness and prevents it from generating very bright and dark samples. We propose a few simple fixes: (1) rescale the noise schedule to enforce zero terminal SNR; (2) train the model with v prediction; (3) change the sampler to always start from the last timestep; (4) rescale classifier-free guidance to prevent over-exposure. These simple changes ensure the diffusion process is congruent between training and inference and allow the model to generate samples more faithful to the original data distribution.

Related papers

ADT: Tuning Diffusion Models with Adversarial Supervision [16.974169058917443]
Diffusion models have achieved outstanding image generation by reversing a forward noising process to approximate true data distributions. We propose Adrial Diffusion Tuning (ADT) to stimulate the inference process during optimization and align the final outputs with training data. ADT features a siamese-network discriminator with a fixed pre-trained backbone and lightweight trainable parameters.
arXiv Detail & Related papers (2025-04-15T17:37:50Z)
Distributional Diffusion Models with Scoring Rules [83.38210785728994]
Diffusion models generate high-quality synthetic data. generating high-quality outputs requires many discretization steps. We propose to accomplish sample generation by learning the posterior em distribution of clean data samples.
arXiv Detail & Related papers (2025-02-04T16:59:03Z)
Optimizing Noise Schedules of Generative Models in High Dimensionss [18.19470017419402]
We show that noise schedules specific to preserving variance (VP) and variance exploding (VE) allow for the recovery of both high- and low-level features. We also show that these schedules yield generative models for the GM and Curie-Weiss (CW) model whose probability flow ODE can be discretized.
arXiv Detail & Related papers (2025-01-02T00:39:00Z)
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow [65.51671121528858]
Diffusion models have greatly improved visual generation but are hindered by slow generation speed due to the computationally intensive nature of solving generative ODEs. Rectified flow, a widely recognized solution, improves generation speed by straightening the ODE path. We propose Rectified Diffusion, which generalizes the design space and application scope of rectification to encompass the broader category of diffusion models.
arXiv Detail & Related papers (2024-10-09T17:43:38Z)
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data [74.2507346810066]
Ambient diffusion is a recently proposed framework for training diffusion models using corrupted data. We present the first framework for training diffusion models that provably sample from the uncorrupted distribution given only noisy training data.
arXiv Detail & Related papers (2024-03-20T14:22:12Z)
Particle Denoising Diffusion Sampler [32.310922004771776]
Particle Denoising Diffusion Sampler (PDDS) provides consistent estimates under mild assumptions. We demonstrate PDDS on multimodal and high dimensional sampling tasks.
arXiv Detail & Related papers (2024-02-09T11:01:35Z)
One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls [77.42510898755037]
One More Step (OMS) is a compact network that incorporates an additional simple yet effective step during inference. OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters. Once trained, various pre-trained diffusion models with the same latent domain can share the same OMS module.
arXiv Detail & Related papers (2023-11-27T12:02:42Z)
Towards More Accurate Diffusion Model Acceleration with A Timestep Aligner [84.97253871387028]
A diffusion model, which is formulated to produce an image using thousands of denoising steps, usually suffers from a slow inference speed. We propose a timestep aligner that helps find a more accurate integral direction for a particular interval at the minimum cost. Experiments show that our plug-in design can be trained efficiently and boost the inference performance of various state-of-the-art acceleration methods.
arXiv Detail & Related papers (2023-10-14T02:19:07Z)
UDPM: Upsampling Diffusion Probabilistic Models [33.51145642279836]
Denoising Diffusion Probabilistic Models (DDPM) have recently gained significant attention. DDPMs generate high-quality samples from complex data distributions by defining an inverse process. Unlike generative adversarial networks (GANs), the latent space of diffusion models is less interpretable. In this work, we propose to generalize the denoising diffusion process into an Upsampling Diffusion Probabilistic Model (UDPM)
arXiv Detail & Related papers (2023-05-25T17:25:14Z)
Denoising Diffusion Samplers [41.796349001299156]
Denoising diffusion models are a popular class of generative models providing state-of-the-art results in many domains. We explore a similar idea to sample approximately from unnormalized probability density functions and estimate their normalizing constants. While score matching is not applicable in this context, we can leverage many of the ideas introduced in generative modeling for Monte Carlo sampling.
arXiv Detail & Related papers (2023-02-27T14:37:16Z)
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech [63.780196620966905]
We propose ProDiff, on progressive fast diffusion model for high-quality text-to-speech. ProDiff parameterizes the denoising model by directly predicting clean data to avoid distinct quality degradation in accelerating sampling. Our evaluation demonstrates that ProDiff needs only 2 iterations to synthesize high-fidelity mel-spectrograms. ProDiff enables a sampling speed of 24x faster than real-time on a single NVIDIA 2080Ti GPU.
arXiv Detail & Related papers (2022-07-13T17:45:43Z)
Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders [137.1060633388405]
Diffusion-based generative models learn how to generate the data by inferring a reverse diffusion chain. We propose a faster and cheaper approach that adds noise not until the data become pure random noise. We show that the proposed model can be cast as an adversarial auto-encoder empowered by both the diffusion process and a learnable implicit prior.
arXiv Detail & Related papers (2022-02-19T20:18:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.