Common Diffusion Noise Schedules and Sample Steps are Flawed
- URL: http://arxiv.org/abs/2305.08891v4
- Date: Tue, 23 Jan 2024 03:08:28 GMT
- Title: Common Diffusion Noise Schedules and Sample Steps are Flawed
- Authors: Shanchuan Lin, Bingchen Liu, Jiashi Li, Xiao Yang
- Abstract summary: Common diffusion noise schedules do not enforce the last timestep to have zero signal-to-noise ratio.
Some implementations of diffusion samplers do not start from the last timestep.
We show that the flawed design causes real problems in existing implementations.
- Score: 7.802281665410233
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We discover that common diffusion noise schedules do not enforce the last
timestep to have zero signal-to-noise ratio (SNR), and some implementations of
diffusion samplers do not start from the last timestep. Such designs are flawed
and do not reflect the fact that the model is given pure Gaussian noise at
inference, creating a discrepancy between training and inference. We show that
the flawed design causes real problems in existing implementations. In Stable
Diffusion, it severely limits the model to only generate images with medium
brightness and prevents it from generating very bright and dark samples. We
propose a few simple fixes: (1) rescale the noise schedule to enforce zero
terminal SNR; (2) train the model with v prediction; (3) change the sampler to
always start from the last timestep; (4) rescale classifier-free guidance to
prevent over-exposure. These simple changes ensure the diffusion process is
congruent between training and inference and allow the model to generate
samples more faithful to the original data distribution.
Related papers
- Distributional Diffusion Models with Scoring Rules [83.38210785728994]
Diffusion models generate high-quality synthetic data.
generating high-quality outputs requires many discretization steps.
We propose to accomplish sample generation by learning the posterior em distribution of clean data samples.
arXiv Detail & Related papers (2025-02-04T16:59:03Z) - Optimizing Noise Schedules of Generative Models in High Dimensionss [18.19470017419402]
We show that noise schedules specific to preserving variance (VP) and variance exploding (VE) allow for the recovery of both high- and low-level features.
We also show that these schedules yield generative models for the GM and Curie-Weiss (CW) model whose probability flow ODE can be discretized.
arXiv Detail & Related papers (2025-01-02T00:39:00Z) - Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow [65.51671121528858]
Diffusion models have greatly improved visual generation but are hindered by slow generation speed due to the computationally intensive nature of solving generative ODEs.
Rectified flow, a widely recognized solution, improves generation speed by straightening the ODE path.
We propose Rectified Diffusion, which generalizes the design space and application scope of rectification to encompass the broader category of diffusion models.
arXiv Detail & Related papers (2024-10-09T17:43:38Z) - Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data [74.2507346810066]
Ambient diffusion is a recently proposed framework for training diffusion models using corrupted data.
We present the first framework for training diffusion models that provably sample from the uncorrupted distribution given only noisy training data.
arXiv Detail & Related papers (2024-03-20T14:22:12Z) - One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion
Schedule Flaws and Enhancing Low-Frequency Controls [77.42510898755037]
One More Step (OMS) is a compact network that incorporates an additional simple yet effective step during inference.
OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.
Once trained, various pre-trained diffusion models with the same latent domain can share the same OMS module.
arXiv Detail & Related papers (2023-11-27T12:02:42Z) - Denoising Diffusion Samplers [41.796349001299156]
Denoising diffusion models are a popular class of generative models providing state-of-the-art results in many domains.
We explore a similar idea to sample approximately from unnormalized probability density functions and estimate their normalizing constants.
While score matching is not applicable in this context, we can leverage many of the ideas introduced in generative modeling for Monte Carlo sampling.
arXiv Detail & Related papers (2023-02-27T14:37:16Z) - ProDiff: Progressive Fast Diffusion Model For High-Quality
Text-to-Speech [63.780196620966905]
We propose ProDiff, on progressive fast diffusion model for high-quality text-to-speech.
ProDiff parameterizes the denoising model by directly predicting clean data to avoid distinct quality degradation in accelerating sampling.
Our evaluation demonstrates that ProDiff needs only 2 iterations to synthesize high-fidelity mel-spectrograms.
ProDiff enables a sampling speed of 24x faster than real-time on a single NVIDIA 2080Ti GPU.
arXiv Detail & Related papers (2022-07-13T17:45:43Z) - Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial
Auto-Encoders [137.1060633388405]
Diffusion-based generative models learn how to generate the data by inferring a reverse diffusion chain.
We propose a faster and cheaper approach that adds noise not until the data become pure random noise.
We show that the proposed model can be cast as an adversarial auto-encoder empowered by both the diffusion process and a learnable implicit prior.
arXiv Detail & Related papers (2022-02-19T20:18:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.