Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models
- URL: http://arxiv.org/abs/2306.09251v3
- Date: Thu, 7 Mar 2024 03:30:57 GMT
- Title: Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models
- Authors: Gen Li, Yuting Wei, Yuxin Chen, Yuejie Chi
- Abstract summary: We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
- Score: 49.81937966106691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models, which convert noise into new data instances by learning to
reverse a Markov diffusion process, have become a cornerstone in contemporary
generative modeling. While their practical power has now been widely
recognized, the theoretical underpinnings remain far from mature. In this work,
we develop a suite of non-asymptotic theory towards understanding the data
generation process of diffusion models in discrete time, assuming access to
$\ell_2$-accurate estimates of the (Stein) score functions. For a popular
deterministic sampler (based on the probability flow ODE), we establish a
convergence rate proportional to $1/T$ (with $T$ the total number of steps),
improving upon past results; for another mainstream stochastic sampler (i.e., a
type of the denoising diffusion probabilistic model), we derive a convergence
rate proportional to $1/\sqrt{T}$, matching the state-of-the-art theory.
Imposing only minimal assumptions on the target data distribution (e.g., no
smoothness assumption is imposed), our results characterize how $\ell_2$ score
estimation errors affect the quality of the data generation processes. In
contrast to prior works, our theory is developed based on an elementary yet
versatile non-asymptotic approach without resorting to toolboxes for SDEs and
ODEs. Further, we design two accelerated variants, improving the convergence to
$1/T^2$ for the ODE-based sampler and $1/T$ for the DDPM-type sampler, which
might be of independent theoretical and empirical interest.
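The two samplers the abstract analyzes can be sketched numerically. Below is a toy illustration (not the paper's construction): a 1-D standard-Gaussian target, for which the exact (Stein) score of every forward marginal is simply score(x) = -x, pushed through a DDPM-type stochastic update and a deterministic probability flow ODE (DDIM-style) update. The noise schedule, step forms, and number of steps are illustrative assumptions, not the paper's choices.

```python
import numpy as np

# Toy sketch: compare the two sampler families the abstract analyzes on a
# 1-D standard-Gaussian target. With forward process
#   x_t = sqrt(a_bar_t) x_0 + sqrt(1 - a_bar_t) z  and  x_0 ~ N(0, 1),
# every marginal is again N(0, 1), so the exact score is score(x, t) = -x.

rng = np.random.default_rng(0)
T = 1000                                               # number of reverse steps
alphas = 1.0 - 0.02 * np.arange(1, T + 1)[::-1] / T    # illustrative schedule
score = lambda x: -x                                   # exact score for this toy target

def ddpm_sampler(n):
    """Stochastic (DDPM-type) sampler: O(1/sqrt(T)) rate in the abstract."""
    x = rng.standard_normal(n)                         # start from pure noise
    for a in alphas:
        x = (x + (1.0 - a) * score(x)) / np.sqrt(a)    # score-based drift
        x += np.sqrt(1.0 - a) * rng.standard_normal(n) # freshly injected noise
    return x

def prob_flow_ode_sampler(n):
    """Deterministic (probability flow ODE) sampler: O(1/T) rate."""
    x = rng.standard_normal(n)
    for a in alphas:
        # same drift with half the score weight, and no injected noise
        x = (x + 0.5 * (1.0 - a) * score(x)) / np.sqrt(a)
    return x

for name, sampler in [("DDPM", ddpm_sampler), ("ODE ", prob_flow_ode_sampler)]:
    s = sampler(50_000)
    print(f"{name}: mean={s.mean():+.3f}  var={s.var():.3f}")
```

Both runs should recover a sample with mean near 0 and variance near 1; the deterministic sampler differs only in halving the score term and dropping the injected noise, which is exactly the structural contrast the two convergence rates attach to.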
Related papers
- Improved Convergence Rate for Diffusion Probabilistic Models [7.237817437521988]
Score-based diffusion models have achieved remarkable empirical performance in the field of machine learning and artificial intelligence.
Despite many theoretical attempts, a significant gap remains between theory and practice.
We establish an iteration complexity of order $d^{2/3}\varepsilon^{-2/3}$, which is better than $d^{5/12}\varepsilon^{-1}$.
Our theory accommodates $\varepsilon$-accurate score estimates, and does not require log-concavity of the target distribution.
arXiv Detail & Related papers (2024-10-17T16:37:33Z) - $O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions [6.76974373198208]
We establish a fast convergence theory for a popular SDE-based sampler under minimal assumptions.
Our analysis shows that, provided $\ell_2$-accurate estimates of the score functions, the total variation distance between the target and generated distributions is upper bounded by $O(d/T)$.
This is achieved through a novel set of analytical tools that provides a fine-grained characterization of how the error propagates at each step of the reverse process.
arXiv Detail & Related papers (2024-09-27T17:59:10Z) - Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-Uhlenbeck is hard to beat [49.1574468325115]
This paper presents explicit non-asymptotic bounds on the forward diffusion error in total variation (TV).
We parametrise multi-modal data distributions in terms of the distance $R$ to their furthest modes and consider forward diffusions with additive and multiplicative noise.
arXiv Detail & Related papers (2024-08-25T10:28:31Z) - A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models [45.60426164657739]
We develop non-asymptotic convergence theory for a diffusion-based sampler.
We prove that $d/\varepsilon$ iterations are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance.
Our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes.
arXiv Detail & Related papers (2024-08-05T09:02:24Z) - Accelerating Convergence of Score-Based Diffusion Models, Provably [44.11766377798812]
Score-based diffusion models often suffer from low sampling speed due to extensive function evaluations needed during the sampling phase.
We design novel training-free algorithms to accelerate popular deterministic (i.e., DDIM) and stochastic (i.e., DDPM) samplers.
Our theory accommodates $\ell_2$-accurate score estimates, and does not require log-concavity or smoothness of the target distribution.
arXiv Detail & Related papers (2024-03-06T17:02:39Z) - Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [67.9215891673174]
We propose score entropy as a novel loss that naturally extends score matching to discrete spaces.
We test our Score Entropy Discrete Diffusion models on standard language modeling tasks.
arXiv Detail & Related papers (2023-10-25T17:59:12Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative
Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
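Several entries above (the Ornstein-Uhlenbeck forward bounds and the study of diffusion times) concern how quickly the forward noising process reaches the reference Gaussian. A minimal sketch, under an assumed variance-preserving schedule and a toy bimodal data distribution (neither taken from the listed papers):

```python
import numpy as np

# Toy sketch: the variance-preserving forward process. Starting from a
# bimodal data distribution, x_t = sqrt(a_bar_t) x_0 + sqrt(1 - a_bar_t) z
# approaches N(0, 1) as t grows -- the reason a large T (or a sharp forward
# bound) matters before running the reverse process.

rng = np.random.default_rng(1)
x0 = np.where(rng.random(100_000) < 0.5, -2.0, 2.0)   # two modes at +/-2
betas = 0.02 * np.arange(1, 1001) / 1000              # increasing noise schedule
a_bar = np.cumprod(1.0 - betas)                       # cumulative product alpha-bar_t

for t in [10, 100, 1000]:
    ab = a_bar[t - 1]
    xt = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * rng.standard_normal(x0.shape)
    print(f"t={t:4d}: mean={xt.mean():+.3f}  var={xt.var():.3f}")
```

At small t the sample is still essentially bimodal with variance near 4; by the final step the marginal is close to N(0, 1), since alpha-bar_T is vanishingly small under this schedule.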