$O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions
- URL: http://arxiv.org/abs/2409.18959v1
- Date: Fri, 27 Sep 2024 17:59:10 GMT
- Title: $O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions
- Authors: Gen Li, Yuling Yan
- Abstract summary: We establish a fast convergence theory for a popular SDE-based sampler under minimal assumptions.
Our analysis shows that, provided $\ell_2$-accurate estimates of the score functions, the total variation distance between the target and generated distributions is upper bounded by $O(d/T)$.
This is achieved through a novel set of analytical tools that provides a fine-grained characterization of how the error propagates at each step of the reverse process.
- Score: 6.76974373198208
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Score-based diffusion models, which generate new data by learning to reverse a diffusion process that perturbs data from the target distribution into noise, have achieved remarkable success across various generative tasks. Despite their superior empirical performance, existing theoretical guarantees are often constrained by stringent assumptions or suboptimal convergence rates. In this paper, we establish a fast convergence theory for a popular SDE-based sampler under minimal assumptions. Our analysis shows that, provided $\ell_{2}$-accurate estimates of the score functions, the total variation distance between the target and generated distributions is upper bounded by $O(d/T)$ (ignoring logarithmic factors), where $d$ is the data dimensionality and $T$ is the number of steps. This result holds for any target distribution with finite first-order moment. To our knowledge, this improves upon existing convergence theory for both the SDE-based sampler and another ODE-based sampler, while imposing minimal assumptions on the target data distribution and score estimates. This is achieved through a novel set of analytical tools that provides a fine-grained characterization of how the error propagates at each step of the reverse process.
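The sampler covered by this bound is a DDPM-type discretization of the reverse-time SDE. As a rough illustration only (not the paper's exact discretization or step-size schedule), the sketch below assumes a hypothetical `score_fn(x, t)` that returns an estimate of the score $\nabla \log p_t(x)$, a constant noise schedule, and uniform steps; the theorem states that with $\ell_2$-accurate score estimates, the total-variation error of such a sampler scales as $O(d/T)$ up to logarithmic factors.

```python
import numpy as np

def reverse_sde_sampler(score_fn, d, T, rng=None):
    """Euler-Maruyama discretization of the reverse-time SDE for a
    variance-preserving (Ornstein-Uhlenbeck) forward process.

    score_fn(x, t) is a hypothetical estimate of grad log p_t(x); the paper's
    O(d/T) TV bound is stated for l2-accurate score estimates, so increasing
    the number of reverse steps T shrinks the discretization error.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(d)       # initialize from the N(0, I) prior
    dt = 1.0 / T                     # uniform step size (illustrative)
    beta = 1.0                       # constant noise schedule (illustrative)
    for k in range(T):
        t = 1.0 - k * dt             # integrate backwards from t = 1 to t = 0
        s = score_fn(x, t)
        # Reverse-SDE drift 0.5*beta*x + beta*score, plus fresh Gaussian noise.
        x = (x + (0.5 * beta * x + beta * s) * dt
             + np.sqrt(beta * dt) * rng.standard_normal(d))
    return x

# Sanity check: for a standard Gaussian target under the VP forward process,
# every marginal p_t stays N(0, I) and the exact score is -x, so the output
# samples should again look approximately standard Gaussian.
samples = np.stack([reverse_sde_sampler(lambda x, t: -x, d=2, T=500)
                    for _ in range(100)])
print(samples.mean(axis=0), samples.std(axis=0))
```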
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependence for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-Uhlenbeck is hard to beat [49.1574468325115]
This paper presents explicit non-asymptotic bounds on the forward diffusion error in total variation (TV).
We parametrise multi-modal data distributions in terms of the distance $R$ to their furthest modes and consider forward diffusions with additive and multiplicative noise.
arXiv Detail & Related papers (2024-08-25T10:28:31Z) - A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models [45.60426164657739]
We develop non-asymptotic convergence theory for a diffusion-based sampler.
We prove that $d/\varepsilon$ iterations are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance.
Our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation process.
arXiv Detail & Related papers (2024-08-05T09:02:24Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, in which we seek an idealized data distribution that maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach [49.97755400231656]
We show that a novel accelerated DDPM sampler achieves faster convergence for three broad distribution classes not considered before.
Our results show an improved dependency on the data dimension $d$ among accelerated DDPM-type samplers.
arXiv Detail & Related papers (2024-02-21T16:11:47Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Improved Convergence of Score-Based Diffusion Models via Prediction-Correction [15.772322871598085]
Score-based generative models (SGMs) are powerful tools to sample from complex data distributions.
This paper addresses the initialization error arising because the forward process only reaches the noise distribution asymptotically, by considering a version of the popular predictor-corrector scheme.
We first estimate the final distribution via inexact Langevin dynamics and then revert the process.
arXiv Detail & Related papers (2023-05-23T15:29:09Z) - Convergence for score-based generative modeling with polynomial complexity [9.953088581242845]
We prove the first convergence guarantees for the core mechanic behind score-based generative modeling.
Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality.
We show that a predictor-corrector scheme gives better convergence than using either portion alone (see the sketch after this list).
arXiv Detail & Related papers (2022-06-13T14:57:35Z)
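The last two entries study predictor-corrector schemes, in which each reverse-diffusion (predictor) step is followed by Langevin (corrector) steps targeting the same noise level. Below is a minimal sketch under the same assumptions as the sampler above (hypothetical `score_fn`, constant schedule, fixed corrector step size); it illustrates the general pattern rather than any single paper's exact scheme.

```python
import numpy as np

def predictor_corrector_sampler(score_fn, d, T, n_corrector=1, eps=1e-2, rng=None):
    """Predictor-corrector sampling: one reverse Euler-Maruyama (predictor) step,
    then a few Langevin (corrector) steps targeting the same marginal p_t.
    score_fn(x, t) is a hypothetical score estimate; step sizes are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(d)
    dt, beta = 1.0 / T, 1.0
    for k in range(T):
        t = 1.0 - k * dt
        # Predictor: one reverse-SDE step (same update as the sketch above).
        x = (x + (0.5 * beta * x + beta * score_fn(x, t)) * dt
             + np.sqrt(beta * dt) * rng.standard_normal(d))
        # Corrector: unadjusted Langevin steps at the next noise level t - dt.
        t_next = max(t - dt, 0.0)
        for _ in range(n_corrector):
            x = (x + eps * score_fn(x, t_next)
                 + np.sqrt(2.0 * eps) * rng.standard_normal(d))
    return x
```

The corrector leaves the current marginal approximately invariant (up to discretization and score error), which is the intuition for why combining it with the predictor can improve convergence, as the entries above argue.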
This list is automatically generated from the titles and abstracts of the papers on this site.