Multi-Step Consistency Models: Fast Generation with Theoretical Guarantees
- URL: http://arxiv.org/abs/2505.01049v2
- Date: Sun, 25 May 2025 04:06:45 GMT
- Title: Multi-Step Consistency Models: Fast Generation with Theoretical Guarantees
- Authors: Nishant Jain, Xunpeng Huang, Yian Ma, Tong Zhang
- Abstract summary: We provide a theoretical analysis of consistency models capable of mapping inputs at a given time to arbitrary points along the reverse trajectory. We show that one can achieve a KL divergence of order $O(\varepsilon^2)$ using only $O\left(\log\left(\frac{d}{\varepsilon}\right)\right)$ iterations with a constant step size. We conclude that accurate learning is feasible using small discretization steps, both in smooth and non-smooth settings.
- Score: 15.366598179769918
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Consistency models have recently emerged as a compelling alternative to traditional SDE-based diffusion models. They offer a significant acceleration in generation by producing high-quality samples in very few steps. Despite their empirical success, a proper theoretical justification for their speed-up is still lacking. In this work, we address this gap by providing a theoretical analysis of consistency models capable of mapping inputs at a given time to arbitrary points along the reverse trajectory. We show that one can achieve a KL divergence of order $O(\varepsilon^2)$ using only $O\left(\log\left(\frac{d}{\varepsilon}\right)\right)$ iterations with a constant step size. Additionally, under minimal assumptions on the data distribution (the non-smooth case), an increasingly common setting in recent diffusion-model analyses, we show that a similar KL convergence guarantee can be obtained, with the number of steps scaling as $O\left(d \log\left(\frac{d}{\varepsilon}\right)\right)$. Going further, we also provide a theoretical analysis of the estimation of such consistency models, concluding that accurate learning is feasible using small discretization steps, both in smooth and non-smooth settings. Notably, our results for the non-smooth case yield best-in-class convergence rates compared to existing SDE- or ODE-based analyses under minimal assumptions.
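To make the sampling scheme analyzed here concrete, below is a minimal sketch of multi-step consistency sampling in PyTorch. The interface `consistency_fn(x, t, s)`, which maps a noisy input at time `t` to the reverse-trajectory point at an earlier time `s`, and the geometric time grid are illustrative assumptions, not the paper's implementation.

```python
import math
import torch

def multistep_consistency_sample(consistency_fn, shape, t_max=80.0, t_min=0.002,
                                 num_steps=4, device="cpu"):
    """Minimal sketch of multi-step consistency sampling (assumed interface).

    `consistency_fn(x, t, s)` is a hypothetical learned map sending a noisy
    sample x at time t to the reverse-trajectory point at an earlier time s.
    """
    # Start from pure noise at the largest time scale.
    x = torch.randn(shape, device=device) * t_max
    # Geometric time grid: a constant step in log-time, loosely mirroring the
    # O(log(d/eps)) iteration count in the smooth setting.
    ts = torch.logspace(math.log10(t_max), math.log10(t_min),
                        num_steps + 1, device=device)
    for t, s in zip(ts[:-1], ts[1:]):
        x = consistency_fn(x, float(t), float(s))  # jump from time t to time s
    return x
```

Each iteration replaces many small diffusion steps with a single learned jump along the reverse trajectory, which is where the few-step speed-up comes from.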
Related papers
- Beyond Scores: Proximal Diffusion Models [10.27283386401996]
We develop Proximal Diffusion Models (ProxDM) to learn proximal operators of the log-density. We show that two variants of ProxDM achieve significantly faster convergence within just a few sampling steps compared to conventional score-matching methods.
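For reference, the proximal operator that ProxDM is described as learning follows the standard definition below (with, presumably, $f = -\log p_t$; a generic statement, not taken from the paper):

```latex
% Standard proximal operator of a function f with step size tau:
\[
  \operatorname{prox}_{\tau f}(x)
  = \operatorname*{arg\,min}_{z}
    \left\{ f(z) + \frac{1}{2\tau} \lVert z - x \rVert^2 \right\}
\]
% Learning this map for f = -log p_t replaces learning the score
% \nabla_x log p_t(x) directly.
```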
arXiv Detail & Related papers (2025-07-11T18:30:09Z) - Advancing Wasserstein Convergence Analysis of Score-Based Models: Insights from Discretization and Second-Order Acceleration [5.548787731232499]
We focus on the Wasserstein convergence analysis of score-based diffusion models. We compare various discretization schemes, including Euler discretization, the exponential midpoint method, and randomization methods. We propose an accelerated sampler based on the local linearization method.
arXiv Detail & Related papers (2025-02-07T11:37:51Z) - Improved Convergence Rate for Diffusion Probabilistic Models [7.237817437521988]
Score-based diffusion models have achieved remarkable empirical performance in the field of machine learning and artificial intelligence.
Despite many theoretical attempts, a significant gap still exists between theory and practice.
We establish an iteration complexity at the order of $d^{2/3}\varepsilon^{-2/3}$, which is better than $d^{5/12}\varepsilon^{-1}$.
Our theory accommodates $\varepsilon$-accurate score estimates, and does not require log-concavity on the target distribution.
arXiv Detail & Related papers (2024-10-17T16:37:33Z) - Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - O(d/T) Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions [6.76974373198208]
We establish a fast convergence theory for the denoising diffusion probabilistic model (DDPM) under minimal assumptions. We show that the convergence rate improves to $O(k/T)$, where $k$ is the intrinsic dimension of the target data distribution. This highlights the ability of DDPM to automatically adapt to unknown low-dimensional structures.
arXiv Detail & Related papers (2024-09-27T17:59:10Z) - A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models [45.60426164657739]
We develop non-asymptotic convergence theory for a diffusion-based sampler.
We prove that $d/\varepsilon$ iterations are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance.
Our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes.
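For context, the probability flow ODE behind such samplers takes the standard form below (generic to score-based models, not specific to this paper); at sampling time the learned score estimate replaces $\nabla_x \log p_t$:

```latex
% Probability flow ODE sharing its marginals with the forward diffusion
% dX_t = f(X_t, t) dt + g(t) dW_t:
\[
  \frac{\mathrm{d}x_t}{\mathrm{d}t}
  = f(x_t, t) - \tfrac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x_t)
\]
```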
arXiv Detail & Related papers (2024-08-05T09:02:24Z) - Accelerating Diffusion Models with Parallel Sampling: Inference at Sub-Linear Time Complexity [11.71206628091551]
Diffusion models are costly to train and evaluate, so reducing their inference cost remains a major goal.
Inspired by the recent empirical success in accelerating diffusion models via the parallel sampling technique of [shih2024parallel], we propose to divide the sampling process into $\mathcal{O}(1)$ blocks with parallelizable Picard iterations within each block.
Our results shed light on the potential of fast and efficient sampling of high-dimensional data on fast-evolving modern large-memory GPU clusters.
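As a rough sketch of block-wise Picard iteration (assumed interfaces and a simple Euler-style discretization of the integral, not the paper's algorithm): each sweep updates every grid point in a block simultaneously from the previous iterate, so the expensive drift evaluations can run in parallel.

```python
import torch

def picard_block_solve(drift_fn, x0, ts, num_picard_iters=8):
    """Sketch: solve dx/dt = drift_fn(x, t) over one block of time points `ts`
    via Picard iteration. All grid points are refined per sweep; `drift_fn` is
    a hypothetical batched drift (e.g. a probability flow ODE velocity).
    """
    n = len(ts) - 1
    dts = ts[1:] - ts[:-1]                               # step sizes in the block
    xs = x0.unsqueeze(0).repeat(n + 1, *[1] * x0.dim())  # initial guess: constant path
    for _ in range(num_picard_iters):
        # Evaluate the drift at every grid point at once (parallelizable).
        vs = drift_fn(xs[:-1], ts[:-1])                  # shape: (n, *x0.shape)
        increments = vs * dts.view(-1, *[1] * x0.dim())
        # Picard update: x(t_i) = x(t_0) + sum of increments up to t_i.
        xs = torch.cat([x0.unsqueeze(0),
                        x0 + torch.cumsum(increments, dim=0)], dim=0)
    return xs[-1]  # state at the end of the block
```

Because each sweep refines the whole block at once, wall-clock time scales with the number of sweeps rather than the number of grid points, given enough parallel hardware.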
arXiv Detail & Related papers (2024-05-24T23:59:41Z) - Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach [49.97755400231656]
We show that a new accelerated DDPM sampler achieves accelerated performance for three broad distribution classes not considered before. Our results show an improved dependency on the data dimension $d$ among accelerated DDPM-type samplers.
arXiv Detail & Related papers (2024-02-21T16:11:47Z) - Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called Gaussian Mixture Solvers (GMS) for diffusion models.
Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Improved Analysis of Score-based Generative Modeling: User-Friendly
Bounds under Minimal Smoothness Assumptions [9.953088581242845]
We provide convergence guarantees with polynomial complexity for any data distribution with bounded second moment.
Our result does not rely on any log-concavity or functional inequality assumption.
Our theoretical analysis provides a comparison between different discrete approximations and may guide the choice of discretization points in practice.
arXiv Detail & Related papers (2022-11-03T15:51:00Z) - Sharper Rates and Flexible Framework for Nonconvex SGD with Client and
Data Sampling [64.31011847952006]
We revisit the problem of finding an approximately stationary point of the average of $n$ smooth and possibly nonconvex functions.
We generalize the PAGE algorithm so that it can provably work with virtually any sampling mechanism.
We provide the most general and most accurate analysis of the optimal bound in the smooth nonconvex regime.
arXiv Detail & Related papers (2022-06-05T21:32:33Z) - KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal [70.15267479220691]
We consider and analyze the sample complexity of model-free reinforcement learning with a generative model.
Our analysis shows that it is nearly minimax-optimal for finding an $\varepsilon$-optimal policy when $\varepsilon$ is sufficiently small.
arXiv Detail & Related papers (2022-05-27T19:39:24Z) - Faster Convergence of Stochastic Gradient Langevin Dynamics for
Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
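For reference, one SGLD step has the standard form below (a generic sketch of the update analyzed in such works; the gradient-estimator interface is an assumption, not this paper's code):

```python
import torch

def sgld_step(theta, stochastic_grad_fn, step_size):
    """One step of stochastic gradient Langevin dynamics (SGLD).

    `stochastic_grad_fn` is a hypothetical unbiased estimator of grad U(theta),
    where the target density is proportional to exp(-U).
    """
    noise = torch.randn_like(theta)
    # Gradient step on the potential plus injected Gaussian noise.
    return theta - step_size * stochastic_grad_fn(theta) \
        + (2.0 * step_size) ** 0.5 * noise
```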
arXiv Detail & Related papers (2020-10-19T15:23:18Z)