Accelerating Diffusion Models with Parallel Sampling: Inference at Sub-Linear Time Complexity
- URL: http://arxiv.org/abs/2405.15986v1
- Date: Fri, 24 May 2024 23:59:41 GMT
- Title: Accelerating Diffusion Models with Parallel Sampling: Inference at Sub-Linear Time Complexity
- Authors: Haoxuan Chen, Yinuo Ren, Lexing Ying, Grant M. Rotskoff
- Abstract summary: Diffusion models are costly to train and evaluate, so reducing their inference cost remains a major goal.
Inspired by the recent empirical success in accelerating diffusion models via the parallel sampling technique~\cite{shih2024parallel}, we propose to divide the sampling process into $\mathcal{O}(1)$ blocks with parallelizable Picard iterations within each block.
Our results shed light on the potential of fast and efficient sampling of high-dimensional data on fast-evolving modern large-memory GPU clusters.
- Score: 11.71206628091551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have become a leading method for generative modeling of both image and scientific data. As these models are costly to train and evaluate, reducing the inference cost for diffusion models remains a major goal. Inspired by the recent empirical success in accelerating diffusion models via the parallel sampling technique~\cite{shih2024parallel}, we propose to divide the sampling process into $\mathcal{O}(1)$ blocks with parallelizable Picard iterations within each block. Rigorous theoretical analysis reveals that our algorithm achieves $\widetilde{\mathcal{O}}(\mathrm{poly} \log d)$ overall time complexity, marking the first implementation with provable sub-linear complexity w.r.t. the data dimension $d$. Our analysis is based on a generalized version of Girsanov's theorem and is compatible with both the SDE and probability flow ODE implementations. Our results shed light on the potential of fast and efficient sampling of high-dimensional data on fast-evolving modern large-memory GPU clusters.
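Below is a minimal NumPy sketch of the blockwise parallel Picard idea described in the abstract. The `drift` function (a toy stand-in for a learned score model), the block count, grid sizes, and iteration budget are illustrative assumptions and do not reproduce the paper's actual discretization or stopping rules.

```python
# A minimal sketch of blockwise parallel Picard iteration for a probability
# flow ODE. The `drift` function, block/grid sizes, and iteration counts are
# illustrative assumptions; the paper's discretization and analysis differ.
import numpy as np

def drift(x, t):
    # Stand-in for the probability flow ODE drift -x/2 - score(x, t)/2.
    # Here the score of a toy N(2, I) target replaces a learned score model.
    score = -(x - 2.0)
    return -0.5 * x - 0.5 * score

def picard_block(x0, t_grid, num_iters):
    """Integrate one block [t_grid[0], t_grid[-1]] by Picard iteration.

    Within one iteration, all drift evaluations along the time grid are
    independent, so on a GPU they could be issued as a single batched call.
    """
    M = len(t_grid)
    dt = np.diff(t_grid)                          # (M-1,) step sizes
    xs = np.repeat(x0[None, :], M, axis=0)        # initial guess: constant path
    for _ in range(num_iters):
        f = np.stack([drift(xs[m], t_grid[m]) for m in range(M)])  # parallelizable
        increments = f[:-1] * dt[:, None]         # left-endpoint quadrature
        xs = x0[None, :] + np.vstack([np.zeros((1, x0.size)),
                                      np.cumsum(increments, axis=0)])
    return xs[-1]

def sample(dim, num_blocks=4, steps_per_block=32, picard_iters=8,
           t_end=5.0, seed=0):
    """Draw one sample by integrating the ODE from a Gaussian prior,
    block by block, with O(1) blocks overall."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    edges = np.linspace(0.0, t_end, num_blocks + 1)
    for a, b in zip(edges[:-1], edges[1:]):
        x = picard_block(x, np.linspace(a, b, steps_per_block + 1), picard_iters)
    return x

if __name__ == "__main__":
    print(sample(dim=8))
```

The point of this structure is that the serial depth is `num_blocks * picard_iters` rather than the total number of grid points; within each Picard iteration, the drift evaluations across a block are independent and can be batched, which is where the parallel speedup comes from.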
Related papers
- Improved Convergence Rate for Diffusion Probabilistic Models [7.237817437521988]
Score-based diffusion models have achieved remarkable empirical performance in the field of machine learning and artificial intelligence.
Despite many theoretical attempts, a significant gap remains between theory and practice.
We establish an iteration complexity on the order of $d^{2/3}\varepsilon^{-2/3}$, which is better than $d^{5/12}\varepsilon^{-1}$.
Our theory accommodates $\varepsilon$-accurate score estimates and does not require log-concavity of the target distribution.
arXiv Detail & Related papers (2024-10-17T16:37:33Z) - A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models [45.60426164657739]
We develop non-asymptotic convergence theory for a diffusion-based sampler.
We prove that $d/\varepsilon$ iterations are sufficient to approximate the target distribution to within $\varepsilon$ in total-variation distance.
Our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes.
arXiv Detail & Related papers (2024-08-05T09:02:24Z) - On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z) - Accelerating Parallel Sampling of Diffusion Models [25.347710690711562]
We propose a novel approach that accelerates the sampling of diffusion models by parallelizing the autoregressive process.
Applying these techniques, we introduce ParaTAA, a universal and training-free parallel sampling algorithm.
Our experiments demonstrate that ParaTAA can reduce the number of inference steps required by common sequential sampling algorithms by a factor of 4$\sim$14.
arXiv Detail & Related papers (2024-02-15T14:27:58Z) - Efficient Integrators for Diffusion Generative Models [22.01769257075573]
Diffusion models suffer from slow sample generation at inference time.
We propose two complementary frameworks for accelerating sample generation in pre-trained models.
We present a hybrid method that leads to the best-reported performance for diffusion models in augmented spaces.
arXiv Detail & Related papers (2023-10-11T21:04:42Z) - DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models [58.450152413700586]
We introduce a soft absorbing state that facilitates the diffusion model in learning to reconstruct discrete mutations based on the underlying Gaussian space.
We employ state-of-the-art ODE solvers within the continuous space to expedite the sampling process.
Our proposed method effectively accelerates the training convergence by 4x and generates samples of similar quality 800x faster.
arXiv Detail & Related papers (2023-10-09T15:29:10Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal loss in model performance (a toy sketch of this inverse-free gradient estimation appears after this list).
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - Fast Sampling of Diffusion Models via Operator Learning [74.37531458470086]
We use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models.
Compared to other fast sampling methods that have a sequential nature, we are the first to propose a parallel decoding method.
We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
arXiv Detail & Related papers (2022-11-24T07:30:27Z)
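For the probabilistic-unrolling entry above, here is a toy sketch of the "Monte Carlo probes plus iterative linear solves" idea under an assumed setup: the gradient of a latent Gaussian marginal likelihood is written in terms of products $C^{-1}b$ with $C = WW^\top + \sigma^2 I$, each obtained from a few conjugate-gradient iterations rather than an explicit matrix inverse. The model, probe count, and iteration budget below are illustrative, not the authors' code.

```python
# A toy sketch of inverse-free gradient estimation for a latent Gaussian model:
# Hutchinson probes estimate the trace term, and every C^{-1} b product is a
# short conjugate-gradient solve. Setup and constants are assumptions.
import numpy as np

def conjugate_gradient(matvec, b, num_iters=20, tol=1e-8):
    """Solve A x = b for symmetric positive-definite A, given only v -> A v."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(num_iters):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def grad_log_marginal_sigma2(W, sigma2, y, num_probes=16, seed=0):
    """Monte Carlo estimate of d/d(sigma^2) log N(y; 0, W W^T + sigma^2 I).

    Exact gradient: 0.5 * (y^T C^{-2} y - tr(C^{-1})) with C = W W^T + sigma^2 I.
    The trace is estimated with Rademacher probes; no matrix is inverted.
    """
    rng = np.random.default_rng(seed)
    d = y.shape[0]
    matvec = lambda v: W @ (W.T @ v) + sigma2 * v   # C v without forming C

    Cinv_y = conjugate_gradient(matvec, y)
    quad = Cinv_y @ Cinv_y                          # y^T C^{-2} y

    trace_est = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=d)         # Rademacher probe vector
        trace_est += v @ conjugate_gradient(matvec, v)
    trace_est /= num_probes                         # ~ tr(C^{-1})

    return 0.5 * (quad - trace_est)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.standard_normal((50, 5))
    y = rng.standard_normal(50)
    print(grad_log_marginal_sigma2(W, sigma2=0.5, y=y))
```

Because the solver iterations are just matrix-vector products, they can be unrolled and backpropagated through, which is the mechanism the paper exploits for fast maximum likelihood estimation.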
This list is automatically generated from the titles and abstracts of the papers on this site.