Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models
- URL: http://arxiv.org/abs/2506.02318v1
- Date: Mon, 02 Jun 2025 23:14:35 GMT
- Title: Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models
- Authors: Yuchen Liang, Renxiang Huang, Lifeng Lai, Ness Shroff, Yingbin Liang
- Abstract summary: We provide the first finite-time error bounds and convergence rate analysis for discrete diffusion models using absorbing rate matrices. We establish the first convergence guarantees for both the $\tau$-leaping and uniformization samplers under absorbing rate matrices. Under suitable assumptions, we provide convergence guarantees without early stopping.
- Score: 59.47572583027685
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discrete state space diffusion models have shown significant advantages in applications involving discrete data, such as text and image generation. It has also been observed that their performance is highly sensitive to the choice of rate matrices, particularly between uniform and absorbing rate matrices. While empirical results suggest that absorbing rate matrices often yield better generation quality compared to uniform rate matrices, existing theoretical works have largely focused on the uniform rate matrix case. Notably, convergence guarantees and error analyses for absorbing diffusion models are still missing. In this work, we provide the first finite-time error bounds and convergence rate analysis for discrete diffusion models using absorbing rate matrices. We begin by deriving an upper bound on the KL divergence of the forward process, introducing a surrogate initialization distribution to address the challenge posed by the absorbing stationary distribution, which is a singleton and causes the KL divergence to be ill-defined. We then establish the first convergence guarantees for both the $\tau$-leaping and uniformization samplers under absorbing rate matrices, demonstrating improved rates over their counterparts using uniform rate matrices. Furthermore, under suitable assumptions, we provide convergence guarantees without early stopping. Our analysis introduces several new technical tools to address challenges unique to absorbing rate matrices. These include a Jensen-type argument for bounding forward process convergence, novel techniques for bounding absorbing score functions, and a non-divergent upper bound on the score near initialization that removes the need for early stopping.
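The abstract's remark that the absorbing stationary distribution is a singleton, which makes the KL divergence ill-defined, can be illustrated with a small single-token sketch. Everything here is an assumption made for illustration and is not taken from the paper: the unit masking rate, the helper names `absorbing_marginal`, `kl`, and `smoothed_mask`, and in particular the smoothed distribution, which is only a hypothetical stand-in for the paper's surrogate initialization.

```python
import math


def absorbing_marginal(p0, t):
    """Marginal at time t of a single-token absorbing chain.

    Assumption: every data state jumps to the mask state at unit rate,
    so a token is still unmasked at time t with probability exp(-t).
    Returns probabilities over the data states followed by the mask state.
    """
    keep = math.exp(-t)
    return [keep * p for p in p0] + [1.0 - keep]


def kl(p, q):
    """KL(p || q) for finite distributions; infinite if q misses mass of p."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def smoothed_mask(num_states, eps):
    """Hypothetical surrogate: mass 1 - eps on the mask, eps spread uniformly."""
    return [eps / num_states] * num_states + [1.0 - eps]


p0 = [0.5, 0.3, 0.2]                    # toy data distribution over 3 states
q_t = absorbing_marginal(p0, t=5.0)     # forward marginal after time 5
point_mass = [0.0, 0.0, 0.0, 1.0]       # singleton stationary distribution
surrogate = smoothed_mask(3, eps=1e-3)  # illustrative smoothed target

print(kl(q_t, point_mass))  # infinite at every finite t
print(kl(q_t, surrogate))   # finite
```

At any finite time the forward marginal keeps some mass on unmasked states, so its KL divergence to the point mass on the mask is infinite, while the divergence to a smoothed target stays finite. This is the kind of obstacle a surrogate initialization is meant to avoid.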
Related papers
- Extreme value theory for singular subspace estimation in the matrix denoising model [0.4297070083645049]
We study fine-grained singular subspace estimation in the matrix denoising model. We apply our distributional theory to test hypotheses of low-rank signal structure encoded in the leading singular vectors.
arXiv Detail & Related papers (2025-07-26T15:28:36Z)
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z)
- Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach [49.97755400231656]
We show that a new accelerated DDPM sampler achieves accelerated performance for three broad distribution classes not considered before. Our results show an improved dependency on the data dimension $d$ among accelerated DDPM type samplers.
arXiv Detail & Related papers (2024-02-21T16:11:47Z)
- Beta Diffusion [69.61105403426778]
We introduce beta diffusion, a novel generative modeling method that integrates demasking and denoising to generate data within bounded ranges.
Beta diffusion is multiplicative and optimized with KL-divergence upper bounds (KLUBs) derived from the convexity of the KL divergence.
Experimental results on both synthetic data and natural images demonstrate the unique capabilities of beta diffusion in generative modeling of range-bounded data.
arXiv Detail & Related papers (2023-09-14T17:14:26Z)
- Leave-one-out Singular Subspace Perturbation Analysis for Spectral Clustering [7.342677574855651]
Singular subspace perturbation theory is of fundamental importance in probability and statistics.
We consider two arbitrary matrices where one is a leave-one-column-out submatrix of the other one.
It is well-suited for mixture models and results in a sharper and finer statistical analysis than classical perturbation bounds such as Wedin's Theorem.
arXiv Detail & Related papers (2022-05-30T05:07:09Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
arXiv Detail & Related papers (2021-06-09T12:13:51Z)
- Large Non-Stationary Noisy Covariance Matrices: A Cross-Validation Approach [1.90365714903665]
We introduce a novel covariance estimator that exploits the heteroscedastic nature of financial time series.
By attenuating the noise from both the cross-sectional and time-series dimensions, we empirically demonstrate the superiority of our estimator over competing estimators.
arXiv Detail & Related papers (2020-12-10T15:41:17Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Covariance Estimation for Matrix-valued Data [9.739753590548796]
We propose a class of distribution-free regularized covariance estimation methods for high-dimensional matrix data.
We formulate a unified framework for estimating bandable covariance, and introduce an efficient algorithm based on rank one unconstrained Kronecker product approximation.
We demonstrate the superior finite-sample performance of our methods using simulations and real applications from a gridded temperature anomalies dataset and an S&P 500 stock data analysis.
arXiv Detail & Related papers (2020-04-11T02:15:26Z)