Related papers: Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis

URL: http://arxiv.org/abs/2410.02321v1
Date: Thu, 3 Oct 2024 09:07:13 GMT
Title: Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis
Authors: Zikun Zhang, Zixiang Chen, Quanquan Gu
Abstract summary: We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
Score: 56.442307356162864
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion models have achieved great success in generating high-dimensional samples across various applications. While the theoretical guarantees for continuous-state diffusion models have been extensively studied, the convergence analysis of the discrete-state counterparts remains under-explored. In this paper, we study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points. We derive convergence bounds for the Kullback-Leibler (KL) divergence and total variation (TV) distance between the generated sample distribution and the data distribution, considering both scenarios with and without early stopping under specific assumptions. Notably, our KL divergence bounds are nearly linear in dimension $d$, aligning with state-of-the-art results for diffusion models. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function, which are essential for characterizing the discrete-time sampling process.
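For readers unfamiliar with the CTMC setting on $[S]^d$, the following is a minimal sketch of a forward noising process, not the paper's sampling algorithm. It assumes a uniform rate matrix Q = (1/S) * ones - I applied independently to each coordinate, which is a common choice in the discrete diffusion literature but is an assumption here; the function names `transition_matrix` and `forward_noise` are hypothetical. Under this assumption Q^2 = -Q, so the transition kernel has the closed form exp(tQ) = e^{-t} I + (1 - e^{-t}) (1/S) ones.

```python
import math
import random


def transition_matrix(S, t):
    """Closed-form kernel exp(tQ) for the assumed uniform rate matrix
    Q = (1/S) * ones - I on a single coordinate with S states."""
    keep = math.exp(-t)  # probability that no resampling event has occurred
    return [[keep * (i == j) + (1.0 - keep) / S for j in range(S)]
            for i in range(S)]


def forward_noise(x, S, t, rng):
    """Noise a state x in [S]^d for time t, one coordinate at a time.

    Each coordinate is kept with probability e^{-t}; otherwise it is
    resampled uniformly from {0, ..., S-1}, matching transition_matrix.
    """
    keep = math.exp(-t)
    return [xi if rng.random() < keep else rng.randrange(S) for xi in x]


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [0, 1, 2, 3, 0, 1, 2, 3]  # a point in [4]^8
    print(forward_noise(x0, S=4, t=0.5, rng=rng))
```

As t grows, every row of the kernel approaches the uniform distribution on the S states, which is the easy-to-sample reference distribution that a reverse-time sampler would start from.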

Related papers

Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [62.640128548633946]
We introduce a novel inference-time scaling approach based on particle Gibbs sampling for discrete diffusion models. Our method consistently outperforms prior inference-time strategies on reward-guided text generation tasks.
arXiv Detail & Related papers (2025-07-11T08:00:47Z)
Continuous Diffusion Model for Language Modeling [57.396578974401734]
Existing continuous diffusion models for discrete data have limited performance compared to discrete approaches. We propose a continuous diffusion model for language modeling that incorporates the geometry of the underlying categorical distribution.
arXiv Detail & Related papers (2025-02-17T08:54:29Z)
Finite-Time Analysis of Discrete-Time Stochastic Interpolants [32.27430900126022]
We present the first discrete-time analysis of the interpolant framework, where we derive a finite-time upper bound on its distribution estimation error. Our result provides a novel way to design efficient schedules for convergence acceleration.
arXiv Detail & Related papers (2025-02-13T10:07:35Z)
Information-Theoretic Proofs for Diffusion Sampling [13.095978794717007]
This paper provides an elementary, self-contained analysis of diffusion-based sampling methods for generative modeling. We show that, if the diffusion step sizes are chosen sufficiently small, then the sampling distribution is provably close to the target distribution. Our results also provide a transparent view on how to accelerate convergence by introducing additional randomness in each step to match higher order moments in the comparison process.
arXiv Detail & Related papers (2025-02-04T13:19:21Z)
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependence for general score-mismatched diffusion samplers. We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework [11.71206628091551]
We propose a comprehensive framework for the error analysis of discrete diffusion models based on Lévy-type integrals. Our framework unifies and strengthens the current theoretical results on discrete diffusion models.
arXiv Detail & Related papers (2024-10-04T16:59:29Z)
Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous Markov chains, implementing transitions on random time points. Our results align with state-of-the-art achievements for diffusion models in $\mathbb{R}^d$ and further underscore the advantages of discrete diffusion models in comparison to the $\mathbb{R}^d$ setting.
arXiv Detail & Related papers (2024-02-12T22:26:52Z)
A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE. We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z)
Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces [0.0]
We develop a theoretical formulation for arbitrary discrete-state Markov processes in the forward diffusion process. As an example, we introduce "Blackout Diffusion", which learns to produce samples from an empty image instead of from noise.
arXiv Detail & Related papers (2023-05-18T16:24:12Z)
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
Mathematical analysis of singularities in the diffusion model under the submanifold assumption [0.0]
We show that the analytical mean drift function in DDPM and the score function in SGM asymptotically blow up in the final stages of the sampling process for singular data distributions. We derive a new target function and associated loss, which remains bounded even for singular data distributions.
arXiv Detail & Related papers (2023-01-19T05:13:03Z)
Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain. We show that an unbiased estimator can be obtained via simple matching of the conditional marginal distributions. We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region. Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.