Related papers: Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees

Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees

URL: http://arxiv.org/abs/2509.16756v2
Date: Fri, 31 Oct 2025 16:25:35 GMT
Title: Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees
Authors: Yuchen Liang, Yingbin Liang, Lifeng Lai, Ness Shroff,
Abstract summary: We introduce a new analytical approach for discrete diffusion models that removes the need for regularity assumptions.<n>For the standard $tau$-leaping method, we establish convergence guarantees in KL divergence that scale linearly with vocabulary size.<n>Our approach is also more broadly applicable: it provides the first convergence guarantees for other widely used samplers.
Score: 70.88473359544084
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Discrete diffusion models have recently gained significant prominence in applications involving natural language and graph data. A key factor influencing their effectiveness is the efficiency of discretized samplers. Among these, $\tau$-leaping samplers have become particularly popular due to their theoretical and empirical success. However, existing theoretical analyses of $\tau$-leaping often rely on somewhat restrictive and difficult-to-verify regularity assumptions, and their convergence bounds contain quadratic dependence on the vocabulary size. In this work, we introduce a new analytical approach for discrete diffusion models that removes the need for such assumptions. For the standard $\tau$-leaping method, we establish convergence guarantees in KL divergence that scale linearly with vocabulary size, improving upon prior results with quadratic dependence. Our approach is also more broadly applicable: it provides the first convergence guarantees for other widely used samplers, including the Euler method and Tweedie $\tau$-leaping. Central to our approach is a novel technique based on differential inequalities, offering a more flexible alternative to the traditional Girsanov change-of-measure methods. This technique may also be of independent interest for the analysis of other stochastic processes.

Related papers

Bridge Matching Sampler: Scalable Sampling via Generalized Fixed-Point Diffusion Matching [38.70740405520393]
Bridge Matching Sampler (BMS) enables learning a transport map between arbitrary prior and target distributions with a single, scalable, and stable objective.<n>We demonstrate that our method enables sampling at unprecedented scales while preserving mode diversity, achieving state-of-the-art results on complex synthetic densities and high-dimensional molecular benchmarks.
arXiv Detail & Related papers (2026-02-28T08:00:38Z)
Sharp Convergence Rates for Masked Diffusion Models [53.117058231393834]
We develop a total-variation based analysis for the Euler method that overcomes limitations.<n>Our results relax assumptions on score estimation, improve parameter dependencies, and establish convergence guarantees.<n>Overall, our analysis introduces a direct TV-based error decomposition along the CTMC trajectory and a decoupling-based path-wise analysis for FHS.
arXiv Detail & Related papers (2026-02-26T00:47:51Z)
Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [62.640128548633946]
We introduce a novel inference-time scaling approach based on particle Gibbs sampling for discrete diffusion models.<n>Our method consistently outperforms prior inference-time strategies on reward-guided text generation tasks.
arXiv Detail & Related papers (2025-07-11T08:00:47Z)
Generalized Interpolating Discrete Diffusion [65.74168524007484]
Masked diffusion is a popular choice due to its simplicity and effectiveness.<n>We generalize a new family of general interpolating discrete diffusion (GIDD) which offers greater flexibility in the design of the noising processes.<n>Exploiting GIDD's flexibility, we explore a hybrid approach combining masking and uniform noise, leading to improved sample quality.
arXiv Detail & Related papers (2025-03-06T14:30:55Z)
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics [7.873510219469276]
We introduce two novel training methods for discrete diffusion samplers.<n>These methods yield memory-efficient training and achieve state-of-the-art results in unsupervised optimization.<n>We introduce adaptations of SN-NIS and Neural Chain Monte Carlo that enable for the first time the application of discrete diffusion models to this problem.
arXiv Detail & Related papers (2025-02-12T18:59:55Z)
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization [23.972397132797116]
Current deep learning approaches rely on generative models that yield exact sample likelihoods.<n>This work introduces a method that lifts this restriction and opens the possibility to employ highly expressive latent variable models.<n>We experimentally validate our approach in data-free Combinatorial Optimization and demonstrate that our method achieves a new state-of-the-art on a wide range of benchmark problems.
arXiv Detail & Related papers (2024-06-03T17:55:02Z)
FEMDA: a unified framework for discriminant analysis [4.6040036610482655]
We present a novel approach to deal with non-Gaussian datasets. The model considered is an arbitraryly Symmetrical (ES) distribution per cluster with its own arbitrary scale parameter. By deriving a new decision rule, we demonstrate that maximum-likelihood parameter estimation and classification are simple, efficient, and robust compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-11-13T17:59:37Z)
Towards Better Certified Segmentation via Diffusion Models [62.21617614504225]
segmentation models can be vulnerable to adversarial perturbations, which hinders their use in critical-decision systems like healthcare or autonomous driving. Recently, randomized smoothing has been proposed to certify segmentation predictions by adding Gaussian noise to the input to obtain theoretical guarantees. In this paper, we address the problem of certifying segmentation prediction using a combination of randomized smoothing and diffusion models.
arXiv Detail & Related papers (2023-06-16T16:30:39Z)
Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models. In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
Federated Variational Inference Methods for Structured Latent Variable Models [1.0312968200748118]
Federated learning methods enable model training across distributed data sources without data leaving their original locations. We present a general and elegant solution based on structured variational inference, widely used in Bayesian machine learning. We also provide a communication-efficient variant analogous to the canonical FedAvg algorithm.
arXiv Detail & Related papers (2023-02-07T08:35:04Z)
How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution. We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing distance between embeddings/representations of original samples and augmented counterparts is a popular technique for improving robustness of neural networks. In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings. We show that the generic approach we identified (squared $ell$ regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
arXiv Detail & Related papers (2020-11-25T22:40:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.