Related papers: Entropy-Based Dimension-Free Convergence and Loss-Adaptive Schedules for Diffusion Models

URL: http://arxiv.org/abs/2601.21943v1
Date: Thu, 29 Jan 2026 16:28:21 GMT
Title: Entropy-Based Dimension-Free Convergence and Loss-Adaptive Schedules for Diffusion Models
Authors: Ahmad Aghapour, Erhan Bayraktar, Ziqing Zhang
Abstract summary: Diffusion generative models synthesize samples by discretizing reverse-time dynamics driven by a learned score (or denoiser). We develop an information-theoretic approach to dimension-free convergence that avoids geometric assumptions. We also propose a Loss-Adaptive Schedule (LAS) for efficient discretization of the reverse SDE that is lightweight and relies only on the training loss.
Score: 3.2091923314854416
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion generative models synthesize samples by discretizing reverse-time dynamics driven by a learned score (or denoiser). Existing convergence analyses of diffusion models typically scale at least linearly with the ambient dimension, and sharper rates often depend on intrinsic-dimension assumptions or other geometric restrictions on the target distribution. We develop an alternative, information-theoretic approach to dimension-free convergence that avoids any geometric assumptions. Under mild assumptions on the target distribution, we bound KL divergence between the target and generated distributions by $O(H^2/K)$ (up to endpoint factors), where $H$ is the Shannon entropy and $K$ is the number of sampling steps. Moreover, using a reformulation of the KL divergence, we propose a Loss-Adaptive Schedule (LAS) for efficient discretization of reverse SDE which is lightweight and relies only on the training loss, requiring no post-training heavy computation. Empirically, LAS improves sampling quality over common heuristic schedules.

Related papers

Efficient Sampling with Discrete Diffusion Models: Sharp and Adaptive Guarantees [9.180350432640912]
We study the sampling efficiency of score-based discrete diffusion models under a continuous-time Markov chain (CTMC) formulation. For uniform discrete diffusion, we show that the $\tau$-leaping algorithm achieves a complexity of order $\tilde O(d/\varepsilon)$. For masking discrete diffusion, we introduce a modified $\tau$-leaping sampler whose convergence rate is governed by an intrinsic information-theoretic quantity.
arXiv Detail & Related papers (2026-02-16T18:48:17Z)
Fast Sampling for Flows and Diffusions with Lazy and Point Mass Stochastic Interpolants [5.492889521988414]
We prove how to convert a sample path of a stochastic differential equation (SDE) with arbitrary diffusion coefficient under any schedule. We then extend the interpolant framework to admit a larger class of point mass schedules.
arXiv Detail & Related papers (2026-02-03T17:48:34Z)
An Elementary Approach to Scheduling in Generative Diffusion Models [55.171367482496755]
An elementary approach to characterizing the impact of noise scheduling and time discretization in generative diffusion models is developed. Experiments across different datasets and pretrained models demonstrate that the time discretization strategy selected by our approach consistently outperforms baseline and search-based strategies.
arXiv Detail & Related papers (2026-01-20T05:06:26Z)
Adaptive Symmetrization of the KL Divergence [10.632997610787207]
Many tasks in machine learning can be described as or reduced to learning a probability distribution given a finite set of samples. A common approach is to minimize a statistical divergence between the (empirical) data distribution and a parameterized distribution, e.g., a normalizing flow (NF) or an energy-based model (EBM).
arXiv Detail & Related papers (2025-11-14T10:41:59Z)
Generative Latent Neural PDE Solver using Flow Matching [8.397730500554047]
We propose a latent diffusion model for PDE simulation that embeds the PDE state in a lower-dimensional latent space. Our framework uses an autoencoder to map different types of meshes onto a unified structured latent grid, capturing complex geometries. Numerical experiments show that the proposed model outperforms several deterministic baselines in both accuracy and long-term stability.
arXiv Detail & Related papers (2025-03-28T16:44:28Z)
Low-dimensional adaptation of diffusion models: Convergence in total variation [13.218641525691195]
We investigate how diffusion generative models leverage (unknown) low-dimensional structure to accelerate sampling. Our findings provide the first rigorous evidence for the adaptivity of the DDIM-type samplers to unknown low-dimensional structure.
arXiv Detail & Related papers (2025-01-22T16:12:33Z)
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers. We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z)
O(d/T) Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions [6.76974373198208]
We establish a fast convergence theory for the denoising diffusion probabilistic model (DDPM) under minimal assumptions. We show that the convergence rate improves to $O(k/T)$, where $k$ is the intrinsic dimension of the target data distribution. This highlights the ability of DDPM to automatically adapt to unknown low-dimensional structures.
arXiv Detail & Related papers (2024-09-27T17:59:10Z)
On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution. In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z)
Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution. We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models. In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.