Theory of Speciation Transitions in Diffusion Models with General Class Structure
- URL: http://arxiv.org/abs/2602.04404v1
- Date: Wed, 04 Feb 2026 10:35:47 GMT
- Title: Theory of Speciation Transitions in Diffusion Models with General Class Structure
- Authors: Beatrice Achilli, Marco Benedetti, Giulio Biroli, Marc Mézard,
- Abstract summary: Diffusion models generate data by reversing a diffusion process, transforming noise into structured samples drawn from a target distribution. Recent theoretical work has shown that this backward dynamics can undergo sharp qualitative transitions, known as speciation transitions. We develop a general theory of speciation in diffusion models that applies to arbitrary target distributions admitting well-defined classes.
- Score: 5.939780039158003
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Diffusion Models generate data by reversing a stochastic diffusion process, progressively transforming noise into structured samples drawn from a target distribution. Recent theoretical work has shown that this backward dynamics can undergo sharp qualitative transitions, known as speciation transitions, during which trajectories become dynamically committed to data classes. Existing theoretical analyses, however, are limited to settings where classes are identifiable through first moments, such as mixtures of Gaussians with well-separated means. In this work, we develop a general theory of speciation in diffusion models that applies to arbitrary target distributions admitting well-defined classes. We formalize the notion of class structure through Bayes classification and characterize speciation times in terms of free-entropy difference between classes. This criterion recovers known results in previously studied Gaussian-mixture models, while extending to situations in which classes are not distinguishable by first moments and may instead differ through higher-order or collective features. Our framework also accommodates multiple classes and predicts the existence of successive speciation times associated with increasingly fine-grained class commitment. We illustrate the theory on two analytically tractable examples: mixtures of one-dimensional Ising models at different temperatures and mixtures of zero-mean Gaussians with distinct covariance structures. In the Ising case, we obtain explicit expressions for speciation times by mapping the problem onto a random-field Ising model and solving it via the replica method. Our results provide a unified and broadly applicable description of speciation transitions in diffusion-based generative models.
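The class-commitment phenomenon behind speciation can be illustrated numerically. The following is a minimal, hypothetical sketch (not the paper's free-entropy calculation): two zero-mean 1-D Gaussian classes distinguished only by their variances, with illustrative parameter values. Tracking the Bayes classifier along the variance-preserving forward process shows class information decaying to chance as noise grows, which is the regime in which backward trajectories can no longer commit reliably to a class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two zero-mean classes distinguished only by their (co)variance; the
# parameter values here are hypothetical, chosen for illustration.
sigma1, sigma2 = 1.0, 3.0
n = 20_000

def posterior_class1(x_t, alpha):
    """Bayes posterior P(class 1 | x_t) under the variance-preserving
    forward process x_t = sqrt(alpha) * x_0 + sqrt(1 - alpha) * z."""
    v1 = alpha * sigma1**2 + (1.0 - alpha)  # Var(x_t | class 1)
    v2 = alpha * sigma2**2 + (1.0 - alpha)  # Var(x_t | class 2)
    # Equal priors; compare Gaussian log-likelihoods for stability.
    log_l1 = -0.5 * (np.log(v1) + x_t**2 / v1)
    log_l2 = -0.5 * (np.log(v2) + x_t**2 / v2)
    return 1.0 / (1.0 + np.exp(log_l2 - log_l1))

# Balanced mixture of clean samples from the two classes.
x0 = np.concatenate([rng.normal(0, sigma1, n), rng.normal(0, sigma2, n)])
is_class1 = np.arange(2 * n) < n

# As alpha shrinks (more noise), the Bayes classifier on x_t decays to
# chance: class information is destroyed by the forward process.
for alpha in (1.0, 0.5, 0.1, 0.01):
    x_t = np.sqrt(alpha) * x0 + np.sqrt(1.0 - alpha) * rng.normal(size=2 * n)
    pred_class1 = posterior_class1(x_t, alpha) > 0.5
    acc = np.mean(pred_class1 == is_class1)
    print(f"alpha = {alpha:5.2f}: Bayes accuracy = {acc:.3f}")
```

Note that the classes here are indistinguishable by first moments (both are zero-mean), which is exactly the setting the paper's criterion extends to.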
Related papers
- Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation [56.361076943802594]
CanonFlow achieves state-of-the-art performance on the challenging GEOM-DRUG dataset, and the advantage remains large in few-step generation.
arXiv Detail & Related papers (2026-02-16T18:58:55Z) - Diffusion Model's Generalization Can Be Characterized by Inductive Biases toward a Data-Dependent Ridge Manifold [19.059115911590776]
We explicitly characterize what a diffusion model generates by proposing a log-density ridge manifold. We show how the generated data relate to this manifold as the inference dynamics progress. A more detailed understanding of the training dynamics will lead to more accurate quantification of the generative inductive bias.
arXiv Detail & Related papers (2026-02-05T18:55:03Z) - Foundations of Diffusion Models in General State Spaces: A Self-Contained Introduction [54.95522167029998]
This article is a self-contained primer on diffusion over general state spaces. We develop the discrete-time view (forward noising via Markov kernels and learned reverse dynamics) alongside its continuous-time limits. A common variational treatment yields the ELBO that underpins standard training losses.
arXiv Detail & Related papers (2025-12-04T18:55:36Z) - Can Diffusion Models Disentangle? A Theoretical Perspective [37.21661224725838]
This paper presents a novel theoretical framework for understanding how diffusion models can learn disentangled representations. We establish identifiability conditions for general disentangled latent variable models, analyze training dynamics, and derive sample complexity bounds for disentangled latent subspace models.
arXiv Detail & Related papers (2025-03-31T20:46:18Z) - A solvable generative model with a linear, one-step denoiser [0.0]
We develop an analytically tractable single-step diffusion model based on a linear denoiser and present an explicit formula for the Kullback-Leibler divergence. For large-scale practical diffusion models, we explain why a higher number of diffusion steps enhances sample quality.
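A toy instance of such a model can be written down in closed form. The sketch below assumes a 1-D Gaussian target and variance-preserving noising (illustrative assumptions, not the paper's exact setup): the optimal one-step denoiser is the linear posterior mean, which is a contraction, so a single deterministic step yields an under-dispersed Gaussian whose KL gap to the target shrinks only as the noising becomes mild.

```python
import numpy as np

# Hypothetical 1-D Gaussian target N(0, sigma0^2); illustration only.
sigma0 = 1.0

def kl_gauss(s2_model, s2_target):
    """KL( N(0, s2_model) || N(0, s2_target) ) for zero-mean Gaussians."""
    return 0.5 * (np.log(s2_target / s2_model) + s2_model / s2_target - 1.0)

def one_step_output_var(abar):
    """Variance of the one-step sample c * x_T with prior x_T ~ N(0, 1),
    where c is the optimal linear (posterior-mean) denoiser coefficient
    for the noising x_t = sqrt(abar) * x_0 + sqrt(1 - abar) * z."""
    c = np.sqrt(abar) * sigma0**2 / (abar * sigma0**2 + 1.0 - abar)
    return c**2

# The posterior-mean map contracts, so the one-step output is always
# under-dispersed; the KL to the target closes as abar -> 1 (mild noise).
for abar in (0.1, 0.5, 0.9, 0.99):
    s2 = one_step_output_var(abar)
    print(f"abar = {abar:4.2f}: output var = {s2:.3f}, "
          f"KL to target = {kl_gauss(s2, sigma0**2):.4f}")
```

The monotone decay of this KL with milder noising is one intuition for why splitting generation into more, smaller denoising steps (with noise re-injected between them) improves the generated distribution.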
arXiv Detail & Related papers (2024-11-26T19:00:01Z) - Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instilling task-specific information into the score function to steer sample generation toward desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous Markov chains, implementing transitions on random time points.
Our results align with state-of-the-art achievements for diffusion models in $\mathbb{R}^d$ and further underscore the advantages of discrete diffusion models in comparison to the $\mathbb{R}^d$ setting.
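As a concrete, library-free illustration of the uniformization construction named above (a generic sketch, not the paper's algorithm), the following simulates a small CTMC exactly by subordinating a discrete chain to a Poisson clock, and checks the time-$t$ marginal against the matrix exponential. The 3-state generator matrix is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-state generator matrix (rows sum to zero); illustration only.
Q = np.array([[-1.0, 0.6, 0.4],
              [0.3, -0.8, 0.5],
              [0.2, 0.7, -0.9]])

def sample_ctmc_uniformized(x0, t, Q, rng):
    """Sample X_t of the CTMC with generator Q, started at state x0, via
    uniformization: draw a Poisson(lam * t) number of candidate jump times
    and step the discrete chain P = I + Q / lam at each of them."""
    lam = np.max(-np.diag(Q))          # dominating jump rate
    P = np.eye(Q.shape[0]) + Q / lam   # transition matrix (incl. self-loops)
    x = x0
    for _ in range(rng.poisson(lam * t)):
        x = rng.choice(Q.shape[0], p=P[x])
    return x

t = 2.0
samples = [sample_ctmc_uniformized(0, t, Q, rng) for _ in range(20_000)]
empirical = np.bincount(samples, minlength=3) / len(samples)

# Exact marginal p(t) = e_0 @ expm(Q * t), via eigendecomposition of Q.
evals, V = np.linalg.eig(Q)
expQt = (V @ np.diag(np.exp(evals * t)) @ np.linalg.inv(V)).real
exact = expQt[0]
print("empirical:", np.round(empirical, 3))
print("exact:    ", np.round(exact, 3))
```

Because the Poisson jump count is exact rather than a time-discretization, the sampled law matches $e^{Qt}$ up to Monte Carlo error, which is the sense in which uniformization gives an exact implementation.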
arXiv Detail & Related papers (2024-02-12T22:26:52Z) - On the Generalization Properties of Diffusion Models [31.067038651873126]
This work embarks on a comprehensive theoretical exploration of the generalization attributes of diffusion models. We establish theoretical estimates of the generalization gap that evolves in tandem with the training dynamics of score-based diffusion models. We extend our quantitative analysis to a data-dependent scenario, wherein target distributions are portrayed as a succession of densities.
arXiv Detail & Related papers (2023-11-03T09:20:20Z) - Renormalizing Diffusion Models [0.7252027234425334]
We use diffusion models to learn inverse renormalization group flows of statistical and quantum field theories.
Our work provides an interpretation of multiscale diffusion models, and gives physically-inspired suggestions for diffusion models which should have novel properties.
arXiv Detail & Related papers (2023-08-23T18:02:31Z) - Nonparametric plug-in classifier for multiclass classification of S.D.E. paths [2.1301560294088318]
We study the multiclass classification problem where the features come from the mixture of time-homogeneous diffusions.
Specifically, the classes are discriminated by their drift functions while the diffusion coefficient is common to all classes and unknown.
arXiv Detail & Related papers (2022-12-20T14:08:05Z) - On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a model's accuracy and invariance are linearly correlated across different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z) - Why do classifier accuracies show linear trends under distribution shift? [58.40438263312526]
Accuracies of models on one data distribution are approximately linear functions of their accuracies on another distribution.
We assume the probability that two models agree in their predictions is higher than what we can infer from their accuracy levels alone.
We show that a linear trend must occur when evaluating models on two distributions unless the size of the distribution shift is large.
arXiv Detail & Related papers (2020-12-31T07:24:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.