A Unification of Discrete, Gaussian, and Simplicial Diffusion
- URL: http://arxiv.org/abs/2512.15923v1
- Date: Wed, 17 Dec 2025 19:39:33 GMT
- Title: A Unification of Discrete, Gaussian, and Simplicial Diffusion
- Authors: Nuria Alina Chandra, Yucen Lily Li, Alan N. Amin, Alex Ali, Joshua Rollins, Sebastian W. Ober, Aniruddh Raghu, Andrew Gordon Wilson
- Abstract summary: We show that Wright-Fisher simplicial diffusion is more stable and outperforms previous simplicial diffusion models on conditional DNA generation. We also show that we can train models on multiple domains at once that are competitive with models trained on any individual domain.
- Score: 32.86558714543654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To model discrete sequences such as DNA, proteins, and language using diffusion, practitioners must choose between three major methods: diffusion in discrete space, Gaussian diffusion in Euclidean space, or diffusion on the simplex. Despite their shared goal, these models have disparate algorithms, theoretical structures, and trade-offs: discrete diffusion has the most natural domain, Gaussian diffusion has more mature algorithms, and diffusion on the simplex in principle combines the strengths of the other two but in practice suffers from a numerically unstable stochastic process. Ideally, we could view each of these models as instances of the same underlying framework and enable practitioners to switch between models for downstream applications. However, previous theories have only considered connections in special cases. Here we build a theory unifying all three methods of discrete diffusion as different parameterizations of the same underlying process: the Wright-Fisher population genetics model. In particular, we recover simplicial and Gaussian diffusion as two large-population limits. Our theory formally connects the likelihoods and hyperparameters of these models and leverages decades of mathematical genetics literature to unlock stable simplicial diffusion. Finally, we relieve the practitioner of balancing model trade-offs by demonstrating that it is possible to train a single model that can perform diffusion in any of these three domains at test time. Our experiments show that Wright-Fisher simplicial diffusion is more stable and outperforms previous simplicial diffusion models on conditional DNA generation. We also show that we can train models on multiple domains at once that are competitive with models trained on any individual domain.
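The abstract leaves the forward process implicit; as a minimal sketch, one generation of a standard Wright-Fisher model with mutation (the population-genetics process the paper builds on) can be simulated as below. The function name, population size, and mutation rate are illustrative choices, not values from the paper.

```python
import numpy as np

def wright_fisher_step(x, pop_size, mut_rate, prior):
    """One forward-noising generation of a Wright-Fisher model.

    x        : (K,) type frequencies, a point on the simplex
    pop_size : population size N; per the abstract, simplicial and
               Gaussian diffusion arise as large-N limits
    mut_rate : per-generation probability of mutating toward `prior`
    prior    : (K,) stationary categorical distribution
    """
    # Mutation deterministically pulls frequencies toward the prior...
    p = (1.0 - mut_rate) * x + mut_rate * prior
    # ...and multinomial resampling of N offspring adds genetic drift.
    counts = np.random.multinomial(pop_size, p)
    return counts / pop_size

# Example: noise a one-hot DNA base (A, C, G, T) for 100 generations;
# the state drifts from a corner of the simplex toward `prior`.
x = np.array([1.0, 0.0, 0.0, 0.0])
prior = np.full(4, 0.25)
for _ in range(100):
    x = wright_fisher_step(x, pop_size=1000, mut_rate=0.01, prior=prior)
print(x)
```

Intuitively, a small population keeps the state near the discrete corners of the simplex, while the large-population limits mentioned in the abstract arise as pop_size grows (with time suitably rescaled).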
Related papers
- Foundations of Diffusion Models in General State Spaces: A Self-Contained Introduction [54.95522167029998]
This article is a self-contained primer on diffusion over general state spaces. We develop the discrete-time view (forward noising via Markov kernels and learned reverse dynamics) alongside its continuous-time limits. A common variational treatment yields the ELBO that underpins standard training losses.
arXiv Detail & Related papers (2025-12-04T18:55:36Z)
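For reference, the discrete-time ELBO that this primer's summary refers to has the standard form below, written here for the familiar Gaussian case (as in Ho et al., 2020); the general-state-space version replaces the Gaussian kernels with arbitrary Markov kernels.

```latex
\log p_\theta(x_0) \;\ge\; \mathbb{E}_q\Big[
    \log p_\theta(x_0 \mid x_1)
    - \sum_{t=2}^{T} D_{\mathrm{KL}}\big( q(x_{t-1} \mid x_t, x_0) \,\big\|\, p_\theta(x_{t-1} \mid x_t) \big)
    - D_{\mathrm{KL}}\big( q(x_T \mid x_0) \,\big\|\, p(x_T) \big)
\Big]
```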
- Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion [44.51145995160038]
Markov processes evolve by discontinuous jumps at a fixed rate. Unlike other discrete diffusion models, masking diffusion builds in the known distribution of jump times and only learns where to jump to.
arXiv Detail & Related papers (2025-06-10T00:58:25Z)
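As a minimal illustration of the built-in jump schedule (assuming a linear masking schedule, under which each token's masking time is uniform on [0, 1]; the paper handles general schedules):

```python
import numpy as np

MASK = -1  # illustrative mask-token id

def forward_mask(tokens, t, rng):
    """Forward masking diffusion at time t in [0, 1]: each token's
    jump (masking) time comes from a known, fixed distribution, here
    uniform, so by time t a token is masked with probability t. Only
    the reverse model (what to unmask a position to) is learned."""
    jump_times = rng.uniform(size=tokens.shape)  # fixed, not learned
    return np.where(jump_times < t, MASK, tokens)

rng = np.random.default_rng(0)
print(forward_mask(np.array([5, 2, 9, 7]), t=0.5, rng=rng))
```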
- Resolving Memorization in Empirical Diffusion Model for Manifold Data in High-Dimensional Spaces [5.716752583983991]
When the data distribution consists of $n$ points, empirical diffusion models tend to reproduce existing data points. This work shows that the memorization issue can be solved simply by applying an inertia update at the end of the empirical diffusion simulation. We demonstrate that the distribution of samples from this model approximates the true data distribution on a $C^2$ manifold of dimension $d$, within a Wasserstein-1 distance of order $O(n^{-\frac{2}{d+4}})$.
arXiv Detail & Related papers (2025-05-05T09:40:41Z)
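The summary does not spell out the inertia update itself; one plausible reading, sketched below purely as an assumption, is a momentum-style extrapolation appended after the final reverse step.

```python
import numpy as np

def inertia_update(x, x_prev, gamma=0.5):
    """Hypothetical inertia step after the last reverse iteration:
    extrapolate along the final displacement so samples move off the
    memorized training points. Both the form of this update and the
    value of gamma are illustrative guesses, not the paper's rule."""
    return x + gamma * (x - x_prev)
```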
- Continuous Diffusion Model for Language Modeling [64.7425225935854]
Existing continuous diffusion models for discrete data underperform compared to discrete methods. We propose a continuous diffusion model for language modeling that incorporates the geometry of the underlying categorical distribution. Our method outperforms existing discrete diffusion models and approaches the performance of autoregressive models.
arXiv Detail & Related papers (2025-02-17T08:54:29Z)
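A common way to incorporate the geometry of the categorical distribution is the square-root map, under which the Fisher-Rao metric on the simplex becomes the round metric on the positive orthant of the unit sphere; whether this paper uses exactly that construction is an assumption of the sketch below.

```python
import numpy as np

def simplex_to_sphere(p):
    """Map a categorical distribution p on the simplex to the unit
    hypersphere via u = sqrt(p); since sum(p) = 1, ||u|| = 1."""
    return np.sqrt(p)

def sphere_to_simplex(u):
    """Inverse: squaring (and renormalizing for safety) returns to
    the simplex."""
    q = u ** 2
    return q / q.sum()

p = np.array([0.7, 0.2, 0.1])
u = simplex_to_sphere(p)
assert np.isclose(np.linalg.norm(u), 1.0)
```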
- Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining.
We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z)
- Renormalizing Diffusion Models [0.7252027234425334]
We use diffusion models to learn inverse renormalization group flows of statistical and quantum field theories.
Our work provides an interpretation of multiscale diffusion models, and gives physically-inspired suggestions for diffusion models which should have novel properties.
arXiv Detail & Related papers (2023-08-23T18:02:31Z)
- Lipschitz Singularities in Diffusion Models [64.28196620345808]
Diffusion models often exhibit unbounded Lipschitz constants of the network with respect to the time variable near the zero point. We propose a novel approach, dubbed E-TSDM, which alleviates the Lipschitz singularities of the diffusion model near the zero point. Our work may advance the understanding of the general diffusion process, and also provide insights for the design of diffusion models.
arXiv Detail & Related papers (2023-06-20T03:05:28Z)
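Per its abstract, E-TSDM shares the timestep condition within small sub-intervals near t = 0, so the input seen by the network is piecewise constant in time and its Lipschitz constant with respect to time stays bounded; the sketch below assumes that reading, with an illustrative cutoff and bin count.

```python
def shared_timestep(t, t_cut=0.1, n_bins=10):
    """Quantize t inside [0, t_cut) to the midpoint of one of n_bins
    equal sub-intervals; outside the cutoff, pass t through unchanged.
    t_cut and n_bins are illustrative values, not the paper's."""
    if t >= t_cut:
        return t
    width = t_cut / n_bins
    return (int(t / width) + 0.5) * width

print(shared_timestep(0.003), shared_timestep(0.5))  # -> 0.005 0.5
```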
- Diffusion Models are Minimax Optimal Distribution Estimators [49.47503258639454]
We provide the first rigorous analysis on approximation and generalization abilities of diffusion modeling.
We show that when the true density function belongs to a Besov space and the empirical score matching loss is properly minimized, the generated data distribution achieves the nearly minimax optimal estimation rates.
arXiv Detail & Related papers (2023-03-03T11:31:55Z)
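For context, the benchmark here is the classical nonparametric rate: for densities of Besov smoothness $s$ in $d$ dimensions, the minimax estimation error in total variation scales, up to logarithmic factors, as in the display below (a standard fact; the exact norms and assumptions in the paper may differ).

```latex
\inf_{\hat p_n} \;\sup_{p \,:\, \text{Besov smoothness } s} \;
\mathbb{E}\,\big\| \hat p_n - p \big\|_{\mathrm{TV}}
\;\asymp\; n^{-s/(2s+d)}
```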
- Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance [95.12230117950232]
We show that a common latent space emerges from two diffusion models trained independently on related domains.
Applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors.
arXiv Detail & Related papers (2022-10-11T15:53:52Z)