Categorical SDEs with Simplex Diffusion
- URL: http://arxiv.org/abs/2210.14784v1
- Date: Wed, 26 Oct 2022 15:27:43 GMT
- Title: Categorical SDEs with Simplex Diffusion
- Authors: Pierre H. Richemond, Sander Dieleman, Arnaud Doucet
- Abstract summary: This theoretical note proposes Simplex Diffusion, a means to directly diffuse datapoints located on an n-dimensional probability simplex.
We show how this relates to the Dirichlet distribution on the simplex and how the analogous SDE is realized thanks to a multi-dimensional Cox-Ingersoll-Ross process.
- Score: 25.488210663637265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models typically operate in the standard framework of generative
modelling by producing continuously-valued datapoints. To this end, they rely
on a progressive Gaussian smoothing of the original data distribution, which
admits an SDE interpretation involving increments of a standard Brownian
motion. However, some applications such as text generation or reinforcement
learning might naturally be better served by diffusing categorical-valued data,
i.e., lifting the diffusion to a space of probability distributions. To this
end, this short theoretical note proposes Simplex Diffusion, a means to
directly diffuse datapoints located on an n-dimensional probability simplex. We
show how this relates to the Dirichlet distribution on the simplex and how the
analogous SDE is realized thanks to a multi-dimensional Cox-Ingersoll-Ross
process (abbreviated as CIR), previously used in economics and mathematical
finance. Finally, we make remarks as to the numerical implementation of
trajectories of the CIR process, and discuss some limitations of our approach.
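As a concrete illustration of the CIR construction and its numerical implementation: a single CIR coordinate $dX_t = b(a - X_t)\,dt + \sigma\sqrt{X_t}\,dW_t$ has a Gamma stationary law with shape $2ab/\sigma^2$ and rate $2b/\sigma^2$, and normalizing independent Gamma variables sharing a common rate yields a Dirichlet sample, which is what places the diffused points on the simplex. The sketch below simulates such trajectories with a full-truncation Euler-Maruyama scheme and projects the terminal state onto the simplex. It is a minimal illustration under these standard conventions, not the paper's reference implementation; the parameter names (`a`, `b`, `sigma`) and the choice of discretization are ours.

```python
import numpy as np

def simulate_cir(x0, a, b, sigma, T=1.0, n_steps=1000, rng=None):
    """Full-truncation Euler-Maruyama for n independent CIR coordinates:
        dX^i_t = b * (a_i - X^i_t) dt + sigma * sqrt(X^i_t) dW^i_t
    The sqrt argument is clipped at 0 so discretized paths stay real-valued.
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    path = np.empty((n_steps + 1, x.size))
    path[0] = x
    for k in range(n_steps):
        x_pos = np.maximum(x, 0.0)  # full truncation: evaluate drift/diffusion at x+
        dw = rng.normal(scale=np.sqrt(dt), size=x.size)
        x = x + b * (a - x_pos) * dt + sigma * np.sqrt(x_pos) * dw
        path[k + 1] = x
    return np.maximum(path, 0.0)

# Each coordinate's stationary law is Gamma(shape 2*a_i*b/sigma**2, rate 2*b/sigma**2);
# with sigma**2 = 2*b the shapes reduce to a_i, so the normalized terminal state
# is approximately Dirichlet(a).
alpha = np.array([1.0, 2.0, 3.0])
path = simulate_cir(x0=alpha, a=alpha, b=1.0, sigma=np.sqrt(2.0), T=10.0, n_steps=5000)
simplex_point = path[-1] / path[-1].sum()
print(simplex_point)  # a point on the 2-simplex, approximately Dirichlet(alpha)
```

Exact sampling of CIR transitions through their noncentral chi-square law is a standard alternative that avoids the discretization bias of Euler schemes, which relates to the numerical remarks mentioned in the abstract.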
Related papers
- Identifying Drift, Diffusion, and Causal Structure from Temporal Snapshots [10.018568337210876]
We present the first comprehensive approach for jointly estimating the drift and diffusion of an SDE from its temporal marginals.
We show that each of these steps is optimal with respect to the Kullback-Leibler divergence.
arXiv Detail & Related papers (2024-10-30T06:28:21Z) - A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models [45.60426164657739]
We develop non-asymptotic convergence theory for a diffusion-based sampler.
We prove that $d/\varepsilon$ iterations are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance.
Our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes.
arXiv Detail & Related papers (2024-08-05T09:02:24Z) - On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z) - Convergence Analysis of Discrete Diffusion Model: Exact Implementation
through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous-time Markov chains, implementing transitions at random time points (a minimal sketch of the classical uniformization sampler appears after this list).
Our results align with state-of-the-art achievements for diffusion models in $\mathbb{R}^d$ and further underscore the advantages of discrete diffusion models in comparison to the $\mathbb{R}^d$ setting.
arXiv Detail & Related papers (2024-02-12T22:26:52Z) - Diffusion on the Probability Simplex [24.115365081118604]
Diffusion models learn to reverse the progressive noising of a data distribution to create a generative model.
We propose a method of performing diffusion on the probability simplex.
We find that our methodology naturally extends to diffusion on the unit cube, which has applications to bounded image generation.
arXiv Detail & Related papers (2023-09-05T18:52:35Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected stochastic differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - GANs with Conditional Independence Graphs: On Subadditivity of
Probability Divergences [70.30467057209405]
Generative Adversarial Networks (GANs) are modern methods to learn the underlying distribution of a data set.
GANs are designed in a model-free fashion where no additional information about the underlying distribution is available.
We propose a principled design of a model-based GAN that uses a set of simple discriminators on the neighborhoods of the Bayes-net/MRF.
arXiv Detail & Related papers (2020-03-02T04:31:22Z) - Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs).
In addition, such a transform can be straightforwardly employed in gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) tests on discrete distributions.
arXiv Detail & Related papers (2020-03-01T22:45:41Z)
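For the uniformization entry above, the following is a compact sketch of the classical uniformization sampler for a finite-state continuous-time Markov chain; the generator name `Q` and the function itself are illustrative, not the cited paper's code.

```python
import numpy as np

def uniformized_ctmc_sample(Q, x0, T, rng=None):
    """Simulate a finite-state CTMC with generator matrix Q over [0, T] via
    uniformization: draw N ~ Poisson(lam * T) candidate jump times and apply
    the uniformized transition matrix P = I + Q / lam at each of them.
    Returns the state at time T.
    """
    rng = np.random.default_rng() if rng is None else rng
    Q = np.asarray(Q, dtype=float)
    lam = np.max(-np.diag(Q))              # uniformization rate >= every exit rate
    P = np.eye(Q.shape[0]) + Q / lam       # a proper stochastic matrix
    state = x0
    for _ in range(rng.poisson(lam * T)):  # candidate jumps; some are self-loops
        state = rng.choice(Q.shape[0], p=P[state])
    return state
```

Because `P` may contain self-loops, the Poisson-distributed candidate jumps over-count real transitions, yet the terminal state is exactly distributed as the chain at time `T`; this exactness (no time-discretization error) is what "Exact Implementation" refers to.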
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.