Categorical SDEs with Simplex Diffusion
- URL: http://arxiv.org/abs/2210.14784v1
- Date: Wed, 26 Oct 2022 15:27:43 GMT
- Title: Categorical SDEs with Simplex Diffusion
- Authors: Pierre H. Richemond, Sander Dieleman, Arnaud Doucet
- Abstract summary: This theoretical note proposes Simplex Diffusion, a means to directly diffuse datapoints located on an n-dimensional probability simplex.
We show how this relates to the Dirichlet distribution on the simplex and how the analogous SDE is realized thanks to a multi-dimensional Cox-Ingersoll-Ross process.
- Score: 25.488210663637265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models typically operate in the standard framework of generative
modelling by producing continuously-valued datapoints. To this end, they rely
on a progressive Gaussian smoothing of the original data distribution, which
admits an SDE interpretation involving increments of a standard Brownian
motion. However, some applications such as text generation or reinforcement
learning might naturally be better served by diffusing categorical-valued data,
i.e., lifting the diffusion to a space of probability distributions. To this
end, this short theoretical note proposes Simplex Diffusion, a means to
directly diffuse datapoints located on an n-dimensional probability simplex. We
show how this relates to the Dirichlet distribution on the simplex and how the
analogous SDE is realized thanks to a multi-dimensional Cox-Ingersoll-Ross
process (abbreviated as CIR), previously used in economics and mathematical
finance. Finally, we make remarks as to the numerical implementation of
trajectories of the CIR process, and discuss some limitations of our approach.
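As a rough illustration of the construction described in the abstract, the snippet below runs n independent Cox-Ingersoll-Ross coordinates with a full-truncation Euler-Maruyama scheme and normalises the state onto the probability simplex. This is a minimal sketch under assumed parameter names (a, b, sigma) and an assumed discretisation; the note itself makes separate remarks on how CIR trajectories should be implemented numerically.

```python
import numpy as np

def simulate_cir_on_simplex(x0, a, b, sigma, T=1.0, n_steps=1000, seed=0):
    """Run n independent CIR coordinates,
        dX_t = b * (a - X_t) dt + sigma * sqrt(X_t) dW_t,
    with a full-truncation Euler-Maruyama scheme, then normalise onto the
    probability simplex. Illustrative sketch only: the parameterisation and
    the discretisation scheme are assumptions, not the authors' reference
    implementation.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt), size=x.shape)
        x_pos = np.maximum(x, 0.0)  # full truncation keeps sqrt well defined
        x = x + b * (a - x_pos) * dt + sigma * np.sqrt(x_pos) * dw
    x = np.maximum(x, 0.0)
    # At stationarity each CIR coordinate is Gamma-distributed, so the
    # normalised vector is Dirichlet-distributed: this is the link to the
    # Dirichlet distribution mentioned in the abstract.
    return x / x.sum()

# Usage: diffuse a one-hot (categorical) datapoint on the 4-dimensional simplex.
print(simulate_cir_on_simplex(x0=[1.0, 0.0, 0.0, 0.0], a=0.5, b=2.0, sigma=1.0))
```

The Euler scheme above is only for illustration; exact simulation of CIR transitions (via a scaled non-central chi-squared distribution) is also possible and may be preferable in practice.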
Related papers
- On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z) - Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate [49.97755400231656]
We establish convergence guarantees for substantially larger classes of distributions under DT diffusion processes.
We then specialize our results to a number of interesting classes of distributions with explicit parameter dependencies.
We propose a novel accelerated sampler and show that it improves the convergence rates of the corresponding regular sampler by orders of magnitude with respect to all system parameters.
arXiv Detail & Related papers (2024-02-21T16:11:47Z) - Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous-time Markov chains, implementing transitions at random time points (a generic sketch of the uniformization construction appears after this list).
Our results align with state-of-the-art achievements for diffusion models in $\mathbb{R}^d$ and further underscore the advantages of discrete diffusion models in comparison to the $\mathbb{R}^d$ setting.
arXiv Detail & Related papers (2024-02-12T22:26:52Z) - Diffusion on the Probability Simplex [24.115365081118604]
Diffusion models learn to reverse the progressive noising of a data distribution to create a generative model.
We propose a method of performing diffusion on the probability simplex.
We find that our methodology naturally extends to include diffusion on the unit cube which has applications for bounded image generation.
arXiv Detail & Related papers (2023-09-05T18:52:35Z) - Score-based Generative Modeling Through Backward Stochastic Differential Equations: Inversion and Generation [6.2255027793924285]
The proposed BSDE-based diffusion model represents a novel approach to diffusion modeling, which extends the application of stochastic differential equations (SDEs) in machine learning.
We demonstrate the theoretical guarantees of the model, the benefits of using Lipschitz networks for score matching, and its potential applications in various areas such as diffusion inversion, conditional diffusion, and uncertainty quantification.
arXiv Detail & Related papers (2023-04-26T01:15:35Z) - Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected stochastic differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - GANs with Conditional Independence Graphs: On Subadditivity of Probability Divergences [70.30467057209405]
Generative Adversarial Networks (GANs) are modern methods to learn the underlying distribution of a data set.
GANs are designed in a model-free fashion where no additional information about the underlying distribution is available.
We propose a principled design of a model-based GAN that uses a set of simple discriminators on the neighborhoods of the Bayes-net/MRF.
arXiv Detail & Related papers (2020-03-02T04:31:22Z) - Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs).
In addition, such a transform can be straightforwardly employed in gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) tests on discrete distributions.
arXiv Detail & Related papers (2020-03-01T22:45:41Z)
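For the uniformization-based entry above ("Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization"), the following sketch illustrates the generic uniformization construction for sampling a continuous-time Markov chain; the function name and interface are assumptions for illustration and do not reproduce that paper's algorithm.

```python
import numpy as np

def sample_ctmc_uniformization(q_matrix, x0, horizon, rng=None):
    """Sample the state at time `horizon` of a continuous-time Markov chain
    with generator matrix Q via uniformization: draw a Poisson(lam * horizon)
    number of candidate jump times with lam >= max_i |Q_ii|, and at each
    candidate time move according to P = I + Q / lam (self-loops allowed).

    Generic illustrative sketch; not the cited paper's algorithm.
    """
    rng = np.random.default_rng() if rng is None else rng
    q = np.asarray(q_matrix, dtype=float)
    lam = max(float((-np.diag(q)).max()), 1e-12)   # uniformization rate
    p = np.eye(q.shape[0]) + q / lam               # embedded transition kernel
    state = int(x0)
    for _ in range(rng.poisson(lam * horizon)):
        state = rng.choice(q.shape[0], p=p[state])
    return state

# Usage: a 3-state chain started in state 0.
Q = np.array([[-1.0, 0.5, 0.5],
              [0.2, -0.4, 0.2],
              [0.3, 0.3, -0.6]])
print(sample_ctmc_uniformization(Q, x0=0, horizon=2.0))
```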