Unified Discrete Diffusion for Categorical Data
- URL: http://arxiv.org/abs/2402.03701v2
- Date: Mon, 12 Aug 2024 16:22:38 GMT
- Title: Unified Discrete Diffusion for Categorical Data
- Authors: Lingxiao Zhao, Xueying Ding, Lijun Yu, Leman Akoglu,
- Abstract summary: We present a series of mathematical simplifications of the variational lower bound that enable more accurate and easy-to-optimize training for discrete diffusion.
We derive a simple formulation for backward denoising that enables exact and accelerated sampling, and importantly, an elegant unification of discrete-time and continuous-time discrete diffusion.
- Score: 37.56355078250024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discrete diffusion models have seen a surge of attention with applications on naturally discrete data such as language and graphs. Although discrete-time discrete diffusion has been established for a while, only recently Campbell et al. (2022) introduced the first framework for continuous-time discrete diffusion. However, their training and sampling processes differ significantly from the discrete-time version, necessitating nontrivial approximations for tractability. In this paper, we first present a series of mathematical simplifications of the variational lower bound that enable more accurate and easy-to-optimize training for discrete diffusion. In addition, we derive a simple formulation for backward denoising that enables exact and accelerated sampling, and importantly, an elegant unification of discrete-time and continuous-time discrete diffusion. Thanks to simpler analytical formulations, both forward and now also backward probabilities can flexibly accommodate any noise distribution, including different noise distributions for multi-element objects. Experiments show that our proposed USD3 (for Unified Simplified Discrete Denoising Diffusion) outperform all SOTA baselines on established datasets. We open-source our unified code at https://github.com/LingxiaoShawn/USD3.
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Multi-scale Diffusion Denoised Smoothing [79.95360025953931]
randomized smoothing has become one of a few tangible approaches that offers adversarial robustness to models at scale.
We present scalable methods to address the current trade-off between certified robustness and accuracy in denoised smoothing.
Our experiments show that the proposed multi-scale smoothing scheme combined with diffusion fine-tuning enables strong certified robustness available with high noise level.
arXiv Detail & Related papers (2023-10-25T17:11:21Z) - Diffusion on the Probability Simplex [24.115365081118604]
Diffusion models learn to reverse the progressive noising of a data distribution to create a generative model.
We propose a method of performing diffusion on the probability simplex.
We find that our methodology naturally extends to include diffusion on the unit cube which has applications for bounded image generation.
arXiv Detail & Related papers (2023-09-05T18:52:35Z) - Bayesian Flow Networks [4.585102332532472]
This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference.
Starting from a simple prior and iteratively updating the two distributions yields a generative procedure similar to the reverse process of diffusion models.
BFNs achieve competitive log-likelihoods for image modelling on dynamically binarized MNIST and CIFAR-10, and outperform all known discrete diffusion models on the text8 character-level language modelling task.
arXiv Detail & Related papers (2023-08-14T09:56:35Z) - Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces [0.0]
We develop a theoretical formulation for arbitrary discrete-state Markov processes in the forward diffusion process.
As an example, we introduce Blackout Diffusion'', which learns to produce samples from an empty image instead of from noise.
arXiv Detail & Related papers (2023-05-18T16:24:12Z) - Where to Diffuse, How to Diffuse, and How to Get Back: Automated
Learning for Multivariate Diffusions [22.04182099405728]
Diffusion-based generative models (DBGMs) perturb data to a target noise distribution and reverse this inference diffusion process to generate samples.
We show how to maximize a lower-bound on the likelihood for any number of auxiliary variables.
We then demonstrate how to parameterize the diffusion for a specified target noise distribution.
arXiv Detail & Related papers (2023-02-14T18:57:04Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained via simple matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Diffusion-GAN: Training GANs with Diffusion [135.24433011977874]
Generative adversarial networks (GANs) are challenging to train stably.
We propose Diffusion-GAN, a novel GAN framework that leverages a forward diffusion chain to generate instance noise.
We show that Diffusion-GAN can produce more realistic images with higher stability and data efficiency than state-of-the-art GANs.
arXiv Detail & Related papers (2022-06-05T20:45:01Z) - Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial
Auto-Encoders [137.1060633388405]
Diffusion-based generative models learn how to generate the data by inferring a reverse diffusion chain.
We propose a faster and cheaper approach that adds noise not until the data become pure random noise.
We show that the proposed model can be cast as an adversarial auto-encoder empowered by both the diffusion process and a learnable implicit prior.
arXiv Detail & Related papers (2022-02-19T20:18:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.