Variational Schrödinger Diffusion Models
- URL: http://arxiv.org/abs/2405.04795v4
- Date: Fri, 25 Oct 2024 10:35:33 GMT
- Title: Variational Schrödinger Diffusion Models
- Authors: Wei Deng, Weijian Luo, Yixin Tan, Marin Biloš, Yu Chen, Yuriy Nevmyvaka, Ricky T. Q. Chen,
- Abstract summary: Schr"odinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models.
We leverage variational inference to linearize the forward score functions (variational scores) of SB.
We propose the variational Schr"odinger diffusion model (VSDM), where the forward process is a multivariate diffusion and the variational scores are adaptively optimized for efficient transport.
- Score: 14.480273869571468
- License:
- Abstract: Schr\"odinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating the intractable forward score functions, inevitably resulting in the costly implicit training loss based on simulated trajectories. To improve the scalability while preserving efficient transportation plans, we leverage variational inference to linearize the forward score functions (variational scores) of SB and restore simulation-free properties in training backward scores. We propose the variational Schr\"odinger diffusion model (VSDM), where the forward process is a multivariate diffusion and the variational scores are adaptively optimized for efficient transport. Theoretically, we use stochastic approximation to prove the convergence of the variational scores and show the convergence of the adaptively generated samples based on the optimal variational scores. Empirically, we test the algorithm in simulated examples and observe that VSDM is efficient in generations of anisotropic shapes and yields straighter sample trajectories compared to the single-variate diffusion. We also verify the scalability of the algorithm in real-world data and achieve competitive unconditional generation performance in CIFAR10 and conditional generation in time series modeling. Notably, VSDM no longer depends on warm-up initializations and has become tuning-friendly in training large-scale experiments.
Related papers
- Variational Schrödinger Momentum Diffusion [15.074672636555755]
We introduce variational Schr"odinger momentum diffusion (VSMD) to eliminate the dependence on simulated forward trajectories.
Our approach scales effectively to real-world data, achieving competitive results in time series and image generation.
arXiv Detail & Related papers (2025-01-28T03:19:58Z) - Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3$times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address these issues.
We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution.
Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
arXiv Detail & Related papers (2024-08-13T08:09:05Z) - On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z) - Uncertainty-aware Surrogate Models for Airfoil Flow Simulations with Denoising Diffusion Probabilistic Models [26.178192913986344]
We make a first attempt to use denoising diffusion probabilistic models (DDPMs) to train an uncertainty-aware surrogate model for turbulence simulations.
Our results show DDPMs can successfully capture the whole distribution of solutions and, as a consequence, accurately estimate the uncertainty of the simulations.
We also evaluate an emerging generative modeling variant, flow matching, in comparison to regular diffusion models.
arXiv Detail & Related papers (2023-12-08T19:04:17Z) - Neural Diffusion Models [2.1779479916071067]
We present a generalization of conventional diffusion models that enables defining and learning time-dependent non-linear transformations of data.
NDMs outperform conventional diffusion models in terms of likelihood and produce high-quality samples.
arXiv Detail & Related papers (2023-10-12T13:54:55Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - Diffusion Glancing Transformer for Parallel Sequence to Sequence
Learning [52.72369034247396]
We propose the diffusion glancing transformer, which employs a modality diffusion process and residual glancing sampling.
DIFFGLAT achieves better generation accuracy while maintaining fast decoding speed compared with both autoregressive and non-autoregressive models.
arXiv Detail & Related papers (2022-12-20T13:36:25Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained via simple matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Comparing Probability Distributions with Conditional Transport [63.11403041984197]
We propose conditional transport (CT) as a new divergence and approximate it with the amortized CT (ACT) cost.
ACT amortizes the computation of its conditional transport plans and comes with unbiased sample gradients that are straightforward to compute.
On a wide variety of benchmark datasets generative modeling, substituting the default statistical distance of an existing generative adversarial network with ACT is shown to consistently improve the performance.
arXiv Detail & Related papers (2020-12-28T05:14:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.