Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport
- URL: http://arxiv.org/abs/2402.05443v3
- Date: Mon, 3 Jun 2024 08:12:13 GMT
- Title: Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport
- Authors: Jaemoo Choi, Jaewoong Choi, Myungjoo Kang
- Abstract summary: We introduce a scalable WGF-based generative model, called Semi-dual JKO (S-JKO).
Our model is based on the semi-dual form of the JKO step, derived from the equivalence between the JKO step and Unbalanced Optimal Transport.
We demonstrate that our model significantly outperforms existing WGF-based generative models.
- Score: 8.880526853373357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wasserstein Gradient Flow (WGF) describes the gradient dynamics of probability densities within the Wasserstein space, and provides a promising approach for optimization over probability distributions. Numerically approximating the continuous WGF requires a time discretization method, the best known of which is the JKO scheme. Accordingly, previous WGF models employ the JKO scheme and parametrize a transport map for each JKO step. However, this approach incurs quadratic training complexity $O(K^2)$ in the number of JKO steps $K$, which severely limits the scalability of WGF models. In this paper, we introduce a scalable WGF-based generative model, called Semi-dual JKO (S-JKO). Our model is based on the semi-dual form of the JKO step, derived from the equivalence between the JKO step and Unbalanced Optimal Transport, and reduces the training complexity to $O(K)$. We demonstrate that our model significantly outperforms existing WGF-based generative models, achieving FID scores of 2.62 on CIFAR-10 and 5.46 on CelebA-HQ-256, comparable to state-of-the-art image generative models.
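For reference, the JKO scheme mentioned in the abstract discretizes the gradient flow of a functional $F$ into proximal steps in the Wasserstein metric; a standard statement (notation ours, not taken from the paper) is:

```latex
% One JKO step with step size h > 0: the next iterate is the
% Wasserstein-2 proximal point of the functional F at \rho_k.
\rho_{k+1} \;=\; \operatorname*{arg\,min}_{\rho} \; F(\rho) \;+\; \frac{1}{2h}\, W_2^2(\rho, \rho_k)
```

The quadratic cost of earlier models arises because, roughly, sampling from the $k$-th intermediate measure requires composing the first $k$ learned maps, so training $K$ steps costs $O(K^2)$ in total; the semi-dual UOT reformulation is what lets S-JKO train in $O(K)$ instead.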
Related papers
- Closed-form Filtering for Non-linear Systems [83.91296397912218]
We propose a new class of filters based on Gaussian PSD Models, which offer several advantages in terms of density approximation and computational efficiency.
We show that filtering can be efficiently performed in closed form when transitions and observations are Gaussian PSD Models.
Our proposed estimator enjoys strong theoretical guarantees, with estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities.
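For context, the filtering problem referenced here is the standard Bayesian recursion of predict and update steps; in generic notation (ours, not the paper's):

```latex
% Predict: propagate the current posterior through the transition model.
p(x_t \mid y_{1:t-1}) \;=\; \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, \mathrm{d}x_{t-1}
% Update: condition the prediction on the new observation y_t.
p(x_t \mid y_{1:t}) \;\propto\; p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})
```

The claim above is that when the transition and observation densities are Gaussian PSD models, both steps remain within the model class and can be carried out exactly.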
arXiv Detail & Related papers (2024-02-15T08:51:49Z) - Convergence of flow-based generative models via proximal gradient descent in Wasserstein space [20.771897445580723]
Flow-based generative models enjoy certain advantages in both data generation and likelihood computation.
We provide a theoretical guarantee of generating data distribution by a progressive flow model.
arXiv Detail & Related papers (2023-10-26T17:06:23Z) - Generative Modelling of Lévy Area for High Order SDE Simulation [5.9535699822923]
LévyGAN is a deep-learning model for generating approximate samples of Lévy area conditional on a Brownian increment.
We show that LévyGAN exhibits state-of-the-art performance across several metrics which measure both the joint and marginal distributions.
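For readers unfamiliar with the term, the Lévy area of a two-dimensional Brownian motion $(W^1, W^2)$ over $[s, t]$ is the following standard quantity (definition from the SDE literature, not from this abstract):

```latex
A_{s,t} \;=\; \frac{1}{2} \int_s^t \left( W^1_u - W^1_s \right) \mathrm{d}W^2_u
       \;-\; \frac{1}{2} \int_s^t \left( W^2_u - W^2_s \right) \mathrm{d}W^1_u
```

High-order SDE solvers require joint samples of Brownian increments and their Lévy areas, which is the sampling problem LévyGAN addresses.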
arXiv Detail & Related papers (2023-08-04T16:38:37Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport [9.980822222343921]
We propose a novel generative model based on the semi-dual formulation of Unbalanced Optimal Transport (UOT).
Unlike OT, UOT relaxes the hard constraint on distribution matching. This approach provides better robustness against outliers, stability during training, and faster convergence.
Our model outperforms existing OT-based generative models, achieving FID scores of 2.97 on CIFAR-10 and 6.36 on CelebA-HQ-256.
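As a reminder of what relaxing the marginal constraints means, a common formulation of UOT penalizes marginal deviations with divergences instead of enforcing them exactly (generic notation from the OT literature; $\pi_0, \pi_1$ denote the marginals of $\pi$):

```latex
\mathrm{UOT}(\mu, \nu) \;=\; \inf_{\pi \geq 0} \int c(x, y)\, \mathrm{d}\pi(x, y)
  \;+\; \tau_1\, D_{\psi_1}(\pi_0 \,\|\, \mu) \;+\; \tau_2\, D_{\psi_2}(\pi_1 \,\|\, \nu)
```

Letting $\tau_1, \tau_2 \to \infty$ recovers balanced OT; finite penalties are what yield the robustness to outliers noted above.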
arXiv Detail & Related papers (2023-05-24T06:31:05Z) - Wasserstein Distributional Learning [5.830831796910439]
Wasserstein Distributional Learning (WDL) is a flexible density-on-scalar regression modeling framework.
We show that WDL better characterizes and uncovers the nonlinear dependence of the conditional densities.
We demonstrate the effectiveness of the WDL framework through simulations and real-world applications.
arXiv Detail & Related papers (2022-09-12T02:32:17Z) - Sliced-Wasserstein Gradient Flows [15.048733056992855]
Minimizing functionals in the space of probability distributions can be done with Wasserstein gradient flows.
This work proposes to use gradient flows in the space of probability measures endowed with the sliced-Wasserstein distance.
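A Monte Carlo estimator of the sliced-Wasserstein distance between two sample sets is simple enough to sketch; the snippet below is illustrative only (function name and defaults are ours) and assumes both arrays hold the same number of points:

```python
import numpy as np

def sliced_wasserstein_2(x, y, n_proj=128, seed=0):
    """Monte Carlo estimate of SW_2 between empirical measures given as (n, d) arrays."""
    rng = np.random.default_rng(seed)
    # Draw random projection directions uniformly on the unit sphere S^{d-1}.
    thetas = rng.normal(size=(n_proj, x.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    px, py = x @ thetas.T, y @ thetas.T        # 1-D projections, shape (n, n_proj)
    px, py = np.sort(px, axis=0), np.sort(py, axis=0)
    # In 1-D, W_2 between equal-size empirical measures pairs sorted samples.
    return np.sqrt(np.mean((px - py) ** 2))
```

Averaging one-dimensional transport costs over random directions avoids the expensive $d$-dimensional OT solve, which is the computational appeal of the sliced distance.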
arXiv Detail & Related papers (2021-10-21T08:34:26Z) - Large-Scale Wasserstein Gradient Flows [84.73670288608025]
We introduce a scalable scheme to approximate Wasserstein gradient flows.
Our approach relies on input convex neural networks (ICNNs) to discretize the JKO steps.
As a result, we can sample from the measure at each step of the gradient diffusion and compute its density.
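An input convex neural network in the sense of Amos et al. can be sketched in a few lines; the following is a generic PyTorch illustration (layer sizes and names are ours), not the architecture from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """f(x) is convex in x: the z-path uses non-negative weights and
    convex, non-decreasing activations (softplus)."""
    def __init__(self, dim, hidden=64, depth=3):
        super().__init__()
        self.Wx = nn.ModuleList(nn.Linear(dim, hidden) for _ in range(depth))
        self.Wz = nn.ModuleList(nn.Linear(hidden, hidden, bias=False) for _ in range(depth - 1))
        self.out = nn.Linear(hidden, 1, bias=False)

    def forward(self, x):
        z = F.softplus(self.Wx[0](x))
        for wx, wz in zip(self.Wx[1:], self.Wz):
            # Clamping enforces non-negativity of the z-path weights.
            z = F.softplus(wx(x) + F.linear(z, wz.weight.clamp(min=0)))
        return F.linear(z, self.out.weight.clamp(min=0))
```

By Brenier's theorem, the gradient of a convex potential is an optimal transport map for the quadratic cost, which is why parametrizing each JKO step through an ICNN is natural.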
arXiv Detail & Related papers (2021-06-01T19:21:48Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum-Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
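For reference, the quantity being estimated is the standard ELBO for an unnormalized target $\tilde{p}$ with partition function $Z$ (generic notation, not the paper's):

```latex
\log Z \;\geq\; \mathrm{ELBO}(q) \;=\; \mathbb{E}_{x \sim q}\!\left[ \log \tilde{p}(x) \right] \;+\; H(q)
```

Sampling-based methods approximate the expectation with draws from $q$; the point above is that for a tractable variational family such as selective-SPNs, the expectation and the entropy $H(q)$ can be evaluated exactly instead.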
arXiv Detail & Related papers (2020-10-22T05:04:38Z) - Denoising Diffusion Probabilistic Models [91.94962645056896]
We present high quality image synthesis results using diffusion probabilistic models.
Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics.
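The weighted variational bound in question reduces, in the form popularized by this paper, to a simple denoising objective: a network $\epsilon_\theta$ is trained to predict the noise added to a clean sample $x_0$ under the cumulative noise schedule $\bar{\alpha}_t$:

```latex
L_{\mathrm{simple}}(\theta) \;=\; \mathbb{E}_{t,\, x_0,\, \epsilon \sim \mathcal{N}(0, I)}
  \left[ \left\| \epsilon \;-\; \epsilon_\theta\!\left( \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\,\epsilon,\; t \right) \right\|^2 \right]
```

Matching the predicted noise is, up to scaling, denoising score matching, which is the connection to Langevin dynamics mentioned above.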
arXiv Detail & Related papers (2020-06-19T17:24:44Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, the Wasserstein gradient flow is a smoother and near-optimal numerical scheme for approximating real data densities.
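The underlying correspondence, standard since the original JKO paper, is that the Fokker-Planck equation for an energy $E$ is the $W_2$ gradient flow of the relative entropy against the Gibbs measure (generic notation, not the paper's):

```latex
\partial_t \rho \;=\; \nabla \cdot \left( \rho\, \nabla E \right) + \Delta \rho
\quad \Longleftrightarrow \quad
\rho_t \text{ is the } W_2 \text{ gradient flow of } \operatorname{KL}\!\left( \rho \;\middle\|\; e^{-E} / Z \right)
```

An EBM learns the energy $E$ from data, and the proposed scheme builds on this correspondence.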
arXiv Detail & Related papers (2019-10-31T02:26:20Z)