On the Existence of Optimal Transport Gradient for Learning Generative
Models
- URL: http://arxiv.org/abs/2102.05542v1
- Date: Wed, 10 Feb 2021 16:28:20 GMT
- Title: On the Existence of Optimal Transport Gradient for Learning Generative
Models
- Authors: Antoine Houdard and Arthur Leclaire and Nicolas Papadakis and Julien
Rabin
- Abstract summary: Training of Wasserstein Generative Adversarial Networks (WGAN) relies on the calculation of the gradient of the optimal transport cost.
We first demonstrate that such a gradient may not be defined, which can result in numerical instabilities during gradient-based optimization.
By exploiting the discrete nature of empirical data, we formulate the gradient in a semi-discrete setting and propose an algorithm for the optimization of the generative model parameters.
- Score: 8.602553195689513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of optimal transport cost for learning generative models has become
popular with Wasserstein Generative Adversarial Networks (WGAN). Training of
WGAN relies on a theoretical background: the calculation of the gradient of the
optimal transport cost with respect to the generative model parameters. We
first demonstrate that such a gradient may not be defined, which can result in
numerical instabilities during gradient-based optimization. We address this
issue by stating a valid differentiation theorem in the case of entropic
regularized transport and specify conditions under which existence is ensured.
By exploiting the discrete nature of empirical data, we formulate the gradient
in a semi-discrete setting and propose an algorithm for the optimization of the
generative model parameters. Finally, we illustrate numerically the advantage
of the proposed framework.
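To make the semi-discrete entropic setting concrete, here is a minimal PyTorch sketch (illustrative, not the authors' released code): the entropic OT cost between generated samples and an empirical measure is computed with log-domain Sinkhorn iterations and differentiated through the generator. The generator architecture, batch sizes, and hyperparameters below are placeholders.

```python
# Illustrative sketch (assumed setup, not the paper's implementation):
# differentiate an entropic OT cost w.r.t. generator parameters.
import math
import torch

def sinkhorn_cost(x, y, eps=0.1, n_iters=100):
    """Entropic OT cost between two uniform empirical measures.

    x: (n, d) generated samples, y: (m, d) data samples.
    Log-domain Sinkhorn; the returned cost is differentiable in x.
    """
    n, m = x.shape[0], y.shape[0]
    C = torch.cdist(x, y, p=2) ** 2                      # squared Euclidean cost
    log_a = torch.full((n,), -math.log(n))               # uniform source weights
    log_b = torch.full((m,), -math.log(m))               # uniform target weights
    f, g = torch.zeros(n), torch.zeros(m)                # dual potentials
    for _ in range(n_iters):                             # Sinkhorn fixed-point updates
        f = -eps * torch.logsumexp((g[None, :] - C) / eps + log_b[None, :], dim=1)
        g = -eps * torch.logsumexp((f[:, None] - C) / eps + log_a[:, None], dim=0)
    log_P = (f[:, None] + g[None, :] - C) / eps + log_a[:, None] + log_b[None, :]
    return (log_P.exp() * C).sum()                       # <P, C>

# Hypothetical usage: one gradient step on a toy generator.
generator = torch.nn.Linear(16, 2)                       # stand-in generative model
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
z = torch.randn(64, 16)                                  # latent codes
y = torch.randn(128, 2)                                  # empirical data batch
opt.zero_grad()
loss = sinkhorn_cost(generator(z), y)
loss.backward()                                          # gradient w.r.t. generator parameters
opt.step()
```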
Related papers
- Dynamical Measure Transport and Neural PDE Solvers for Sampling [77.38204731939273]
We tackle the task of sampling from a probability density as transporting a tractable density function to the target.
We employ physics-informed neural networks (PINNs) to approximate the respective partial differential equations (PDEs) solutions.
PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently.
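As a rough illustration of the PINN ingredient, the sketch below (a toy setup, not the paper's solver) trains a network so that a PDE residual, here the 1D heat equation, vanishes at random collocation points, with no simulation or mesh:

```python
# Toy PINN sketch (illustrative): minimize the residual of u_t = u_xx
# at random collocation points using autograd derivatives.
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def pde_residual(t, x):
    t = t.requires_grad_(True)
    x = x.requires_grad_(True)
    u = net(torch.stack([t, x], dim=-1)).squeeze(-1)
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    t = torch.rand(256)                       # collocation times in [0, 1]
    x = torch.rand(256) * 2 - 1               # collocation positions in [-1, 1]
    loss = pde_residual(t, x).pow(2).mean()   # boundary/initial terms omitted here
    opt.zero_grad()
    loss.backward()
    opt.step()
```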
arXiv Detail & Related papers (2024-07-10T17:39:50Z)
- Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models [54.132297393662654]
We introduce a hybrid method that fine-tunes cutting-edge diffusion models by optimizing reward models through RL.
We demonstrate the capability of our approach to outperform the best designs in offline data, leveraging the extrapolation capabilities of reward models.
arXiv Detail & Related papers (2024-05-30T03:57:29Z)
- Flow-based Distributionally Robust Optimization [23.232731771848883]
We present a framework, called $\texttt{FlowDRO}$, for solving flow-based distributionally robust optimization (DRO) problems with Wasserstein uncertainty sets.
We aim to find the continuous worst-case distribution (also called the Least Favorable Distribution, LFD) and to sample from it.
We demonstrate its usage in adversarial learning, distributionally robust hypothesis testing, and a new mechanism for data-driven distribution perturbation differential privacy.
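A hedged reading of the inner DRO problem, with all networks and losses below being placeholders rather than the paper's construction: train a transport network to increase a model's loss while paying a squared-distance transport penalty, approximating a least favorable distribution.

```python
# Hypothetical sketch of a transport attack on a loss (not FlowDRO itself):
# maximize risk minus a Wasserstein-style displacement penalty.
import torch

T = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
risk = lambda x: -(x ** 2).sum(dim=1)              # placeholder loss to attack

def lfd_objective(x, gamma=1.0):
    x_adv = T(x)                                   # transported (worst-case) samples
    cost = ((x_adv - x) ** 2).sum(dim=1)           # squared-distance ground cost
    return -(risk(x_adv) - gamma * cost).mean()    # minimize the negative = maximize

opt = torch.optim.Adam(T.parameters(), lr=1e-3)
x = torch.randn(128, 2)                            # samples from the nominal distribution
opt.zero_grad()
lfd_objective(x).backward()
opt.step()
```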
arXiv Detail & Related papers (2023-10-30T03:53:31Z)
- A transport approach to sequential simulation-based inference [0.0]
We present a new transport-based approach to efficiently perform sequential Bayesian inference of static model parameters.
The strategy is based on the extraction of the conditional distribution from the joint distribution of parameters and data, via the estimation of structured (e.g., block triangular) transport maps.
This allows gradient-based characterization of the posterior density via transport maps in a model-free, online phase.
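The block-triangular structure can be sketched as follows (hypothetical networks, not the paper's estimator): the first block of the map produces y from z_y alone, so fixing y and resampling the remaining reference variable yields conditional samples of the parameters.

```python
# Sketch of a block-triangular map (z_y, z_t) -> (y, theta); illustrative only.
import torch

T_y = torch.nn.Linear(1, 1)                        # y depends on z_y only
T_theta = torch.nn.Sequential(                     # theta depends on (z_y, z_t)
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1),
)

def sample_joint(n):
    z_y, z_t = torch.randn(n, 1), torch.randn(n, 1)
    return T_y(z_y), T_theta(torch.cat([z_y, z_t], dim=1))

def sample_conditional(y_obs, n):
    """Sample theta | y = y_obs: invert the first (here linear) block, resample z_t."""
    z_y = (y_obs - T_y.bias) / T_y.weight
    z_t = torch.randn(n, 1)
    return T_theta(torch.cat([z_y.expand(n, 1), z_t], dim=1))

# e.g. sample_conditional(torch.tensor([[0.5]]), n=100)
```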
arXiv Detail & Related papers (2023-08-26T18:53:48Z)
- Variational Sequential Optimal Experimental Design using Reinforcement Learning [0.0]
We introduce variational sequential Optimal Experimental Design (vsOED), a new method for optimally designing a finite sequence of experiments under a Bayesian framework and with information-gain utilities.
Our vsOED results indicate substantially improved sample efficiency and a reduced number of forward-model simulations compared to previous sequential design algorithms.
arXiv Detail & Related papers (2023-06-17T21:47:19Z)
- Comparing Probability Distributions with Conditional Transport [63.11403041984197]
We propose conditional transport (CT) as a new divergence and approximate it with the amortized CT (ACT) cost.
ACT amortizes the computation of its conditional transport plans and comes with unbiased sample gradients that are straightforward to compute.
On a wide variety of benchmark datasets for generative modeling, substituting ACT for the default statistical distance of an existing generative adversarial network is shown to consistently improve performance.
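A simplified version of a conditional-transport-style cost, using a plain squared-distance navigator in place of the paper's learned one, could look like this:

```python
# Simplified conditional-transport cost between sample batches (illustrative;
# ACT additionally learns the navigator and amortizes the plans).
import torch

def ct_cost(x, y, tau=1.0):
    C = torch.cdist(x, y, p=2) ** 2                # pairwise ground cost
    plan = torch.softmax(-C / tau, dim=1)          # conditional plan pi(y | x)
    return (plan * C).sum(dim=1).mean()

def act_cost(x, y, tau=1.0):
    return 0.5 * (ct_cost(x, y, tau) + ct_cost(y, x, tau))  # symmetrize directions
```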
arXiv Detail & Related papers (2020-12-28T05:14:22Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work concerns zeroth-order (ZO) optimization, which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
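The estimator family can be sketched as follows (an assumed generic form, not the paper's exact method): coordinates are drawn with non-uniform probabilities and central differences are reweighted by the inverse probabilities to keep the estimate unbiased.

```python
# Generic zeroth-order gradient estimate with coordinate importance sampling
# (illustrative; the paper's hybrid estimator differs in its details).
import torch

def zo_grad(f, x, probs, n_samples=8, mu=1e-3):
    """Estimate grad f(x) from function queries only."""
    g = torch.zeros_like(x)
    idx = torch.multinomial(probs, n_samples, replacement=True)
    for i in idx:
        e = torch.zeros_like(x)
        e[i] = 1.0
        diff = (f(x + mu * e) - f(x - mu * e)) / (2 * mu)  # central difference
        g[i] += diff / (probs[i] * n_samples)              # importance reweighting
    return g

f = lambda v: (v ** 2).sum()                # placeholder black-box objective
x = torch.randn(10)
probs = torch.full((10,), 0.1)              # uniform here; skewed in practice
print(zo_grad(f, x, probs))
```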
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
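For context, the quantity in question is the standard evidence lower bound for an unnormalized target $\tilde p$ and variational distribution $q$; when the expectation of $\log \tilde p$ and the entropy are both tractable for the circuit, the bound needs no sampling:

```latex
\mathrm{ELBO}(q) \;=\; \mathbb{E}_{x \sim q}\!\left[\log \tilde p(x)\right] + \mathbb{H}(q)
\;\le\; \log Z, \qquad p(x) = \tilde p(x) / Z .
```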
arXiv Detail & Related papers (2020-10-22T05:04:38Z)
- SODEN: A Scalable Continuous-Time Survival Model through Ordinary Differential Equation Networks [14.564168076456822]
We propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms.
We demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models.
arXiv Detail & Related papers (2020-08-19T19:11:25Z)
- Statistical Optimal Transport posed as Learning Kernel Embedding [0.0]
This work takes the novel approach of posing statistical Optimal Transport (OT) as that of learning the transport plan's kernel mean embedding from sample-based estimates of marginal embeddings.
A key result is that, under very mild conditions, $\epsilon$-optimal recovery of the transport plan, as well as of the Barycentric-projection-based transport map, is possible with a sample complexity that is completely dimension-free.
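The dimension-free behavior is inherited from kernel mean embeddings, whose empirical estimates concentrate at a rate independent of the ambient dimension (a standard RKHS fact, stated here for context):

```latex
\mu_P = \mathbb{E}_{x \sim P}\big[k(x,\cdot)\big] \in \mathcal{H}_k,
\qquad
\hat\mu_P = \frac{1}{n}\sum_{i=1}^{n} k(x_i,\cdot),
\qquad
\big\|\hat\mu_P - \mu_P\big\|_{\mathcal{H}_k} = O_p\big(n^{-1/2}\big).
```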
arXiv Detail & Related papers (2020-02-08T14:58:53Z)
- A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, the Wasserstein gradient flow is a smoother and near-optimal numerical scheme for approximating real data densities.
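For reference, the classical connection the summary invokes (the Jordan-Kinderlehrer-Otto result): the Fokker-Planck equation is the 2-Wasserstein gradient flow of the relative entropy with respect to the target $e^{-V}$:

```latex
\partial_t \rho_t
= \nabla \cdot \big(\rho_t \nabla V\big) + \Delta \rho_t
= \nabla \cdot \Big(\rho_t \, \nabla \tfrac{\delta F}{\delta \rho}(\rho_t)\Big),
\qquad
F(\rho) = \mathrm{KL}\big(\rho \,\|\, e^{-V}\big).
```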
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.