On the Existence of Optimal Transport Gradient for Learning Generative
Models
- URL: http://arxiv.org/abs/2102.05542v1
- Date: Wed, 10 Feb 2021 16:28:20 GMT
- Title: On the Existence of Optimal Transport Gradient for Learning Generative
Models
- Authors: Antoine Houdard and Arthur Leclaire and Nicolas Papadakis and Julien
Rabin
- Abstract summary: Training of Wasserstein Generative Adversarial Networks (WGAN) relies on the calculation of the gradient of the optimal transport cost.
We first demonstrate that such gradient may not be defined, which can result in numerical instabilities during gradient-based optimization.
By exploiting the discrete nature of empirical data, we formulate the gradient in a semi-discrete setting and propose an algorithm for the optimization of the generative model parameters.
- Score: 8.602553195689513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of optimal transport cost for learning generative models has become
popular with Wasserstein Generative Adversarial Networks (WGAN). Training of
WGAN relies on a theoretical background: the calculation of the gradient of the
optimal transport cost with respect to the generative model parameters. We
first demonstrate that such gradient may not be defined, which can result in
numerical instabilities during gradient-based optimization. We address this
issue by stating a valid differentiation theorem in the case of entropic
regularized transport and specify conditions under which existence is ensured.
By exploiting the discrete nature of empirical data, we formulate the gradient
in a semi-discrete setting and propose an algorithm for the optimization of the
generative model parameters. Finally, we illustrate numerically the advantage
of the proposed framework.
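As a rough illustration of the semi-discrete setting mentioned in the abstract, the sketch below estimates the standard semi-discrete dual objective, E[psi^c(x)] + sum_j nu_j psi_j with psi^c(x) = min_j c(x, y_j) - psi_j, and a parameter gradient obtained by holding the nearest-target assignment fixed. It is not the authors' algorithm: the squared Euclidean cost, the toy translation generator g_theta(z) = z + theta, and all function names are assumptions made for illustration only.

```python
import numpy as np

def c_transform(x, y, psi):
    """Return psi^c(x_i) = min_j ||x_i - y_j||^2 - psi_j and the minimizing index j."""
    # (n, m) matrix of squared Euclidean costs (assumed cost function).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    scores = cost - psi[None, :]
    idx = scores.argmin(axis=1)
    return scores[np.arange(len(x)), idx], idx

def semi_discrete_dual(x, y, psi, nu):
    """Monte Carlo estimate of E[psi^c(x)] + sum_j nu_j * psi_j."""
    values, _ = c_transform(x, y, psi)
    return values.mean() + nu @ psi

def translation_gradient(z, theta, y, psi):
    """Gradient in theta of the dual estimate for the toy generator z + theta.

    The assignment idx is held fixed, which is only justified where the
    minimizer in the c-transform is unique; at ties the gradient may not exist.
    """
    x = z + theta
    _, idx = c_transform(x, y, psi)
    # d/dtheta ||z + theta - y_j||^2 = 2 * (z + theta - y_j)
    return (2.0 * (x - y[idx])).mean(axis=0)
```

With latent samples `z`, data points `y`, and a dual vector `psi`, calling `translation_gradient(z, theta, y, psi)` gives one descent direction for `theta`. The fixed-assignment step is exactly where non-differentiability can enter, since the argmin can jump when two targets are equally close.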
Related papers
- Variational Entropic Optimal Transport [67.76725267984578]
We propose Variational Entropic Optimal Transport (VarEOT) for domain translation problems.
VarEOT is based on an exact variational reformulation of the log-partition $\log \mathbb{E}[\exp(\cdot)]$ as a tractable generalization over an auxiliary positive normalizer.
Experiments on synthetic data and unpaired image-to-image translation demonstrate competitive or improved translation quality.
arXiv Detail & Related papers (2026-02-02T15:48:44Z) - Worst-case generation via minimax optimization in Wasserstein space [19.645939141861543]
Worst-case generation plays a critical role in evaluating robustness and stress-testing systems under distribution shifts.
We develop a generative modeling framework for worst-case generation for a pre-specified risk.
arXiv Detail & Related papers (2025-12-09T02:11:08Z) - On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization [57.179679246370114]
A potential limitation of existing methods is the bias inherent in most perturbation estimators unless a stepsize is proposed.
We propose a novel family of unbiased gradient scaling estimators that eliminate bias while maintaining favorable construction.
arXiv Detail & Related papers (2025-10-22T18:25:43Z) - Neural Optimal Transport Meets Multivariate Conformal Prediction [58.43397908730771]
We propose a framework for conditional vector quantile regression (CVQR).
CVQR combines neural optimal transport with quantized optimization, and applies it to predictions.
arXiv Detail & Related papers (2025-09-29T19:50:19Z) - Representation-Aware Distributionally Robust Optimization: A Knowledge Transfer Framework [6.529107536201152]
We propose a novel framework for Wasserstein distributionally robust learning that accounts for predictive representations when guarding against distributional shifts.
We show that READ embeds a multidimensional alignment parameter into the transport cost, allowing the model to differentially discourage perturbations along directions associated with informative representations.
We conclude by demonstrating the effectiveness of our framework through extensive simulations and a real-world study.
arXiv Detail & Related papers (2025-09-11T11:42:17Z) - Flows and Diffusions on the Neural Manifold [0.0]
Diffusion and flow-based generative models have achieved remarkable success in domains such as image synthesis, video generation, and natural language modeling.
We extend these advances to weight space learning by leveraging recent techniques to incorporate structural priors derived from optimization dynamics.
arXiv Detail & Related papers (2025-07-14T02:26:06Z) - Neural Conditional Transport Maps [0.0]
We introduce a conditioning mechanism capable of processing both categorical and continuous conditioning variables simultaneously.
At the core of our method lies a hypernetwork that generates transport layer parameters based on these inputs, creating adaptive mappings.
This work advances the state-of-the-art in conditional optimal transport, enabling broader application of optimal transport principles to complex, high-dimensional domains.
arXiv Detail & Related papers (2025-05-21T17:59:02Z) - Proximal optimal transport divergences [6.6875717609310765]
We introduce the proximal optimal transport divergence, a novel discrepancy measure that interpolates between information divergences and optimal transport distances via an infimal convolution formulation.
This divergence provides a principled foundation for optimal transport proximals and proximal optimization methods frequently used in generative modeling.
We explore its mathematical properties, including smoothness, boundedness, and computational tractability, and establish connections to primal-dual formulations and adversarial learning.
arXiv Detail & Related papers (2025-05-17T17:48:11Z) - Dynamical Measure Transport and Neural PDE Solvers for Sampling [77.38204731939273]
We tackle the task of sampling from a probability density as transporting a tractable density function to the target.
We employ physics-informed neural networks (PINNs) to approximate the respective partial differential equations (PDEs) solutions.
PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently.
arXiv Detail & Related papers (2024-07-10T17:39:50Z) - Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models [54.132297393662654]
We introduce a hybrid method that fine-tunes cutting-edge diffusion models by optimizing reward models through RL.
We demonstrate the capability of our approach to outperform the best designs in offline data, leveraging the extrapolation capabilities of reward models.
arXiv Detail & Related papers (2024-05-30T03:57:29Z) - Flow-based Distributionally Robust Optimization [23.232731771848883]
We present a framework, called $\texttt{FlowDRO}$, for solving flow-based distributionally robust optimization (DRO) problems with Wasserstein uncertainty sets.
We aim to find continuous worst-case distribution (also called the Least Favorable Distribution, LFD) and sample from it.
We demonstrate its usage in adversarial learning, distributionally robust hypothesis testing, and a new mechanism for data-driven distribution perturbation differential privacy.
arXiv Detail & Related papers (2023-10-30T03:53:31Z) - A transport approach to sequential simulation-based inference [0.0]
We present a new transport-based approach to efficiently perform sequential Bayesian inference of static model parameters.
The strategy is based on the extraction of conditional distribution from the joint distribution of parameters and data, via the estimation of structured (e.g., block triangular) transport maps.
This allows gradient-based characterizations of posterior density via transport maps in a model-free, online phase.
arXiv Detail & Related papers (2023-08-26T18:53:48Z) - Variational Sequential Optimal Experimental Design using Reinforcement
Learning [0.0]
We introduce variational sequential Optimal Experimental Design (vsOED), a new method for optimally designing a finite sequence of experiments under a Bayesian framework and with information-gain utilities.
Our vsOED results indicate substantially improved sample efficiency and reduced number of forward model simulations compared to previous sequential design algorithms.
arXiv Detail & Related papers (2023-06-17T21:47:19Z) - Differentiable Agent-Based Simulation for Gradient-Guided
Simulation-Based Optimization [0.0]
Gradient estimation methods can be used to steer the optimization towards a local optimum.
In traffic signal timing optimization problems with high input dimension, the gradient-based methods exhibit substantially superior performance.
arXiv Detail & Related papers (2021-03-23T11:58:21Z) - Comparing Probability Distributions with Conditional Transport [63.11403041984197]
We propose conditional transport (CT) as a new divergence and approximate it with the amortized CT (ACT) cost.
ACT amortizes the computation of its conditional transport plans and comes with unbiased sample gradients that are straightforward to compute.
On a wide variety of benchmark datasets for generative modeling, substituting the default statistical distance of an existing generative adversarial network with ACT is shown to consistently improve the performance.
arXiv Detail & Related papers (2020-12-28T05:14:22Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on the iteration of zeroth-order (ZO) optimization which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity as well as function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical
Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPN).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is aweighted the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z) - SODEN: A Scalable Continuous-Time Survival Model through Ordinary
Differential Equation Networks [14.564168076456822]
We propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms.
We demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models.
arXiv Detail & Related papers (2020-08-19T19:11:25Z) - Statistical Optimal Transport posed as Learning Kernel Embedding [0.0]
This work takes the novel approach of posing statistical Optimal Transport (OT) as that of learning the transport plan's kernel mean embedding from sample based estimates of marginal embeddings.
A key result is that, under very mild conditions, $\epsilon$-optimal recovery of the transport plan as well as the Barycentric-projection based transport map is possible with a sample complexity that is completely dimension-free.
arXiv Detail & Related papers (2020-02-08T14:58:53Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.