Sliced-Wasserstein Gradient Flows
- URL: http://arxiv.org/abs/2110.10972v1
- Date: Thu, 21 Oct 2021 08:34:26 GMT
- Title: Sliced-Wasserstein Gradient Flows
- Authors: Clément Bonet, Nicolas Courty, François Septier, Lucas Drumetz
- Abstract summary: Minimizing functionals in the space of probability distributions can be done with Wasserstein gradient flows.
This work proposes to use gradient flows in the space of probability measures endowed with the sliced-Wasserstein distance.
- Score: 15.048733056992855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Minimizing functionals in the space of probability distributions can be done
with Wasserstein gradient flows. To solve them numerically, a possible approach
is to rely on the Jordan-Kinderlehrer-Otto (JKO) scheme, which is analogous to
the proximal scheme in Euclidean spaces. However, this bilevel optimization
problem is known for its computational challenges, especially in high
dimension. To alleviate this, recent works propose to approximate the JKO
scheme by leveraging Brenier's theorem and using gradients of Input Convex Neural
Networks to parameterize the density (JKO-ICNN). However, this method comes
with a high computational cost and stability issues. Instead, this work
proposes to use gradient flows in the space of probability measures endowed
with the sliced-Wasserstein (SW) distance. We argue that this method is more
flexible than JKO-ICNN, since SW enjoys a closed-form differentiable
approximation. Thus, the density at each step can be parameterized by any
generative model, which alleviates the computational burden and makes it
tractable in higher dimensions. Interestingly, we also show empirically that
these gradient flows are strongly related to the usual Wasserstein gradient
flows, and that they can be used to efficiently minimize diverse machine
learning functionals.
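As a rough illustration of the idea (a minimal particle-based sketch, not the authors' JKO-type scheme with a generative-model parameterization), the code below approximates the squared sliced-Wasserstein distance by Monte Carlo over random projection directions and moves a set of particles by gradient descent on it; the particle parameterization, step size, and number of projections are illustrative assumptions.

```python
import torch

def sliced_wasserstein2(x, y, n_proj=100):
    """Monte Carlo estimate of the squared sliced-Wasserstein-2 distance between
    two empirical measures with the same number of points (closed form in 1-D)."""
    d = x.shape[1]
    theta = torch.randn(n_proj, d)
    theta = theta / theta.norm(dim=1, keepdim=True)    # random directions on the sphere
    x_proj, y_proj = x @ theta.T, y @ theta.T          # 1-D projections, shape (n, n_proj)
    # In 1-D, optimal transport simply matches sorted samples.
    x_sorted, _ = torch.sort(x_proj, dim=0)
    y_sorted, _ = torch.sort(y_proj, dim=0)
    return ((x_sorted - y_sorted) ** 2).mean()

# Toy flow: push particles toward a target sample by descending the SW objective.
torch.manual_seed(0)
target = torch.randn(500, 2) + torch.tensor([4.0, 0.0])
particles = torch.randn(500, 2).requires_grad_(True)
optimizer = torch.optim.SGD([particles], lr=5.0)

for step in range(200):
    optimizer.zero_grad()
    loss = sliced_wasserstein2(particles, target)
    loss.backward()
    optimizer.step()
```

The 1-D sorting step is what gives the closed-form differentiable approximation the abstract refers to, and it keeps each evaluation cheap even in high dimension.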
Related papers
- Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels [78.6096486885658]
We introduce lower bounds to the linearized Laplace approximation of the marginal likelihood.
These bounds are amenable to gradient-based optimization and allow trading off estimation accuracy against computational complexity.
arXiv Detail & Related papers (2023-06-06T19:02:57Z)
- D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory [79.50644650795012]
We propose a deep learning approach to solve Kohn-Sham Density Functional Theory (KS-DFT).
We prove that such an approach has the same expressivity as the SCF method, yet reduces the computational complexity.
In addition, we show that our approach enables us to explore more complex neural-based wave functions.
arXiv Detail & Related papers (2023-03-01T10:38:10Z)
- Deep Equilibrium Optical Flow Estimation [80.80992684796566]
Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads, and are not directly trained to model such stable estimation.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
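As a hedged sketch of the fixed-point idea in this entry (not the authors' implementation), a DEQ-style forward pass solves z* = f(z*) with a solver instead of unrolling a fixed number of recurrent updates; `f`, `max_iter`, and `tol` below are illustrative placeholders.

```python
import torch

def fixed_point_solve(f, z0, max_iter=50, tol=1e-4):
    """Naive fixed-point iteration for z* = f(z*). Real DEQ models typically use
    faster solvers (e.g. Anderson acceleration) and differentiate implicitly."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z)
        if (z_next - z).norm() / (z.norm() + 1e-8) < tol:
            return z_next
        z = z_next
    return z

# Toy contraction: f(z) = 0.5 * z + b has the unique fixed point 2 * b.
b = torch.tensor([1.0, -2.0])
z_star = fixed_point_solve(lambda z: 0.5 * z + b, torch.zeros(2))
```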
arXiv Detail & Related papers (2022-04-18T17:53:44Z)
- Variational Wasserstein gradient flow [9.901677207027806]
We propose a scalable proximal gradient type algorithm for Wasserstein gradient flow.
Our framework covers all the classical Wasserstein gradient flows including the heat equation and the porous medium equation.
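For context on the claim that the classical flows are covered, the standard identifications from the Wasserstein gradient-flow literature (stated informally) are:

```latex
% Heat equation: W_2 gradient flow of the (negative) entropy
\partial_t \rho = \Delta \rho
\quad\text{for}\quad
\mathcal{F}(\rho) = \int \rho \log \rho \, dx ,
\qquad\qquad
% Porous medium equation (m > 1): W_2 gradient flow of a power functional
\partial_t \rho = \Delta \rho^{m}
\quad\text{for}\quad
\mathcal{F}(\rho) = \frac{1}{m-1} \int \rho^{m} \, dx .
```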
arXiv Detail & Related papers (2021-12-04T20:27:31Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
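As a rough sketch of what dropping the Metropolis-Hastings correction buys (a generic illustration, not this paper's algorithm): with unadjusted Langevin transitions every operation below is differentiable, at the price of transitions that no longer leave the intermediate distributions exactly invariant. The densities `log_prior` and `log_target`, the annealing schedule, and the step size are illustrative assumptions.

```python
import torch

def log_prior(x):            # standard normal base distribution (illustrative)
    return -0.5 * (x ** 2).sum(dim=1)

def log_target(x):           # unnormalized target, here a shifted normal (illustrative)
    return -0.5 * ((x - 3.0) ** 2).sum(dim=1)

def differentiable_ais(n_samples=256, n_steps=50, step_size=0.05, dim=1):
    betas = torch.linspace(0.0, 1.0, n_steps + 1)
    x = torch.randn(n_samples, dim, requires_grad=True)   # exact samples from the prior
    log_w = torch.zeros(n_samples)
    for k in range(1, n_steps + 1):
        # AIS weight increment between consecutive annealed densities.
        log_w = log_w + (betas[k] - betas[k - 1]) * (log_target(x) - log_prior(x))
        # Unadjusted Langevin move targeting pi_k; no accept/reject step, so the
        # update remains differentiable end to end.
        log_pi_k = (1 - betas[k]) * log_prior(x) + betas[k] * log_target(x)
        grad = torch.autograd.grad(log_pi_k.sum(), x, create_graph=True)[0]
        x = x + step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(x)
    # Importance-weighted estimate of the log normalizing-constant ratio.
    return torch.logsumexp(log_w, dim=0) - torch.log(torch.tensor(float(n_samples)))
```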
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks [32.29616488152138]
A typical approach to optimizing functionals on the space of probabilities relies on the connection to the dynamic Jordan-Kinderlehrer-Otto (JKO) scheme.
We propose an approach that relies on the recently introduced input convex neural networks (ICNN) to parameterize the space of convex functions in order to approximate the JKO scheme.
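For readers unfamiliar with the architecture, a minimal input convex neural network can look like the sketch below (a generic textbook-style construction, not the cited paper's code): convexity in the input follows from keeping the weights that act on hidden activations nonnegative and using a convex, nondecreasing activation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Minimal input convex neural network: the output is convex in x because each
    layer applies a convex, nondecreasing activation to an affine map of x plus a
    nonnegatively weighted map of the previous (convex) hidden state."""
    def __init__(self, dim, hidden=64, n_layers=3):
        super().__init__()
        self.wx = nn.ModuleList(nn.Linear(dim, hidden) for _ in range(n_layers))
        self.wz = nn.ModuleList(nn.Linear(hidden, hidden, bias=False)
                                for _ in range(n_layers - 1))
        self.out = nn.Linear(hidden, 1, bias=False)

    def forward(self, x):
        z = F.softplus(self.wx[0](x))
        for wx, wz in zip(self.wx[1:], self.wz):
            z = F.softplus(wx(x) + F.linear(z, wz.weight.clamp(min=0)))
        return F.linear(z, self.out.weight.clamp(min=0))  # nonnegative combination stays convex
```

In JKO-ICNN-type schemes, the transport map at each step is then taken to be the gradient of such a convex potential, following Brenier's theorem.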
arXiv Detail & Related papers (2021-06-01T20:13:18Z)
- Large-Scale Wasserstein Gradient Flows [84.73670288608025]
We introduce a scalable scheme to approximate Wasserstein gradient flows.
Our approach relies on input convex neural networks (ICNNs) to discretize the JKO steps.
As a result, we can sample from the measure at each step of the gradient diffusion and compute its density.
arXiv Detail & Related papers (2021-06-01T19:21:48Z)
- The Wasserstein Proximal Gradient Algorithm [23.143814848127295]
Wasserstein gradient flows are continuous time dynamics that define curves of steepest descent to minimize an objective function over the space of probability measures.
We propose a Forward Backward (FB) discretization scheme that can tackle the case where the objective function is the sum of smooth and nonsmooth geodesically convex terms.
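Written informally for an objective F + G with F smooth and G nonsmooth (both geodesically convex), one forward-backward iteration of this kind can be sketched as an explicit push-forward step on F followed by a JKO/proximal step on G (the notation is illustrative, not necessarily that of the paper):

```latex
\nu_k     \;=\; \big(\mathrm{id} - \gamma \, \nabla_{W_2} F(\mu_k)\big)_{\#}\, \mu_k
          \qquad\text{(forward, explicit step on } F\text{)},
\\
\mu_{k+1} \;=\; \operatorname*{arg\,min}_{\mu}\; G(\mu) + \frac{1}{2\gamma}\, W_2^2(\mu, \nu_k)
          \qquad\text{(backward, JKO/proximal step on } G\text{)}.
```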
arXiv Detail & Related papers (2020-02-07T22:19:32Z)
- A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, the Wasserstein gradient flow is a smoother and near-optimal numerical scheme for approximating real data densities.
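For background on the connection invoked here (stated informally; this entry derives a second-order flow, whereas the identity below is the classical first-order one): the Fokker-Planck equation with stationary density \pi is the Wasserstein-2 gradient flow of the relative entropy,

```latex
\partial_t \rho_t \;=\; \nabla \cdot \Big( \rho_t \, \nabla \log \tfrac{\rho_t}{\pi} \Big),
\qquad
\mathrm{KL}(\rho \,\|\, \pi) \;=\; \int \rho \log \tfrac{\rho}{\pi} \, dx .
```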
arXiv Detail & Related papers (2019-10-31T02:26:20Z)