Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
- URL: http://arxiv.org/abs/2103.01678v2
- Date: Wed, 3 Mar 2021 16:50:36 GMT
- Title: Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
- Authors: Jan Stanczuk, Christian Etmann, Lisa Maria Kreusser, Carola-Bibiane Schönlieb
- Abstract summary: Wasserstein GANs are based on the idea of minimising the Wasserstein distance between a real and a generated distribution.
We provide an in-depth mathematical analysis of differences between the theoretical setup and the reality of training Wasserstein GANs.
- Score: 1.1470070927586016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wasserstein GANs are based on the idea of minimising the Wasserstein distance
between a real and a generated distribution. We provide an in-depth
mathematical analysis of differences between the theoretical setup and the
reality of training Wasserstein GANs. In this work, we gather both theoretical
and empirical evidence that the WGAN loss is not a meaningful approximation of
the Wasserstein distance. Moreover, we argue that the Wasserstein distance is
not even a desirable loss function for deep generative models, and conclude
that the success of Wasserstein GANs can in truth be attributed to a failure to
approximate the Wasserstein distance.
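The gap the abstract describes can be seen directly from the Kantorovich-Rubinstein dual, which a WGAN critic only explores through a restricted parametric family of (approximately) Lipschitz functions. A minimal sketch, using a hand-picked 1-Lipschitz "critic" and SciPy's exact 1-D distance purely for illustration (the choice of distributions and the clipping function are assumptions, not the paper's setup):

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
real = rng.normal(0.5, 1.0, size=2000)   # samples from the "real" distribution
fake = rng.normal(0.0, 1.0, size=2000)   # samples from the "generated" distribution

# Exact Wasserstein-1 distance between the two 1-D empirical distributions.
w1 = wasserstein_distance(real, fake)

# Kantorovich-Rubinstein dual: W1 = sup over 1-Lipschitz f of E[f(real)] - E[f(fake)].
# A WGAN critic searches only a small family of such functions, so the value it
# attains can only lower-bound the true distance.
f = lambda x: np.clip(x, -3.0, 3.0)      # one hand-picked 1-Lipschitz function
critic_value = f(real).mean() - f(fake).mean()

assert critic_value <= w1 + 1e-9         # a single fixed critic under-estimates W1
```

Because the supremum is never attained over a restricted function class, the WGAN loss tracks the true distance only up to this (often large) approximation gap.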
Related papers
- Conditional Wasserstein Distances with Applications in Bayesian OT Flow Matching [1.609940380983903]
In inverse problems, many conditional generative models approximate the posterior measure by minimizing a distance between the joint measure and its learned approximation.
We introduce a conditional Wasserstein distance via a set of restricted couplings that equals the expected Wasserstein distance of the posteriors.
We derive theoretical properties of the conditional Wasserstein distance, characterize the corresponding geodesics and velocity fields as well as the flow ODEs.
arXiv Detail & Related papers (2024-03-27T15:54:55Z)
- A Wasserstein perspective of Vanilla GANs [0.0]
Vanilla GANs are generalizations of Wasserstein GANs.
In particular, we obtain an oracle inequality for Vanilla GANs in Wasserstein distance.
We conclude a rate of convergence for Vanilla GANs as well as Wasserstein GANs as estimators of the unknown probability distribution.
arXiv Detail & Related papers (2024-03-22T16:04:26Z)
- Squared Wasserstein-2 Distance for Efficient Reconstruction of Stochastic Differential Equations [0.0]
We provide an analysis of the squared $W_2$ distance between two probability distributions associated with stochastic differential equations (SDEs).
Based on this analysis, we propose the use of a squared $W_2$ distance-based loss function in the reconstruction of SDEs from noisy data.
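In one dimension the squared $W_2$ distance between two equally sized samples has a closed form via monotone matching of sorted values, which makes such a loss cheap to evaluate. A sketch under that assumption (the Gaussian test data is illustrative, not from the paper):

```python
import numpy as np

def squared_w2_1d(x, y):
    """Squared Wasserstein-2 distance between two equally sized 1-D samples.

    For empirical measures with n uniform atoms, the optimal coupling in 1-D
    is monotone, so W2^2 reduces to matching the sorted samples.
    """
    x, y = np.sort(x), np.sort(y)
    return np.mean((x - y) ** 2)

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 2000)   # e.g. observed values at one time point
b = rng.normal(1.0, 1.0, 2000)   # e.g. values simulated from candidate parameters

# For N(m1, s^2) vs N(m2, s^2) the population value is (m1 - m2)^2 = 1.
print(squared_w2_1d(a, b))
```

Because the sort is differentiable almost everywhere, a loss of this form can be minimised with standard gradient-based training.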
arXiv Detail & Related papers (2024-01-21T00:54:50Z)
- Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching [111.78179839856293]
We propose Primal Wasserstein DICE to minimize the primal Wasserstein distance between the learner and expert state occupancies.
Our framework is a generalization of SMODICE, and is the first work that unifies $f$-divergence and Wasserstein minimization.
arXiv Detail & Related papers (2023-11-02T15:41:57Z)
- Y-Diagonal Couplings: Approximating Posteriors with Conditional Wasserstein Distances [0.4419843514606336]
In inverse problems, many conditional generative models approximate the posterior measure by minimizing a distance between the joint measure and its learned approximation.
We introduce a conditional Wasserstein distance with a set of restricted couplings that equals the expected Wasserstein distance of the posteriors.
arXiv Detail & Related papers (2023-10-20T11:46:05Z)
- Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
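For intuition about what such a learned geodesic should recover, the 1-D Gaussian case has a closed form: displacement interpolation between two Gaussians is again Gaussian, with linearly interpolated mean and standard deviation. A small sketch of that known special case (not the paper's neural method):

```python
def gaussian_w2_geodesic(m0, s0, m1, s1, t):
    """Point at time t on the Wasserstein-2 geodesic between two 1-D Gaussians.

    Displacement interpolation between N(m0, s0^2) and N(m1, s1^2) stays
    Gaussian, with mean and standard deviation interpolated linearly in t.
    """
    return (1 - t) * m0 + t * m1, (1 - t) * s0 + t * s1

# Midpoint of the geodesic from N(0, 1) to N(4, 9):
m, s = gaussian_w2_geodesic(0.0, 1.0, 4.0, 3.0, 0.5)
print(m, s)  # 2.0 2.0
```

In high dimensions no such closed form is available in general, which is what motivates parametrising the geodesic with neural networks and solving the resulting minimax problem.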
arXiv Detail & Related papers (2021-02-05T04:25:28Z)
- Towards Generalized Implementation of Wasserstein Distance in GANs [46.79148259312607]
Wasserstein GANs (WGANs) are built upon the Kantorovich-Rubinstein duality of the Wasserstein distance.
In practice, they do not always outperform other variants of GANs.
We propose a general WGAN training scheme named Sobolev Wasserstein GAN (SWGAN).
arXiv Detail & Related papers (2020-12-07T02:22:23Z)
- On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
- When OT meets MoM: Robust estimation of Wasserstein Distance [8.812837829361923]
We consider the problem of estimating the Wasserstein distance between two probability distributions when observations are polluted by outliers.
We introduce and discuss novel MoM-based robust estimators whose consistency is studied under a data contamination model.
We propose a simple MoM-based re-weighting scheme that could be used in conjunction with the Sinkhorn algorithm.
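The Sinkhorn algorithm referred to here alternately rescales a transport kernel so that the plan's marginals match two given histograms; a robust re-weighting scheme would modify those histograms before the iterations begin. A minimal sketch of plain Sinkhorn (the cost matrix, regularisation strength, and uniform marginals are illustrative assumptions):

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iter=200):
    """Entropy-regularised OT plan between histograms a and b with cost matrix C.

    Standard Sinkhorn iterations; a MoM-style robust scheme would re-weight
    a and b before calling this routine.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # scale columns toward marginal b
        u = a / (K @ v)                  # scale rows toward marginal a
    return u[:, None] * K * v[None, :]   # transport plan P = diag(u) K diag(v)

n = 5
x = np.linspace(0, 1, n)
C = (x[:, None] - x[None, :]) ** 2       # squared-distance cost on a 1-D grid
a = b = np.full(n, 1.0 / n)              # uniform marginals
P = sinkhorn(a, b, C)
print(P.sum())                           # total mass of the plan
```

The plan's row sums match `a` exactly after the final update, and the column sums converge to `b` as the iterations proceed.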
arXiv Detail & Related papers (2020-06-18T07:31:39Z)
- Augmented Sliced Wasserstein Distances [55.028065567756066]
We propose a new family of distance metrics, called augmented sliced Wasserstein distances (ASWDs).
ASWDs are constructed by first mapping samples to higher-dimensional hypersurfaces parameterized by neural networks.
Numerical results demonstrate that the ASWD significantly outperforms other Wasserstein variants for both synthetic and real-world problems.
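The baseline the ASWD augments is the plain sliced Wasserstein distance, which averages 1-D Wasserstein distances over random linear projections; the ASWD replaces these lines with neural-network-parameterised hypersurfaces. A sketch of the plain sliced version only (the data and the number of projections are illustrative assumptions):

```python
import numpy as np
from scipy.stats import wasserstein_distance

def sliced_wasserstein(X, Y, n_proj=100, rng=None):
    """Monte-Carlo sliced Wasserstein-1 distance between two d-dim point clouds.

    Projects both clouds onto random unit directions and averages the exact
    1-D Wasserstein-1 distances of the projections.
    """
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)                       # random unit direction
        total += wasserstein_distance(X @ theta, Y @ theta)  # exact 1-D W1 on the line
    return total / n_proj

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, (500, 3))
Y = rng.normal(0.5, 1.0, (500, 3))
print(sliced_wasserstein(X, Y))
```

Each slice costs only a sort, which is what makes sliced variants scale to problems where the full high-dimensional distance is intractable.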
arXiv Detail & Related papers (2020-06-15T23:00:08Z)
- Projection Robust Wasserstein Distance and Riemannian Optimization [107.93250306339694]
We show that the projection robust Wasserstein (PRW) distance is a robust variant of Wasserstein projection pursuit (WPP).
This paper provides a first step into the computation of the PRW distance and establishes links between its theory and experiments on synthetic and real data.
arXiv Detail & Related papers (2020-06-12T20:40:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.