Towards Generalized Implementation of Wasserstein Distance in GANs
- URL: http://arxiv.org/abs/2012.03420v2
- Date: Tue, 12 Jan 2021 11:30:57 GMT
- Title: Towards Generalized Implementation of Wasserstein Distance in GANs
- Authors: Minkai Xu, Zhiming Zhou, Guansong Lu, Jian Tang, Weinan Zhang, Yong Yu
- Abstract summary: Wasserstein GANs (WGANs) are built upon the Kantorovich-Rubinstein duality of the Wasserstein distance.
In practice, however, they do not always outperform other variants of GANs.
We propose a generalized WGAN training scheme named Sobolev Wasserstein GAN (SWGAN).
- Score: 46.79148259312607
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Wasserstein GANs (WGANs), built upon the Kantorovich-Rubinstein (KR) duality
of the Wasserstein distance, are among the most theoretically sound GAN models.
However, in practice they do not always outperform other variants of GANs. This
is mostly due to the imperfect implementation of the Lipschitz condition
required by the KR duality. Extensive work in the community has explored
different implementations of the Lipschitz constraint, but the restriction
remains hard to satisfy perfectly in practice. In this paper, we argue
that the strong Lipschitz constraint might be unnecessary for optimization.
Instead, we take a step back and try to relax the Lipschitz constraint.
Theoretically, we first demonstrate a more general dual form of the Wasserstein
distance called the Sobolev duality, which relaxes the Lipschitz constraint but
still maintains the favorable gradient property of the Wasserstein distance.
Moreover, we show that the KR duality is actually a special case of the Sobolev
duality. Based on the relaxed duality, we further propose a generalized WGAN
training scheme named Sobolev Wasserstein GAN (SWGAN), and empirically
demonstrate the improvement of SWGAN over existing methods with extensive
experiments.
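For reference, the Kantorovich-Rubinstein duality the abstract builds on expresses the Wasserstein-1 distance as a supremum over 1-Lipschitz critics. The first display below states that duality; the second is only a hedged sketch of the kind of relaxation the paper pursues, where the pointwise Lipschitz bound is replaced by a constraint on the critic's gradient in expectation under some reference measure $\mu$ (the exact constraint set used by SWGAN is specified in the paper itself).

```latex
% Kantorovich-Rubinstein duality for the Wasserstein-1 distance
W_1(\mathbb{P}_r, \mathbb{P}_g)
  = \sup_{\|f\|_{\mathrm{Lip}} \le 1}
    \; \mathbb{E}_{x \sim \mathbb{P}_r}[f(x)] - \mathbb{E}_{x \sim \mathbb{P}_g}[f(x)]

% Sketch of a Sobolev-type relaxation: the gradient is constrained only in
% expectation under a reference measure \mu, not pointwise.
  \sup_{\mathbb{E}_{x \sim \mu}\left[\|\nabla_x f(x)\|^2\right] \le 1}
    \; \mathbb{E}_{x \sim \mathbb{P}_r}[f(x)] - \mathbb{E}_{x \sim \mathbb{P}_g}[f(x)]
```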
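As one concrete example of the "different implementations of the Lipschitz constraint" mentioned in the abstract, the following PyTorch-style sketch shows a critic loss with a gradient penalty (the WGAN-GP approach). It is illustrative only and is not the SWGAN scheme proposed in the paper; `critic`, `real`, and `fake` are assumed placeholders.

```python
import torch

def critic_loss_with_gradient_penalty(critic, real, fake, gp_weight=10.0):
    """WGAN critic loss with a gradient penalty (WGAN-GP style).

    The penalty pushes the critic's gradient norm towards 1 on points
    interpolated between real and generated samples, a popular soft
    implementation of the 1-Lipschitz constraint.
    """
    # Wasserstein term: the critic maximizes E[f(real)] - E[f(fake)],
    # so we minimize the negation.
    wasserstein = critic(fake).mean() - critic(real).mean()

    # Gradient penalty on random interpolates between real and fake samples.
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
    interpolates = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = critic(interpolates)
    grads, = torch.autograd.grad(
        outputs=scores.sum(), inputs=interpolates, create_graph=True
    )
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    penalty = ((grad_norm - 1.0) ** 2).mean()

    return wasserstein + gp_weight * penalty
```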
Related papers
- Conditional Wasserstein Distances with Applications in Bayesian OT Flow Matching [1.609940380983903]
In inverse problems, many conditional generative models approximate the posterior measure by minimizing a distance between the joint measure and its learned approximation.
We introduce a conditional Wasserstein distance via a set of restricted couplings that equals the expected Wasserstein distance of the posteriors.
We derive theoretical properties of the conditional Wasserstein distance, characterize the corresponding geodesics and velocity fields as well as the flow ODEs.
arXiv Detail & Related papers (2024-03-27T15:54:55Z)
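As a rough illustration of the quantity described in the entry above (the notation here is ours, not the paper's), the conditional Wasserstein distance is designed to coincide with the Wasserstein distance between posteriors, averaged over the observation:

```latex
% Expected Wasserstein distance between posteriors, averaged over observations y
\mathbb{E}_{y \sim P_Y}\!\left[ W_p\big(P_{X \mid Y = y},\; Q_{X \mid Y = y}\big) \right]
```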
- Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching [111.78179839856293]
We propose Primal Wasserstein DICE to minimize the primal Wasserstein distance between the learner and expert state occupancies.
Our framework is a generalization of SMODICE, and is the first work that unifies $f$-divergence and Wasserstein minimization.
arXiv Detail & Related papers (2023-11-02T15:41:57Z)
- Y-Diagonal Couplings: Approximating Posteriors with Conditional Wasserstein Distances [0.4419843514606336]
In inverse problems, many conditional generative models approximate the posterior measure by minimizing a distance between the joint measure and its learned approximation.
We will introduce a conditional Wasserstein distance with a set of restricted couplings that equals the expected Wasserstein distance of the posteriors.
arXiv Detail & Related papers (2023-10-20T11:46:05Z)
- PAC-Bayesian Generalization Bounds for Adversarial Generative Models [2.828173677501078]
We develop generalization bounds for models based on the Wasserstein distance and the total variation distance.
Our results naturally apply to Wasserstein GANs and Energy-Based GANs, and our bounds provide new training objectives for these two models.
arXiv Detail & Related papers (2023-02-17T15:25:49Z)
- Chordal Sparsity for Lipschitz Constant Estimation of Deep Neural Networks [77.82638674792292]
Lipschitz constants of neural networks allow for guarantees of robustness in image classification, safety in controller design, and generalizability beyond the training data.
As calculating Lipschitz constants is NP-hard, techniques for estimating Lipschitz constants must navigate the trade-off between scalability and accuracy.
In this work, we significantly push the scalability frontier of a semidefinite programming technique known as LipSDP while achieving zero accuracy loss.
arXiv Detail & Related papers (2022-04-02T11:57:52Z)
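To make the scalability/accuracy trade-off from the previous entry concrete, here is a hedged sketch of the cheapest common upper bound on a feed-forward network's Lipschitz constant: the product of the layers' spectral norms. It is inexpensive but often very loose, which is the gap that SDP-based estimators such as LipSDP aim to close; the example network below is an illustrative assumption.

```python
import numpy as np

def naive_lipschitz_upper_bound(weight_matrices):
    """Upper-bound the Lipschitz constant of an MLP with 1-Lipschitz
    activations (e.g. ReLU) by the product of the spectral norms of its
    weight matrices. Cheap to compute, but typically loose.
    """
    bound = 1.0
    for W in weight_matrices:
        # Spectral norm = largest singular value of the weight matrix.
        bound *= np.linalg.norm(W, ord=2)
    return bound

# Example: a small random MLP mapping 32 -> 64 -> 64 -> 1 (illustrative only).
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 32)),
          rng.standard_normal((64, 64)),
          rng.standard_normal((1, 64))]
print(naive_lipschitz_upper_bound(layers))
```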
- Training Wasserstein GANs without gradient penalties [4.0489350374378645]
We propose a stable method to train Wasserstein generative adversarial networks.
We experimentally show that this algorithm can effectively enforce the Lipschitz constraint on the discriminator.
Our method requires no gradient penalties and is computationally more efficient than other methods.
arXiv Detail & Related papers (2021-10-27T03:46:13Z)
- On the expressivity of bi-Lipschitz normalizing flows [49.92565116246822]
An invertible function is bi-Lipschitz if both the function and its inverse have bounded Lipschitz constants.
Most Normalizing Flows are bi-Lipschitz by design or by training to limit numerical errors.
arXiv Detail & Related papers (2021-07-15T10:13:46Z)
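For readers unfamiliar with the term in the previous entry: a map $f$ is bi-Lipschitz when both $f$ and its inverse are Lipschitz, which can be stated as a two-sided bound (this is the standard definition, not anything specific to that paper):

```latex
% f is bi-Lipschitz with constants L_1 (for f) and L_2 (for f^{-1}) if, for all x, y,
\frac{1}{L_2}\,\|x - y\| \;\le\; \|f(x) - f(y)\| \;\le\; L_1\,\|x - y\|
```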
- On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
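A hedged sketch of the two quantities contrasted in the previous entry (notation ours): the PRW distance optimizes over $k$-dimensional subspaces $E$, while the IPRW distance averages over them; here $\pi_E$ denotes orthogonal projection onto $E$ and $\sigma$ a uniform measure over $k$-dimensional subspaces.

```latex
% Projection robust Wasserstein distance: optimize over k-dimensional subspaces E
\mathrm{PRW}_k(\mu, \nu) \;=\; \sup_{\dim E = k} W_p\big(\pi_{E\#}\mu,\; \pi_{E\#}\nu\big)

% Integral PRW distance: average over subspaces instead of optimizing
\mathrm{IPRW}_k(\mu, \nu) \;=\; \Big( \int W_p^p\big(\pi_{E\#}\mu,\; \pi_{E\#}\nu\big)\, \mathrm{d}\sigma(E) \Big)^{1/p}
```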
- Projection Robust Wasserstein Distance and Riemannian Optimization [107.93250306339694]
We show that the projection robust Wasserstein (PRW) distance, also known as Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance.
This paper provides a first step into the computation of the PRW distance and establishes links between its theory and experiments on synthetic and real data.
arXiv Detail & Related papers (2020-06-12T20:40:22Z)
- Achieving robustness in classification using optimal transport with hinge regularization [7.780418853571034]
We propose a new framework for binary classification, based on optimal transport.
We learn 1-Lipschitz networks using a new loss that is a hinge-regularized version of the Kantorovich-Rubinstein dual formulation for Wasserstein distance estimation.
arXiv Detail & Related papers (2020-06-11T15:36:23Z)
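To make the last entry's loss more tangible, the snippet below sketches the two ingredients of a hinge-regularized Kantorovich-Rubinstein objective for a 1-Lipschitz scorer with labels in {-1, +1}: a KR term that estimates the Wasserstein-1 distance between the two class-conditional distributions, and a hinge term that enforces a margin. The exact weighting and sign conventions of the cited paper may differ; this is only an illustrative sketch with assumed tensor shapes.

```python
import torch

def hinge_kr_loss(scores, labels, margin=1.0, kr_weight=1.0):
    """Illustrative hinge-regularized Kantorovich-Rubinstein (KR) loss.

    Args:
        scores: shape (N,), outputs f(x) of a 1-Lipschitz network.
        labels: shape (N,), values in {-1, +1}.
    """
    labels = labels.float()
    pos, neg = scores[labels > 0], scores[labels < 0]

    # KR term: for a 1-Lipschitz f, E_+[f] - E_-[f] lower-bounds the
    # Wasserstein-1 distance between the class-conditional distributions.
    kr_term = pos.mean() - neg.mean()

    # Hinge term: penalize samples whose signed score falls below the margin.
    hinge_term = torch.clamp(margin - labels * scores, min=0.0).mean()

    # Minimize the hinge term while maximizing the KR term.
    return hinge_term - kr_weight * kr_term
```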
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.