Characterizing GAN Convergence Through Proximal Duality Gap
- URL: http://arxiv.org/abs/2105.04801v1
- Date: Tue, 11 May 2021 06:27:27 GMT
- Title: Characterizing GAN Convergence Through Proximal Duality Gap
- Authors: Sahil Sidheekh, Aroof Aimen, Narayanan C. Krishnan
- Abstract summary: We show theoretically that the proximal duality gap is capable of monitoring the convergence of GANs to a wider spectrum of equilibria.
We also establish the relationship between the proximal duality gap and the divergence between the real and generated data distributions for different GAN formulations.
- Score: 3.0724051098062097
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the accomplishments of Generative Adversarial Networks (GANs) in
modeling data distributions, training them remains a challenging task. A
contributing factor to this difficulty is the non-intuitive nature of the GAN
loss curves, which necessitates a subjective evaluation of the generated output
to infer training progress. Recently, motivated by game theory, the duality gap has
been proposed as a domain-agnostic measure for monitoring GAN training. However, it
is restricted to settings in which the GAN converges to a Nash equilibrium, and
GANs need not converge to a Nash equilibrium to model the data
distribution. In this work, we extend the notion of duality gap to the proximal
duality gap, which is applicable to the general context of training GANs where
Nash equilibria may not exist. We show theoretically that the proximal duality
gap is capable of monitoring the convergence of GANs to a wider spectrum of
equilibria that subsumes Nash equilibria. We also theoretically establish the
relationship between the proximal duality gap and the divergence between the
real and generated data distributions for different GAN formulations. Our
results provide new insights into the nature of GAN convergence. Finally, we
validate experimentally the usefulness of proximal duality gap for monitoring
and influencing GAN training.
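Concretely, the duality gap described in the abstract can be estimated during training by approximating the inner maximization and minimization with a few gradient steps on copies of the networks, and the proximal variant adds a proximity penalty to the inner problems. The sketch below is a minimal, hypothetical PyTorch illustration of this idea; the names (`gan_value`, `approx_extremum`, `duality_gap`), the number of inner steps, and the quadratic penalty weighted by `lam` are assumptions for illustration, not the paper's exact formulation.

```python
import copy
import torch
import torch.nn.functional as F

def gan_value(G, D, real_batch, noise):
    # Minimax GAN value V(G, D) = E[log sigmoid(D(x))] + E[log(1 - sigmoid(D(G(z))))].
    real_term = F.logsigmoid(D(real_batch)).mean()
    fake_term = F.logsigmoid(-D(G(noise))).mean()  # log(1 - sigmoid(s)) = logsigmoid(-s)
    return real_term + fake_term

def approx_extremum(model, objective, maximize, steps=10, lr=1e-3, lam=0.0):
    # Approximate the inner max/min with a few SGD steps on a copy of `model`.
    # With lam > 0, a proximity penalty lam * ||theta' - theta||^2 is added,
    # giving a proximal-style variant of the inner problem.
    trial = copy.deepcopy(model)
    opt = torch.optim.SGD(trial.parameters(), lr=lr)
    for _ in range(steps):
        value = objective(trial)
        loss = -value if maximize else value
        if lam > 0.0:
            loss = loss + lam * sum(
                ((p - q.detach()) ** 2).sum()
                for p, q in zip(trial.parameters(), model.parameters()))
        opt.zero_grad()
        loss.backward()  # note: gradients also accumulate in the fixed network;
        opt.step()       # a real training loop should zero them afterwards
    return trial

def duality_gap(G, D, real_batch, noise, lam=0.0):
    # DG(G, D) = max_{D'} V(G, D') - min_{G'} V(G', D); lam > 0 yields a
    # proximal-style estimate that does not presuppose a Nash equilibrium.
    D_best = approx_extremum(D, lambda d: gan_value(G, d, real_batch, noise),
                             maximize=True, lam=lam)
    G_best = approx_extremum(G, lambda g: gan_value(g, D, real_batch, noise),
                             maximize=False, lam=lam)
    with torch.no_grad():
        return (gan_value(G, D_best, real_batch, noise)
                - gan_value(G_best, D, real_batch, noise)).item()
```

In practice, the resulting scalar can be logged alongside the usual generator and discriminator losses as a domain-agnostic progress signal, with a gap near zero indicating proximity to an equilibrium.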
Related papers
- On the Convergence of (Stochastic) Gradient Descent for Kolmogorov--Arnold Networks [56.78271181959529]
Kolmogorov--Arnold Networks (KANs) have gained significant attention in the deep learning community.
Empirical investigations demonstrate that KANs optimized via stochastic gradient descent (SGD) are capable of achieving near-zero training loss.
arXiv Detail & Related papers (2024-10-10T15:34:10Z) - Conjugate Gradient Method for Generative Adversarial Networks [0.0]
It is not feasible to directly compute the Jensen-Shannon divergence between the data density and the model density of deep neural networks.
Generative adversarial networks (GANs) can be used to formulate this problem as a discriminative problem with two models, a generator and a discriminator.
We propose to apply the conjugate gradient method to solve the local Nash equilibrium problem in GANs.
arXiv Detail & Related papers (2022-03-28T04:44:45Z) - On the Nash equilibrium of moment-matching GANs for stationary Gaussian
processes [2.25477613430341]
We show that the existence of consistent Nash equilibrium depends crucially on the choice of the discriminator family.
We further study the local stability and global convergence of gradient descent-ascent methods towards consistent equilibrium.
arXiv Detail & Related papers (2022-03-14T14:30:23Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high-dimensional robust statistics problems.
Our work extends these results to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
A convergence rate analysis of the mean field Langevin dynamics is presented.
The proximal Gibbs distribution $p_q$ associated with the dynamics allows us to develop a convergence theory parallel to classical results in convex optimization.
arXiv Detail & Related papers (2022-01-25T17:13:56Z) - Generalizing Graph Neural Networks on Out-Of-Distribution Graphs [51.33152272781324]
Graph Neural Networks (GNNs) are typically proposed without considering the distribution shifts between training and testing graphs.
In such a setting, GNNs tend to exploit subtle statistical correlations in the training set for prediction, even when these correlations are spurious.
We propose a general causal representation framework, called StableGNN, to eliminate the impact of spurious correlations.
arXiv Detail & Related papers (2021-11-20T18:57:18Z) - Non-Asymptotic Error Bounds for Bidirectional GANs [10.62911757343557]
We derive nearly sharp bounds for the bidirectional GAN (BiGAN) estimation error under the Dudley distance.
This is the first theoretical guarantee for the bidirectional GAN learning approach.
arXiv Detail & Related papers (2021-10-24T00:12:03Z) - On Duality Gap as a Measure for Monitoring GAN Training [2.733700237741334]
Generative adversarial network (GAN) is among the most popular deep learning models for learning complex data distributions.
This paper presents a theoretical understanding of this limitation and proposes a more dependable estimation process for the duality gap.
arXiv Detail & Related papers (2020-12-12T04:32:52Z) - Generalization Properties of Optimal Transport GANs with Latent
Distribution Learning [52.25145141639159]
We study how the interplay between the latent distribution and the complexity of the pushforward map affects performance.
Motivated by our analysis, we advocate learning the latent distribution as well as the pushforward map within the GAN paradigm.
arXiv Detail & Related papers (2020-07-29T07:31:33Z) - Cumulant GAN [17.4556035872983]
We propose a novel loss function for training Generative Adversarial Networks (GANs).
We show that the corresponding optimization problem is equivalent to Rényi divergence minimization.
We experimentally demonstrate that image generation is more robust than with Wasserstein GAN.
arXiv Detail & Related papers (2020-06-11T17:23:02Z) - Global Distance-distributions Separation for Unsupervised Person
Re-identification [93.39253443415392]
Existing unsupervised ReID approaches often fail to correctly identify positive and negative samples through distance-based matching/ranking.
We introduce a global distance-distributions separation constraint over the two distributions to encourage the clear separation of positive and negative samples from a global view.
We show that our method leads to significant improvement over the baselines and achieves the state-of-the-art performance.
arXiv Detail & Related papers (2020-06-01T07:05:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.