Combating Mode Collapse in GAN training: An Empirical Analysis using
Hessian Eigenvalues
- URL: http://arxiv.org/abs/2012.09673v1
- Date: Thu, 17 Dec 2020 15:40:27 GMT
- Title: Combating Mode Collapse in GAN training: An Empirical Analysis using
Hessian Eigenvalues
- Authors: Ricard Durall, Avraam Chatzimichailidis, Peter Labus and Janis Keuper
- Abstract summary: Generative adversarial networks (GAN) provide state-of-the-art results in image generation.
Despite being so powerful, GANs remain very challenging to train, and mode collapse is one of the hardest failure modes to overcome.
We show that mode collapse is related to the convergence towards sharp minima.
In particular, we observe how the eigenvalues of the generator $G$ are directly correlated with the occurrence of mode collapse.
- Score: 4.779196219827507
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative adversarial networks (GANs) provide state-of-the-art results in
image generation. However, despite being so powerful, they still remain very
challenging to train. This is in particular caused by their highly non-convex
optimization space leading to a number of instabilities. Among them, mode
collapse stands out as one of the most daunting ones. This undesirable event
occurs when the model can only fit a few modes of the data distribution, while
ignoring the majority of them. In this work, we combat mode collapse using
second-order gradient information. To do so, we analyse the loss surface
through its Hessian eigenvalues, and show that mode collapse is related to the
convergence towards sharp minima. In particular, we observe how the eigenvalues
of the generator $G$ are directly correlated with the occurrence of mode collapse.
Finally, motivated by these findings, we design a new optimization algorithm
called nudged-Adam (NuGAN) that uses spectral information to overcome mode
collapse, leading to empirically more stable convergence properties.
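The core diagnostic in the abstract, estimating the sharpness of the loss surface from the largest Hessian eigenvalues, can be sketched with power iteration over Hessian-vector products. This is a minimal illustration, not the paper's NuGAN implementation: the function name, the finite-difference HVP, and the toy quadratic loss are all assumptions for demonstration.

```python
import numpy as np

def top_hessian_eigenvalue(grad_fn, w, n_iters=100, eps=1e-5, seed=0):
    """Estimate the largest Hessian eigenvalue of a loss at point w via
    power iteration. Hessian-vector products are approximated with
    finite differences of the gradient:
        H v ~= (g(w + eps*v) - g(w - eps*v)) / (2*eps)
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(n_iters):
        hv = (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2 * eps)
        lam = float(v @ hv)          # Rayleigh quotient estimate
        v = hv / np.linalg.norm(hv)  # re-normalize for the next step
    return lam

# Toy quadratic loss 0.5 * w^T A w with one "sharp" direction
# (curvature 10) and one "flat" direction (curvature 0.1).
# Its gradient is simply A @ w.
A = np.diag([10.0, 0.1])
sharpness = top_hessian_eigenvalue(lambda w: A @ w, np.zeros(2))
print(round(sharpness, 2))  # -> 10.0, the sharpest curvature direction
```

A large top eigenvalue flags a sharp minimum; in the paper's analysis, such sharpness in the generator's spectrum correlates with mode collapse.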
Related papers
- A Contrastive Variational Graph Auto-Encoder for Node Clustering [10.52321770126932]
State-of-the-art clustering methods have numerous challenges.
Existing VGAEs do not account for the discrepancy between the inference and generative models.
Our solution has two mechanisms to control the trade-off between Feature Randomness and Feature Drift.
arXiv Detail & Related papers (2023-12-28T05:07:57Z)
- Smoothly Giving up: Robustness for Simple Models [30.56684535186692]
Examples of algorithms to train such models include logistic regression and boosting.
We use a tunable family of joint convex loss functions, which interpolates between canonical convex loss functions, to robustly train such models.
We also provide results for boosting and logistic regression on a COVID-19 dataset, highlighting the efficacy of the approach across multiple relevant domains.
arXiv Detail & Related papers (2023-02-17T19:48:11Z)
- Towards Practical Control of Singular Values of Convolutional Layers [65.25070864775793]
Convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control.
Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties.
We offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity.
arXiv Detail & Related papers (2022-11-24T19:09:44Z)
- Instability and Local Minima in GAN Training with Kernel Discriminators [20.362912591032636]
Generative Adversarial Networks (GANs) are a widely-used tool for generative modeling of complex data.
Despite their empirical success, the training of GANs is not fully understood due to the min-max optimization of the generator and discriminator.
This paper analyzes these joint dynamics when the true samples, as well as the generated samples, are discrete, finite sets, and the discriminator is kernel-based.
arXiv Detail & Related papers (2022-08-21T18:03:06Z)
- ModeRNN: Harnessing Spatiotemporal Mode Collapse in Unsupervised Predictive Learning [75.2748374360642]
We propose ModeRNN, which introduces a novel method to learn hidden structured representations between recurrent states.
Across the entire dataset, different modes result in different responses on the mixtures of slots, which enhances the ability of ModeRNN to build structured representations.
arXiv Detail & Related papers (2021-10-08T03:47:54Z)
- Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study [116.05514467222544]
Generative adversarial networks (GANs) nowadays are capable of producing images of incredible realism.
One concern raised is whether the state-of-the-art GAN's learned distribution still suffers from mode collapse.
This paper explores diagnosing and calibrating GAN intra-mode collapse in a novel black-box setting.
arXiv Detail & Related papers (2021-07-23T06:03:55Z)
- Hard-label Manifolds: Unexpected Advantages of Query Efficiency for Finding On-manifold Adversarial Examples [67.23103682776049]
Recent zeroth order hard-label attacks on image classification models have shown comparable performance to their first-order, gradient-level alternatives.
It was recently shown in the gradient-level setting that regular adversarial examples leave the data manifold, while their on-manifold counterparts are in fact generalization errors.
We propose an information-theoretic argument based on a noisy manifold distance oracle, which leaks manifold information through the adversary's gradient estimate.
arXiv Detail & Related papers (2021-03-04T20:53:06Z)
- Preventing Posterior Collapse with Levenshtein Variational Autoencoder [61.30283661804425]
We propose to replace the evidence lower bound (ELBO) with a new objective which is simple to optimize and prevents posterior collapse.
We show that Levenshtein VAE produces more informative latent representations than alternative approaches to preventing posterior collapse.
arXiv Detail & Related papers (2020-04-30T13:27:26Z)
- Simple and Effective Prevention of Mode Collapse in Deep One-Class Classification [93.2334223970488]
We propose two regularizers to prevent hypersphere collapse in deep SVDD.
The first regularizer is based on injecting random noise via the standard cross-entropy loss.
The second regularizer penalizes the minibatch variance when it becomes too small.
arXiv Detail & Related papers (2020-01-24T03:44:47Z)
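The second regularizer described in the entry above, penalizing a minibatch whose embedding variance becomes too small, can be sketched as follows. The threshold `nu` and the hinge form are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def variance_penalty(embeddings, nu=0.1):
    """Minibatch-variance regularizer sketch: penalize a batch of
    embeddings (shape [batch, dim]) when the mean per-dimension
    variance drops below `nu`, discouraging all points from
    collapsing onto a single center."""
    var = embeddings.var(axis=0).mean()
    return float(max(0.0, nu - var))

# A collapsed batch (all identical embeddings) receives the full penalty ...
collapsed = np.ones((32, 8))
print(variance_penalty(collapsed))  # -> 0.1

# ... while a well-spread batch receives none.
spread = np.random.default_rng(0).standard_normal((32, 8))
print(variance_penalty(spread))     # -> 0.0
```

In training, such a term would be added to the one-class objective with a weighting coefficient, so the hinge only activates when collapse is imminent.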
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.