Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad
Samples
- URL: http://arxiv.org/abs/2002.06224v4
- Date: Thu, 22 Oct 2020 21:42:09 GMT
- Title: Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad
Samples
- Authors: Samarth Sinha, Zhengli Zhao, Anirudh Goyal, Colin Raffel, Augustus
Odena
- Abstract summary: We introduce a simple (one line of code) modification to the Generative Adversarial Network (GAN) training algorithm.
When updating the generator parameters, we zero out the gradient contributions from the elements of the batch that the critic scores as `least realistic'.
We show that this `top-k update' procedure is a generally applicable improvement.
- Score: 67.11669996924671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a simple (one line of code) modification to the Generative
Adversarial Network (GAN) training algorithm that materially improves results
with no increase in computational cost: When updating the generator parameters,
we simply zero out the gradient contributions from the elements of the batch
that the critic scores as `least realistic'. Through experiments on many
different GAN variants, we show that this `top-k update' procedure is a
generally applicable improvement. In order to understand the nature of the
improvement, we conduct extensive analysis on a simple mixture-of-Gaussians
dataset and discover several interesting phenomena. Among these is that, when
gradient updates are computed using the worst-scoring batch elements, samples
can actually be pushed further away from their nearest mode. We also apply our
method to recent GAN variants and improve state-of-the-art FID for conditional
generation from 9.21 to 8.57 on CIFAR-10.
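The "one line of code" change amounts to ranking a batch of generated samples by the critic's realism score and letting only the top-k of them contribute to the generator's gradient. Below is a minimal PyTorch sketch of one such generator step; the `generator`, `critic`, optimizer, latent dimension, and the non-saturating loss are illustrative assumptions rather than the exact setup used in the paper.

```python
import torch
import torch.nn.functional as F

def topk_generator_step(generator, critic, opt_g, batch_size, latent_dim, k, device="cpu"):
    """One generator update with the top-k trick: only the k fake samples
    the critic scores as most realistic contribute gradient to the generator."""
    z = torch.randn(batch_size, latent_dim, device=device)
    fake = generator(z)
    scores = critic(fake).view(-1)  # higher score = judged more realistic

    # Keep the k best-scored samples; the rest contribute zero gradient,
    # which is equivalent to zeroing out their gradient contributions.
    topk_scores, _ = torch.topk(scores, k)

    # Non-saturating generator loss on the retained samples only
    # (assumed loss; the paper applies the trick across several GAN variants).
    loss = F.binary_cross_entropy_with_logits(
        topk_scores, torch.ones_like(topk_scores)
    )

    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()
```

In the full method, k is gradually annealed from the full batch size down to a fraction of it over the course of training; the sketch above keeps k fixed for brevity.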
Related papers
- Exact, Tractable Gauss-Newton Optimization in Deep Reversible Architectures Reveal Poor Generalization [52.16435732772263]
Second-order optimization has been shown to accelerate the training of deep neural networks in many applications.
However, generalization properties of second-order methods are still being debated.
We show for the first time that exact Gauss-Newton (GN) updates take on a tractable form in a class of deep architectures.
arXiv Detail & Related papers (2024-11-12T17:58:40Z) - SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle to learn from highly diverse data whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt)
arXiv Detail & Related papers (2023-11-30T03:05:14Z) - Linear Speedup of Incremental Aggregated Gradient Methods on Streaming
Data [38.54333970135826]
This paper considers a type of incremental aggregated gradient (IAG) method for large-scale distributed optimization.
We show that the streaming IAG method achieves linear speedup when the workers are updating frequently enough.
arXiv Detail & Related papers (2023-09-10T10:08:52Z) - ScoreMix: A Scalable Augmentation Strategy for Training GANs with
Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z) - When are Iterative Gaussian Processes Reliably Accurate? [38.523693700243975]
Lanczos decompositions have achieved scalable Gaussian process inference with highly accurate point predictions.
We investigate CG tolerance, preconditioner rank, and Lanczos decomposition rank.
We show that LGS-BFB is a compelling choice for Iterative GPs, achieving convergence with fewer updates.
arXiv Detail & Related papers (2021-12-31T00:02:18Z) - TaylorGAN: Neighbor-Augmented Policy Update for Sample-Efficient Natural
Language Generation [79.4205462326301]
TaylorGAN is a novel approach to score function-based natural language generation.
It augments the gradient estimation with off-policy updates and a first-order Taylor expansion.
It enables us to train NLG models from scratch with a smaller batch size.
arXiv Detail & Related papers (2020-11-27T02:26:15Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of scheme variations can be covered by the unified extrapolation framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)