Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples
- URL: http://arxiv.org/abs/2002.06224v4
- Date: Thu, 22 Oct 2020 21:42:09 GMT
- Title: Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples
- Authors: Samarth Sinha, Zhengli Zhao, Anirudh Goyal, Colin Raffel, Augustus Odena
- Abstract summary: We introduce a simple (one line of code) modification to the Generative Adversarial Network (GAN) training algorithm.
When updating the generator parameters, we zero out the gradient contributions from the elements of the batch that the critic scores as `least realistic'.
We show that this `top-k update' procedure is a generally applicable improvement.
- Score: 67.11669996924671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a simple (one line of code) modification to the Generative
Adversarial Network (GAN) training algorithm that materially improves results
with no increase in computational cost: When updating the generator parameters,
we simply zero out the gradient contributions from the elements of the batch
that the critic scores as `least realistic'. Through experiments on many
different GAN variants, we show that this `top-k update' procedure is a
generally applicable improvement. In order to understand the nature of the
improvement, we conduct extensive analysis on a simple mixture-of-Gaussians
dataset and discover several interesting phenomena. Among these is that, when
gradient updates are computed using the worst-scoring batch elements, samples
can actually be pushed further away from their nearest mode. We also apply our
method to recent GAN variants and improve state-of-the-art FID for conditional
generation from 9.21 to 8.57 on CIFAR-10.
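As a rough illustration of what that one-line change looks like in a generator update, here is a minimal PyTorch-style sketch. It assumes a non-saturating GAN loss and a critic that outputs realism logits; the names `generator`, `critic`, and `k` are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def generator_topk_step(generator, critic, opt_g, batch_size, latent_dim, k):
    """One generator update using only the top-k most realistic samples."""
    z = torch.randn(batch_size, latent_dim)
    fake = generator(z)
    scores = critic(fake).squeeze(-1)        # higher score = judged more realistic
    topk_scores, _ = torch.topk(scores, k)   # keep the k best-scoring batch elements
    # Non-saturating generator loss computed on the top-k elements only; the
    # remaining elements contribute zero gradient, matching the abstract's
    # description of zeroing out the 'least realistic' contributions.
    loss = F.binary_cross_entropy_with_logits(
        topk_scores, torch.ones_like(topk_scores))
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()
```

With k equal to the batch size this reduces to the usual generator step, so the modification only changes which samples are allowed to contribute gradient.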
Related papers
- GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model [69.71629949747884]
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data.
In this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples.
arXiv Detail & Related papers (2024-01-11T16:43:16Z)
- SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
- Linear Speedup of Incremental Aggregated Gradient Methods on Streaming Data [38.54333970135826]
This paper considers a type of incremental aggregated gradient (IAG) method for large-scale distributed optimization.
We show that the streaming IAG method achieves linear speedup when the workers are updating frequently enough.
arXiv Detail & Related papers (2023-09-10T10:08:52Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- When are Iterative Gaussian Processes Reliably Accurate? [38.523693700243975]
Lanczos decompositions have achieved scalable Gaussian process inference with highly accurate point predictions.
We investigate CG tolerance, preconditioner rank, and Lanczos decomposition rank.
We show that LGS-BFB is a compelling choice for Iterative GPs, achieving convergence with fewer updates.
arXiv Detail & Related papers (2021-12-31T00:02:18Z)
- TaylorGAN: Neighbor-Augmented Policy Update for Sample-Efficient Natural Language Generation [79.4205462326301]
TaylorGAN is a novel approach to score function-based natural language generation.
It augments the gradient estimation by off-policy update and the first-order Taylor expansion.
It enables us to train NLG models from scratch with a smaller batch size.
arXiv Detail & Related papers (2020-11-27T02:26:15Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)