Stabilizing Training of Generative Adversarial Nets via Langevin Stein
Variational Gradient Descent
- URL: http://arxiv.org/abs/2004.10495v1
- Date: Wed, 22 Apr 2020 11:20:04 GMT
- Title: Stabilizing Training of Generative Adversarial Nets via Langevin Stein
Variational Gradient Descent
- Authors: Dong Wang, Xiaoqian Qin, Fengyi Song, Li Cheng
- Abstract summary: We propose to stabilize GAN training via a novel particle-based variational inference method -- Langevin Stein variational gradient descent (LSVGD).
We show that the LSVGD dynamics carries an implicit regularization that enhances the spread and diversity of the particles.
- Score: 11.329376606876101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative adversarial networks (GANs), well known for their ability to learn
complex underlying data distributions, are nonetheless notoriously difficult to train,
often suffering from mode collapse or performance deterioration. Current approaches to
these issues mostly rely on practical training heuristics for regularization, which in
turn undermine the convergence guarantees and theoretical soundness of GANs. In this
paper, we propose to stabilize GAN training via a novel particle-based variational
inference method -- Langevin Stein variational gradient descent (LSVGD) -- which inherits
the flexibility and efficiency of the original SVGD while addressing its instability
issues by incorporating an extra disturbance into the update dynamics. We further
demonstrate that, with a properly chosen noise variance, LSVGD simulates a Langevin
process whose stationary distribution is exactly the target distribution. We also show
that the LSVGD dynamics carries an implicit regularization that enhances the spread and
diversity of the particles. Finally, we present an efficient way of applying
particle-based variational inference to a general GAN training procedure, regardless of
the loss function adopted. Experimental results on one synthetic dataset and three
popular benchmark datasets -- CIFAR-10, Tiny-ImageNet and CelebA -- validate that LSVGD
can remarkably improve the performance and stability of various GAN models.
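The listing contains no code; as a rough illustration of the kind of update the abstract describes, below is a minimal NumPy sketch of a standard SVGD step with an added Gaussian disturbance whose scale is tied to the step size. The RBF kernel, bandwidth, step size, and noise scaling are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def rbf_kernel(x, h):
    """RBF kernel matrix K[j, i] = k(x_j, x_i) and its gradient w.r.t. x_j."""
    diff = x[:, None, :] - x[None, :, :]          # diff[j, i] = x_j - x_i, shape (n, n, d)
    sq_dists = np.sum(diff ** 2, axis=-1)         # (n, n)
    K = np.exp(-sq_dists / (2.0 * h ** 2))        # kernel matrix
    grad_K = -diff / h ** 2 * K[:, :, None]       # grad_{x_j} k(x_j, x_i), shape (n, n, d)
    return K, grad_K

def lsvgd_step(x, grad_logp, step=1e-2, h=1.0, noise_scale=None, rng=None):
    """One SVGD update plus a Gaussian disturbance (LSVGD-style sketch).

    x: (n, d) particles; grad_logp: callable returning grad log p(x), shape (n, d).
    The noise scaling sqrt(2 * step) is an assumption; the paper derives how the
    noise variance should relate to the step size.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    K, grad_K = rbf_kernel(x, h)
    # Standard SVGD direction: kernel-smoothed score plus a repulsive term.
    phi = (K @ grad_logp(x) + grad_K.sum(axis=0)) / n
    scale = np.sqrt(2.0 * step) if noise_scale is None else noise_scale
    return x + step * phi + scale * rng.standard_normal(x.shape)

# Example: particles targeting a standard 2-D Gaussian, where grad log p(x) = -x.
particles = np.random.default_rng(0).standard_normal((100, 2)) * 3.0
for _ in range(500):
    particles = lsvgd_step(particles, grad_logp=lambda x: -x, step=5e-2)
```

The disturbance term is what distinguishes this sketch from plain SVGD; in the paper it is what lets the dynamics be interpreted as a Langevin process with the target distribution as its stationary distribution.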
Related papers
- Improving Data-aware and Parameter-aware Robustness for Continual Learning [3.480626767752489]
The paper's analysis shows that the insufficient robustness arises from the ineffective handling of outliers.
We propose a Robust Continual Learning (RCL) method to address this issue.
The proposed method effectively maintains robustness and achieves new state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2024-05-27T11:21:26Z)
- FedDIP: Federated Learning with Extreme Dynamic Pruning and Incremental Regularization [5.182014186927254]
Federated Learning (FL) has been successfully adopted for distributed training and inference of large-scale Deep Neural Networks (DNNs).
We contribute a novel FL framework (coined FedDIP) which combines dynamic model pruning with error feedback to eliminate redundant information exchange; a generic sketch of pruning with error feedback is given below.
We provide a convergence analysis of FedDIP and report a comprehensive performance and comparative assessment against state-of-the-art methods.
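The summary only names the technique, so the following is an assumption-laden NumPy sketch of the general error-feedback pruning pattern (dropped weight mass is accumulated and re-injected before the next pruning round), not FedDIP's actual pruning schedule or communication protocol. The function name, sparsity level, and the stand-in training update are all hypothetical.

```python
import numpy as np

def prune_with_error_feedback(weights, error, sparsity):
    """Magnitude pruning with error feedback (generic pattern, not FedDIP itself)."""
    corrected = weights + error                       # re-inject past pruning error
    k = int(sparsity * corrected.size)                # number of entries to zero out
    if k > 0:
        threshold = np.partition(np.abs(corrected).ravel(), k - 1)[k - 1]
        mask = np.abs(corrected) > threshold          # keep the largest-magnitude entries
    else:
        mask = np.ones_like(corrected, dtype=bool)
    pruned = corrected * mask
    return pruned, corrected - pruned                 # pruned weights, new error

# Example: repeatedly prune 90% of a weight matrix across rounds.
rng = np.random.default_rng(0)
w, err = rng.standard_normal((64, 64)), np.zeros((64, 64))
for _ in range(5):
    w = w - 0.01 * rng.standard_normal(w.shape)       # stand-in for a local training step
    w, err = prune_with_error_feedback(w, err, sparsity=0.9)
```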
arXiv Detail & Related papers (2023-09-13T08:51:19Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs suffer from training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs and improve the stability of the training process; a rough sketch of an implicit (proximal) gradient step follows below.
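As a hedged illustration of the general idea (not the paper's actual solver for PINNs), the sketch below takes one implicit/proximal SGD step by approximately minimizing L(w) + ||w - theta||^2 / (2 * lr) with a few inner gradient iterations; the inner-loop settings and the grad_loss callback are assumptions.

```python
import numpy as np

def implicit_sgd_step(theta, grad_loss, lr=0.5, inner_steps=50, inner_lr=0.05):
    """One implicit (proximal) gradient step:
        theta_new ~= argmin_w  L(w) + ||w - theta||^2 / (2 * lr),
    whose optimality condition is theta_new = theta - lr * grad L(theta_new).
    The inner loop and its hyperparameters are illustrative assumptions."""
    w = theta.copy()
    for _ in range(inner_steps):
        g = grad_loss(w) + (w - theta) / lr   # gradient of the proximal objective
        w = w - inner_lr * g
    return w

# Example on a toy quadratic loss L(w) = 0.5 * ||w||^2 (grad L(w) = w), where the
# exact implicit step is theta / (1 + lr).
theta = np.array([1.0, -2.0])
theta = implicit_sgd_step(theta, grad_loss=lambda w: w, lr=0.5)
```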
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Improving GANs with A Dynamic Discriminator [106.54552336711997]
We argue that a discriminator whose capacity is adjusted on the fly can better accommodate the time-varying task of GAN training.
A comprehensive empirical study confirms that the proposed training strategy, termed DynamicD, improves the synthesis performance without incurring any additional cost or training objectives.
arXiv Detail & Related papers (2022-09-20T17:57:33Z)
- FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs [24.18718734850797]
Data-Efficient GANs (DE-GANs) aim to learn generative models with a limited amount of training data.
Contrastive learning has shown great potential for increasing the synthesis quality of DE-GANs.
We propose FakeCLR, which applies contrastive learning only to fake samples; a generic sketch of such a loss is shown below.
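As a rough sketch of what "contrastive learning on fake samples only" can look like, the snippet below computes an InfoNCE-style loss between embeddings of two augmented views of generated images. FakeCLR's exact objective, augmentations, and temperature may differ; the stand-in embeddings here are purely illustrative.

```python
import numpy as np

def info_nce_on_fakes(z1, z2, temperature=0.1):
    """InfoNCE loss between two views of generated (fake) samples.
    z1, z2: (n, d) embeddings, assumed L2-normalized; positives are matching rows."""
    n = z1.shape[0]
    logits = z1 @ z2.T / temperature                       # (n, n) similarities
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(n), np.arange(n)].mean()    # positives on the diagonal

# Example: embeddings of two augmented views of the same batch of fake images.
rng = np.random.default_rng(0)
z = rng.standard_normal((32, 128))
z1 = z + 0.05 * rng.standard_normal(z.shape)               # stand-in for view 1
z2 = z + 0.05 * rng.standard_normal(z.shape)               # stand-in for view 2
z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
loss = info_nce_on_fakes(z1, z2)
```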
arXiv Detail & Related papers (2022-07-18T14:23:38Z)
- Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework based on a simple yet effective technique, FeatDistLoss.
Experimental results show that our model sets a new state of the art across various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training; a rough Heun-step sketch is given below.
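As a hedged illustration of the general idea (integrating the GAN training dynamics with a higher-order solver), the sketch below applies one Heun (second-order Runge-Kutta) step to a user-supplied joint generator/discriminator vector field. The function names, the bilinear toy game, and the step size are hypothetical; the paper's exact solver and regularisation may differ.

```python
def gan_vector_field(g, d, grad_g_loss, grad_d_loss):
    """Joint vector field of simultaneous gradient flow: each player descends
    its own loss. grad_*_loss are user-supplied callables (assumed)."""
    return -grad_g_loss(g, d), -grad_d_loss(g, d)

def heun_step(g, d, field, h=0.01):
    """One Heun (explicit second-order Runge-Kutta) step on the pair (g, d)."""
    dg1, dd1 = field(g, d)                        # slope at the current point
    dg2, dd2 = field(g + h * dg1, d + h * dd1)    # slope at the Euler predictor
    return g + h * (dg1 + dg2) / 2.0, d + h * (dd1 + dd2) / 2.0

# Example: Dirac-GAN-style bilinear game with losses L_G = g * d and L_D = -g * d.
field = lambda g, d: gan_vector_field(
    g, d,
    grad_g_loss=lambda g, d: d,     # dL_G/dg
    grad_d_loss=lambda g, d: -g,    # dL_D/dd
)
g, d = 1.0, 1.0
for _ in range(1000):
    g, d = heun_step(g, d, field, h=0.05)
```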
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
- Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs).
In addition, such a transform can be straightforwardly employed in the gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) tests on discrete distributions.
arXiv Detail & Related papers (2020-03-01T22:45:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.