Training Generative Adversarial Networks by Solving Ordinary
Differential Equations
- URL: http://arxiv.org/abs/2010.15040v2
- Date: Sat, 28 Nov 2020 16:07:22 GMT
- Title: Training Generative Adversarial Networks by Solving Ordinary
Differential Equations
- Authors: Chongli Qin, Yan Wu, Jost Tobias Springenberg, Andrew Brock, Jeff
Donahue, Timothy P. Lillicrap, Pushmeet Kohli
- Abstract summary: We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
- Score: 54.23691425062034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The instability of Generative Adversarial Network (GAN) training has
frequently been attributed to gradient descent. Consequently, recent methods
have aimed to tailor the models and training procedures to stabilise the
discrete updates. In contrast, we study the continuous-time dynamics induced by
GAN training. Both theory and toy experiments suggest that these dynamics are
in fact surprisingly stable. From this perspective, we hypothesise that
instabilities in training GANs arise from the integration error in discretising
the continuous dynamics. We experimentally verify that well-known ODE solvers
(such as Runge-Kutta) can stabilise training - when combined with a regulariser
that controls the integration error. Our approach represents a radical
departure from previous methods which typically use adaptive optimisation and
stabilisation techniques that constrain the functional space (e.g. Spectral
Normalisation). Evaluation on CIFAR-10 and ImageNet shows that our method
outperforms several strong baselines, demonstrating its efficacy.
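The core argument can be illustrated on the kind of toy problem the abstract alludes to. The sketch below is an illustrative example, not the authors' released code: it uses a bilinear min-max game whose continuous-time gradient flow is a stable rotation, shows that plain simultaneous gradient descent/ascent (Euler integration) spirals away from the orbit, and that a higher-order solver such as classical RK4 tracks it closely. The game, step size, and iteration count are arbitrary choices for illustration, and the paper's integration-error regulariser is omitted.

```python
# Toy illustration (not the authors' code): bilinear min-max game
#   L(theta, psi) = theta * psi,  generator minimises, discriminator maximises.
# The continuous-time gradient flow is a pure rotation with conserved radius,
# so the dynamics are stable; Euler discretisation spirals outward, while RK4
# tracks the orbit closely.
import math

def vector_field(theta, psi):
    """Gradient flow: d(theta)/dt = -dL/d(theta) = -psi, d(psi)/dt = +dL/d(psi) = theta."""
    return -psi, theta

def euler_step(theta, psi, h):
    d_theta, d_psi = vector_field(theta, psi)
    return theta + h * d_theta, psi + h * d_psi

def rk4_step(theta, psi, h):
    k1 = vector_field(theta, psi)
    k2 = vector_field(theta + 0.5 * h * k1[0], psi + 0.5 * h * k1[1])
    k3 = vector_field(theta + 0.5 * h * k2[0], psi + 0.5 * h * k2[1])
    k4 = vector_field(theta + h * k3[0], psi + h * k3[1])
    return (theta + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            psi + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

theta_e = psi_e = theta_r = psi_r = 1.0
for _ in range(500):
    theta_e, psi_e = euler_step(theta_e, psi_e, h=0.1)
    theta_r, psi_r = rk4_step(theta_r, psi_r, h=0.1)

# The continuous dynamics conserve theta^2 + psi^2 = 2 (radius sqrt(2) ~ 1.41).
print("Euler radius:", math.hypot(theta_e, psi_e))  # drifts far from the orbit
print("RK4 radius:  ", math.hypot(theta_r, psi_r))  # stays close to sqrt(2)
```

In the paper's full method the same idea is applied to the generator and discriminator parameters of a real GAN, combined with a regulariser that keeps the integration error of the chosen solver small.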
Related papers
- Diffusing States and Matching Scores: A New Framework for Imitation Learning [16.941612670582522]
Adversarial Imitation Learning is traditionally framed as a two-player zero-sum game between a learner and an adversarially chosen cost function.
In recent years, diffusion models have emerged as a non-adversarial alternative to GANs.
We show our approach outperforms GAN-style imitation learning baselines across various continuous control problems.
arXiv Detail & Related papers (2024-10-17T17:59:25Z)
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}(\ln(T) / T^{1 - \frac{1}{\alpha}})$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
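As a rough sketch of what a federated AdaGrad-style update can look like, the toy example below assumes over-the-air aggregation can be modelled as a noisy average of client gradients; the client objectives, channel-noise level, and learning rate are made up for illustration, and this is not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy client objectives f_i(w) = 0.5 * ||w - c_i||^2; the global optimum is mean(c_i).
num_clients, dim = 5, 3
centers = rng.normal(size=(num_clients, dim))

w = np.zeros(dim)
accum = np.zeros(dim)                 # AdaGrad accumulator of squared gradients
lr, eps, channel_noise = 0.5, 1e-8, 0.01

for t in range(200):
    grads = np.stack([w - centers[i] for i in range(num_clients)])
    # Over-the-air aggregation modelled here (assumption) as an average
    # corrupted by additive channel noise.
    agg = grads.mean(axis=0) + channel_noise * rng.normal(size=dim)
    accum += agg ** 2
    w -= lr * agg / (np.sqrt(accum) + eps)  # adaptive (AdaGrad-style) step

print("final iterate: ", w)
print("global optimum:", centers.mean(axis=0))
```

The convergence rate quoted above comes from the authors' analysis of such schemes; this toy only illustrates the shape of the update.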
- Byzantine-Robust Decentralized Stochastic Optimization with Stochastic Gradient Noise-Independent Learning Error [25.15075119957447]
We study Byzantine-robust optimization over a decentralized network, where every agent periodically communicates with its neighbors to exchange local models and then updates its own local model by stochastic gradient descent (SGD).
The performance of such a method is affected by an unknown number of Byzantine agents, which behave adversarially during the optimization process.
arXiv Detail & Related papers (2023-08-10T02:14:23Z)
- Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior [51.60683890503293]
We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling.
We show that pure supervised cloning can generate trajectories matching the per-time step distribution of arbitrary expert trajectories.
arXiv Detail & Related papers (2023-07-27T04:27:26Z)
- Controlled Descent Training [0.0]
A novel model-based artificial neural network (ANN) training method, supported by optimal control theory, is developed.
The method augments training labels in order to robustly guarantee convergence of the training loss and to improve the training convergence rate.
The applicability of the method is demonstrated on standard regression and classification problems.
arXiv Detail & Related papers (2023-03-16T10:45:24Z)
- Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization [58.720142291102135]
Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of chaotic dynamical systems.
In the absence of mitigating techniques, this approach can result in artificially rapid error growth, leading to inaccurate predictions and/or climate instability.
We introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training.
arXiv Detail & Related papers (2022-11-09T23:40:52Z)
- Guaranteed Conservation of Momentum for Learning Particle-based Fluid Dynamics [96.9177297872723]
We present a novel method for guaranteeing conservation of linear momentum in learned physics simulations.
We enforce conservation of momentum with a hard constraint, which we realize via antisymmetrical continuous convolutional layers.
In combination, the proposed method allows us to increase the physical accuracy of the learned simulator substantially.
arXiv Detail & Related papers (2022-10-12T09:12:59Z)
- Learning Unstable Dynamics with One Minute of Data: A Differentiation-based Gaussian Process Approach [47.045588297201434]
We show how to exploit the differentiability of Gaussian processes to create a state-dependent linearized approximation of the true continuous dynamics.
We validate our approach by iteratively learning the system dynamics of an unstable system such as a 9-D segway.
arXiv Detail & Related papers (2021-03-08T05:08:47Z)
- Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks [85.94999581306827]
Training neural networks with binary weights and activations is a challenging problem due to the lack of gradients and difficulty of optimization over discrete weights.
Many successful experimental results have been achieved with empirical straight-through (ST) approaches.
At the same time, ST methods can in fact be derived as principled estimators in the stochastic binary network (SBN) model with Bernoulli weights.
arXiv Detail & Related papers (2020-06-11T23:58:18Z)
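For readers unfamiliar with straight-through (ST) estimation, here is a minimal, generic PyTorch sketch: the forward pass samples a Bernoulli activation, while the backward pass uses the identity as a surrogate gradient. This is the textbook ST trick, not the specific derivation the paper gives for stochastic binary networks, and the toy objective is invented for illustration.

```python
import torch

class BernoulliST(torch.autograd.Function):
    """Sample a {0, 1} activation; pass the incoming gradient straight through."""
    @staticmethod
    def forward(ctx, probs):
        return torch.bernoulli(probs)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # identity surrogate for the non-differentiable sampling

logits = torch.randn(4, requires_grad=True)
probs = torch.sigmoid(logits)
binary = BernoulliST.apply(probs)       # stochastic binary activations
loss = (binary - 1.0).pow(2).sum()      # toy objective: push activations towards 1
loss.backward()
print(logits.grad)                      # gradients flow despite the discrete sample
```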
- Stabilizing Training of Generative Adversarial Nets via Langevin Stein Variational Gradient Descent [11.329376606876101]
We propose to stabilize GAN training via a novel particle-based variational inference method: Langevin Stein variational gradient descent (LSVGD).
We show that the LSVGD dynamics has an implicit regularization term that enhances the spread and diversity of the particles.
arXiv Detail & Related papers (2020-04-22T11:20:04Z)
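As background for the particle-based view in that last entry, below is a minimal sketch of a plain Stein variational gradient descent (SVGD) update with an RBF kernel on a toy standard-normal target. It only illustrates the particle update the paper builds on; the paper's LSVGD variant additionally injects Langevin-style noise, which is omitted here, and the bandwidth, step size, and target are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def svgd_step(x, grad_logp, step=0.05, bandwidth=1.0):
    """One SVGD update for particles x of shape (n, d)."""
    n = x.shape[0]
    diffs = x[:, None, :] - x[None, :, :]                      # diffs[i, j] = x_i - x_j
    k = np.exp(-(diffs ** 2).sum(-1) / (2 * bandwidth ** 2))   # RBF kernel k(x_j, x_i)
    # phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    attract = k @ grad_logp(x)
    repulse = (diffs * k[..., None]).sum(axis=1) / bandwidth ** 2
    return x + step * (attract + repulse) / n

grad_logp = lambda x: -x                            # standard normal target
particles = rng.normal(loc=5.0, scale=0.5, size=(100, 2))
for _ in range(500):
    particles = svgd_step(particles, grad_logp)

print("mean:", particles.mean(axis=0))              # should drift towards 0
print("std: ", particles.std(axis=0))               # repulsion spreads particles out
```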
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.