Convergence dynamics of Generative Adversarial Networks: the dual metric flows
- URL: http://arxiv.org/abs/2012.10410v2
- Date: Wed, 14 Apr 2021 16:59:17 GMT
- Title: Convergence dynamics of Generative Adversarial Networks: the dual metric flows
- Authors: Gabriel Turinici
- Abstract summary: We investigate convergence in the Generative Adversarial Networks used in machine learning.
We study the limit of small learning rate and show that, similarly to single-network training, the GAN learning dynamics tend, for vanishing learning rate, to some limit dynamics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Fitting neural networks often resorts to stochastic (or similar) gradient descent, which is a noise-tolerant (and efficient) resolution of a gradient descent dynamics. It outputs a sequence of network parameters that evolves over the training steps. The gradient descent is the limit, when the learning rate is small and the batch size is infinite, of this sequence of increasingly optimal network parameters obtained during training. In this contribution, we investigate instead the convergence in the Generative Adversarial Networks used in machine learning. We study the limit of small learning rate and show that, similarly to single-network training, the GAN learning dynamics tend, for vanishing learning rate, to some limit dynamics. This leads us to consider evolution equations in metric spaces (the natural framework for evolving probability laws) that we call dual flows. We give formal definitions of solutions and prove the convergence. The theory is then applied to specific instances of GANs, and we discuss how this insight helps understand and mitigate mode collapse.
Keywords: GAN; metric flow; generative network
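As a rough illustration of the discrete dynamics whose small learning-rate limit the paper studies, the sketch below runs simultaneous gradient descent on a generator and a discriminator with a small fixed step size eta. The toy data, architectures, and losses are arbitrary choices and not taken from the paper.

```python
# Minimal simultaneous gradient-descent GAN on 1-D Gaussian data (illustrative only).
# A small, fixed learning rate `eta` stands in for the small-step-size regime
# discussed in the abstract; architecture and data are arbitrary choices.
import torch
import torch.nn as nn

torch.manual_seed(0)
eta = 1e-3          # small learning rate (the regime studied in the limit eta -> 0)
latent_dim = 4

G = nn.Sequential(nn.Linear(latent_dim, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt_G = torch.optim.SGD(G.parameters(), lr=eta)
opt_D = torch.optim.SGD(D.parameters(), lr=eta)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = 2.0 + 0.5 * torch.randn(64, 1)          # target law: N(2, 0.25)
    fake = G(torch.randn(64, latent_dim))

    # Discriminator step: distinguish real from generated samples.
    opt_D.zero_grad()
    loss_D = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_D.backward()
    opt_D.step()

    # Generator step: fool the (current) discriminator.
    opt_G.zero_grad()
    loss_G = bce(D(fake), torch.ones(64, 1))
    loss_G.backward()
    opt_G.step()

print(float(G(torch.randn(1000, latent_dim)).mean()))  # should drift toward 2.0
```

Letting eta tend to zero in such a loop is the regime in which the abstract's limit dynamics (the dual flows on probability laws) are formulated.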
Related papers
- Metric Flows with Neural Networks [0.0]
We develop a theory of flows in the space of Riemannian metrics induced by neural network gradient descent.
This is motivated in part by advances in approximating Calabi-Yau metrics with neural networks.
We demonstrate that well-learned numerical metrics at finite width exhibit an evolving metric-NTK, associated with feature learning.
arXiv Detail & Related papers (2023-10-30T18:00:01Z)
- Implicit regularization of deep residual networks towards neural ODEs [8.075122862553359]
We establish an implicit regularization of deep residual networks towards neural ODEs.
We prove that if the network is initialized as a discretization of a neural ODE, then such a discretization holds throughout training.
arXiv Detail & Related papers (2023-09-03T16:35:59Z)
- A Framework for Provably Stable and Consistent Training of Deep Feedforward Networks [4.21061712600981]
We present a novel algorithm for training deep neural networks in supervised (classification and regression) and unsupervised (reinforcement learning) scenarios.
This algorithm combines standard gradient descent with the gradient clipping method.
We show, in theory and through experiments, that our algorithm's updates have low variance and that the training loss decreases smoothly; a minimal sketch of such a clipped update is given after this entry.
arXiv Detail & Related papers (2023-05-20T07:18:06Z)
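A minimal sketch of a clipped-gradient update in the spirit of the entry above: a plain SGD step taken on a gradient whose global norm has been capped. The model, loss, and clipping threshold are illustrative and not the authors' exact algorithm.

```python
# Illustrative clipped-gradient update: SGD step preceded by global-norm clipping.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(32, 10), torch.randn(32, 1)

opt.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # cap the global gradient norm
opt.step()  # standard descent step on the clipped gradient
```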
- The Underlying Correlated Dynamics in Neural Training [6.385006149689549]
Training of neural networks is a computationally intensive task.
We propose a model based on the correlation of the parameters' dynamics, which dramatically reduces the dimensionality.
This representation enhances the understanding of the underlying training dynamics and can pave the way for designing better acceleration techniques (a toy illustration of the idea follows this entry).
arXiv Detail & Related papers (2022-12-18T08:34:11Z)
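A toy illustration of the general idea in the entry above, under a generic setup: record the parameter vector along a gradient-descent trajectory and check how much of its variation is captured by a few principal directions. PCA on the trajectory is a stand-in here, not the authors' correlation model.

```python
# Toy illustration of correlated directions in parameter dynamics:
# record the parameter vector during training, then run PCA (via SVD) on the trajectory.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=50)                 # parameters of a linear model
X, y = rng.normal(size=(200, 50)), rng.normal(size=200)

trajectory = []
for _ in range(300):                    # plain gradient descent on squared error
    grad = X.T @ (X @ w - y) / len(y)
    w -= 0.01 * grad
    trajectory.append(w.copy())

T = np.array(trajectory) - np.mean(trajectory, axis=0)
_, s, _ = np.linalg.svd(T, full_matrices=False)
explained = (s**2) / np.sum(s**2)
print("variance captured by top 3 directions:", explained[:3].sum())
```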
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth-2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Slimmable Networks for Contrastive Self-supervised Learning [69.9454691873866]
Self-supervised learning makes significant progress in pre-training large models, but struggles with small models.
We introduce another one-stage solution to obtain pre-trained small models without the need for extra teachers.
A slimmable network consists of a full network and several weight-sharing sub-networks, which can be pre-trained once to obtain networks of various sizes (see the sketch after this entry).
arXiv Detail & Related papers (2022-09-30T15:15:05Z)
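A minimal sketch of the weight sharing behind a slimmable layer, under the assumption that narrower sub-networks simply reuse a slice of the full layer's weights; the widths and the single linear layer are illustrative choices, not the paper's architecture.

```python
# Weight sharing in a slimmable layer: sub-networks of smaller width reuse a
# slice of the full layer's weights. Widths here are arbitrary.
import torch
import torch.nn as nn

class SlimmableLinear(nn.Module):
    def __init__(self, in_features, max_out_features):
        super().__init__()
        self.full = nn.Linear(in_features, max_out_features)  # the full network's weights

    def forward(self, x, width):
        # A width-`width` sub-network uses the first `width` rows of the shared weights.
        return nn.functional.linear(x, self.full.weight[:width], self.full.bias[:width])

layer = SlimmableLinear(8, 32)
x = torch.randn(4, 8)
for width in (8, 16, 32):               # one set of parameters, several sub-networks
    print(width, layer(x, width).shape)
```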
- Dynamics of Fourier Modes in Torus Generative Adversarial Networks [0.8189696720657245]
Generative Adversarial Networks (GANs) are powerful machine learning models capable of generating fully synthetic samples of a desired phenomenon at high resolution.
Despite their success, the training process of a GAN is highly unstable, and it is typically necessary to apply several accessory perturbations to the networks to reach acceptable convergence of the model.
We introduce a novel method to analyze the convergence and stability in the training of Generative Adversarial Networks.
arXiv Detail & Related papers (2022-09-05T09:03:22Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in their depth, within time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
- Continuous-in-Depth Neural Networks [107.47887213490134]
We first show that ResNets fail to be meaningful dynamical integrators in this richer sense.
We then demonstrate that neural network models can learn to represent continuous dynamical systems.
We introduce ContinuousNet as a continuous-in-depth generalization of ResNet architectures (a schematic contrast with a discrete residual step follows this entry).
arXiv Detail & Related papers (2020-08-05T22:54:09Z)
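A rough sketch of the residual-block-as-Euler-step reading that motivates continuous-in-depth models: one residual update x + f(x) is one explicit Euler step of x'(t) = f(x(t)), and refining the step size moves toward the continuous limit. The map f and the step sizes below are arbitrary; this is not the ContinuousNet architecture itself.

```python
# A residual update x <- x + f(x) read as one explicit-Euler step of x'(t) = f(x(t)).
# Refining the step size gives a "continuous-in-depth" view; f is a placeholder map.
import numpy as np

def f(x):
    return np.tanh(x) - 0.5 * x         # placeholder residual branch

x0 = np.array([1.0, -2.0])

# Discrete ResNet-style stack: 4 blocks, i.e. 4 Euler steps of size 1.
x = x0.copy()
for _ in range(4):
    x = x + f(x)

# Same depth interval [0, 4] integrated with many small Euler steps.
y, h = x0.copy(), 0.01
for _ in range(int(4 / h)):
    y = y + h * f(y)

print("4 residual blocks:", x, " fine Euler integration:", y)
```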
- Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality [74.0084803220897]
Adversarial training is a popular method to give neural nets robustness against adversarial perturbations.
We show convergence to low robust training loss for polynomial width instead of exponential, under natural assumptions and with the ReLU activation (a schematic adversarial-training step is sketched after this entry).
arXiv Detail & Related papers (2020-02-16T20:13:43Z)
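For context on the entry above, a schematic adversarial-training step under standard assumptions: perturb the batch with a one-step gradient-sign attack, then descend on the loss at the perturbed inputs. The model, attack budget, and single-step attack are illustrative choices, unrelated to the paper's width analysis.

```python
# Schematic adversarial-training step: one-step gradient-sign attack on the inputs,
# followed by a descent step on the loss at the perturbed batch.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(64, 20), torch.randint(0, 2, (64,))
eps = 0.1                                # perturbation budget (arbitrary)

# Inner step: find an adversarial perturbation of the inputs.
x_adv = x.clone().requires_grad_(True)
loss_fn(model(x_adv), y).backward()
x_adv = (x_adv + eps * x_adv.grad.sign()).detach()

# Outer step: minimize the robust training loss on the perturbed batch.
opt.zero_grad()
loss = loss_fn(model(x_adv), y)
loss.backward()
opt.step()
print(float(loss))
```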