A Neural Tangent Kernel Perspective of GANs
- URL: http://arxiv.org/abs/2106.05566v1
- Date: Thu, 10 Jun 2021 07:46:02 GMT
- Title: A Neural Tangent Kernel Perspective of GANs
- Authors: Jean-Yves Franceschi (MLIA), Emmanuel de Bézenac (MLIA), Ibrahim
Ayed (MLIA), Mickaël Chen, Sylvain Lamprier (MLIA), Patrick Gallinari
(MLIA)
- Abstract summary: Theoretical analyses for Generative Adversarial Networks (GANs) assume an arbitrarily large family of discriminators.
We show that this framework of analysis is too simplistic to properly analyze GAN training.
We leverage the theory of infinite-width neural networks to model neural discriminator training for a wide range of adversarial losses.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Theoretical analyses for Generative Adversarial Networks (GANs) generally
assume an arbitrarily large family of discriminators and do not consider the
characteristics of the architectures used in practice. We show that this
framework of analysis is too simplistic to properly analyze GAN training. To
tackle this issue, we leverage the theory of infinite-width neural networks to
model neural discriminator training for a wide range of adversarial losses via
its Neural Tangent Kernel (NTK). Our analytical results show that GAN
trainability primarily depends on the discriminator's architecture. We further
study the discriminator for specific architectures and losses, and highlight
properties providing a new understanding of GAN training. For example, we find
that GANs trained with the integral probability metric loss minimize the
maximum mean discrepancy with the NTK as kernel. Our conclusions demonstrate
the analysis opportunities provided by the proposed framework, which paves the
way for better and more principled GAN models. We release a generic GAN
analysis toolkit based on our framework that supports the empirical part of our
study.
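To make the MMD result above concrete, here is a minimal, self-contained sketch (not the authors' released toolkit) that estimates the maximum mean discrepancy between two samples using the analytic NTK of an infinitely wide two-layer ReLU network as the kernel. The choice of architecture, the scaling convention, and the toy Gaussian data are assumptions made for this illustration.
```python
# Illustrative sketch, not the authors' released toolkit: estimate the MMD
# between two samples using the analytic NTK of an infinitely wide two-layer
# ReLU network as the kernel. The architecture, the scaling convention, and
# the toy Gaussian data are assumptions made for this example.
import numpy as np

def relu_ntk(X, Y):
    """NTK of a two-layer ReLU network (one common parameterization),
    evaluated between the rows of X (n, d) and the rows of Y (m, d)."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)        # (n, 1)
    ny = np.linalg.norm(Y, axis=1, keepdims=True)        # (m, 1)
    inner = X @ Y.T                                       # (n, m)
    u = np.clip(inner / (nx * ny.T + 1e-12), -1.0, 1.0)   # cosine of the angle
    theta = np.arccos(u)
    # E_w[relu(w.x) relu(w.y)] and E_w[relu'(w.x) relu'(w.y)] for w ~ N(0, I)
    k_act = (nx * ny.T) * (np.sin(theta) + (np.pi - theta) * u) / (2 * np.pi)
    k_der = (np.pi - theta) / (2 * np.pi)
    return k_act + inner * k_der

def mmd2(X, Y, kernel):
    """Biased estimator of the squared MMD between samples X and Y."""
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2 * kernel(X, Y).mean()

rng = np.random.default_rng(0)
real = rng.normal(loc=0.0, size=(200, 2))   # stand-in "data" samples
fake = rng.normal(loc=1.5, size=(200, 2))   # stand-in "generator" samples
print("MMD^2 (real vs fake):", mmd2(real, fake, relu_ntk))
print("MMD^2 (real vs real):", mmd2(real[:100], real[100:], relu_ntk))
```
According to the abstract's result for the integral probability metric loss, a generator trained against an infinitely wide discriminator descends a quantity of this form, with the kernel determined by the discriminator's architecture.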
Related papers
- On the Convergence of (Stochastic) Gradient Descent for Kolmogorov–Arnold Networks [56.78271181959529]
Kolmogorov–Arnold Networks (KANs) have gained significant attention in the deep learning community.
Empirical investigations demonstrate that KANs optimized via stochastic gradient descent (SGD) are capable of achieving near-zero training loss.
arXiv Detail & Related papers (2024-10-10T15:34:10Z) - An Infinite-Width Analysis on the Jacobian-Regularised Training of a Neural Network [10.384951432591492]
Recent theoretical analysis of deep neural networks in their infinite-width limits has deepened our understanding of initialisation, feature learning, and training of those networks.
We show that this infinite-width analysis can be extended to the Jacobian of a deep neural network.
We experimentally show the relevance of our theoretical claims to wide finite networks, and empirically analyse the properties of the kernel regression solution to gain insight into Jacobian regularisation.
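As a rough illustration of the objective being analysed (a sketch under assumptions, not the paper's setup): Jacobian-regularised training adds a penalty on the Frobenius norm of the network's input-output Jacobian to the usual data-fitting loss. The toy NumPy network and finite-difference Jacobian below stand in for an autodiff implementation.
```python
# Minimal sketch of a Jacobian-regularised objective: a data-fitting loss plus
# a penalty on the Frobenius norm of the network's input-output Jacobian.
# The toy network and finite-difference Jacobian are assumptions for clarity.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)) / np.sqrt(2), np.zeros(16)
W2, b2 = rng.normal(size=(1, 16)) / np.sqrt(16), np.zeros(1)

def net(x):                          # tiny two-layer tanh network, x: (d,)
    return W2 @ np.tanh(W1 @ x + b1) + b2

def jacobian(x, eps=1e-5):           # finite-difference Jacobian d net / d x
    cols = [(net(x + eps * e) - net(x - eps * e)) / (2 * eps)
            for e in np.eye(x.shape[0])]
    return np.stack(cols, axis=1)    # shape (out_dim, d)

def regularised_loss(x, y, lam=0.1):
    fit = np.sum((net(x) - y) ** 2)            # squared-error data term
    jac_penalty = np.sum(jacobian(x) ** 2)     # ||J||_F^2 penalty
    return fit + lam * jac_penalty

x, y = np.array([0.3, -0.7]), np.array([1.0])
print("regularised loss at one point:", regularised_loss(x, y))
```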
arXiv Detail & Related papers (2023-12-06T09:52:18Z) - On Feature Learning in Neural Networks with Global Convergence
Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
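A quick numerical sanity check of the role played by the d ≥ n condition, using a plain linear least-squares model as a stand-in for the wide-network setting (an assumption for illustration): when the input dimension is at least the sample size, the Gram matrix XXᵀ is generically positive definite, which yields linear convergence of gradient flow, and of small-step gradient descent, on the squared loss.
```python
# Stand-in for the wide-network analysis: with d >= n and generic data, the
# Gram matrix X X^T is positive definite, so gradient descent on a squared
# loss drives the training loss to zero at a linear rate.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 64                                   # n training points, d >= n features
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

eigs = np.linalg.eigvalsh(X @ X.T)              # Gram spectrum, ascending order
print("smallest Gram eigenvalue:", eigs[0])     # > 0 almost surely when d >= n

# Gradient descent on 0.5 * ||X w - y||^2; the residual contracts by a factor
# of (1 - eigs[0] / eigs[-1]) per step, so the training loss goes to zero.
w, lr = np.zeros(d), 1.0 / eigs[-1]
for _ in range(10000):
    w -= lr * X.T @ (X @ w - y)
print("final training loss:", 0.5 * np.sum((X @ w - y) ** 2))
```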
arXiv Detail & Related papers (2022-04-22T15:56:43Z) - Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z) - Analytic Insights into Structure and Rank of Neural Network Hessian Maps [32.90143789616052]
The Hessian of a neural network captures parameter interactions through second-order derivatives of the loss.
We develop theoretical tools to analyze the range of the Hessian map, providing us with a precise understanding of its rank deficiency.
This yields exact formulas and tight upper bounds for the Hessian rank of deep linear networks.
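The following sketch only checks the flavour of this result numerically; it is not the paper's exact rank formula. For a small deep linear network, the finite-difference Hessian of the squared loss, evaluated at a zero-loss teacher point for simplicity, has rank well below the parameter count; the sizes and data are arbitrary choices for this illustration.
```python
# Numerical illustration of Hessian rank deficiency in a deep *linear* network.
# At a zero-residual point the Hessian reduces to its Gauss-Newton part, whose
# rank is limited by the 2x4 end-to-end matrix, far below the parameter count.
import numpy as np

rng = np.random.default_rng(0)
dims = [4, 3, 3, 2]                               # widths of a 3-layer linear net
teacher = [rng.normal(size=(dims[i + 1], dims[i])) for i in range(3)]
shapes = [w.shape for w in teacher]
sizes = [r * c for r, c in shapes]
X = rng.normal(size=(dims[0], 20))                # 20 training inputs
Y = teacher[2] @ teacher[1] @ teacher[0] @ X      # labels produced by the teacher

def loss(theta):
    k, Z = 0, X
    for (r, c), s in zip(shapes, sizes):          # forward pass W3 W2 W1 X
        Z = theta[k:k + s].reshape(r, c) @ Z
        k += s
    return 0.5 * np.sum((Z - Y) ** 2)

theta0 = np.concatenate([w.ravel() for w in teacher])   # zero-loss point
p, eps, I = theta0.size, 1e-4, np.eye(theta0.size)
H = np.zeros((p, p))
for i in range(p):                                # central-difference Hessian
    for j in range(p):
        H[i, j] = (loss(theta0 + eps * (I[i] + I[j]))
                   - loss(theta0 + eps * (I[i] - I[j]))
                   - loss(theta0 - eps * (I[i] - I[j]))
                   + loss(theta0 - eps * (I[i] + I[j]))) / (4 * eps ** 2)

print("parameters:", p, "| numerical Hessian rank:",
      np.linalg.matrix_rank(H, tol=1e-3))
```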
arXiv Detail & Related papers (2021-06-30T17:29:58Z) - Generalization bound of globally optimal non-convex neural network
training: Transportation map estimation by infinite dimensional Langevin
dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks for neural network optimization analysis, such as mean-field theory and neural tangent kernel theory, typically require taking the limit of infinite network width to show global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Online Kernel based Generative Adversarial Networks [0.45880283710344055]
We show that Online Kernel-based Generative Adversarial Networks (OKGAN) mitigate a number of training issues, including mode collapse and cycling.
OKGANs empirically perform dramatically better, with respect to reverse KL-divergence, than other GAN formulations on synthetic data.
arXiv Detail & Related papers (2020-06-19T22:54:01Z) - A deep learning framework for solution and discovery in solid mechanics [1.4699455652461721]
We present the application of a class of deep learning methods, known as Physics-Informed Neural Networks (PINNs), to learning and discovery in solid mechanics.
We explain how to incorporate the momentum balance and elasticity relations into PINN, and explore in detail the application to linear elasticity.
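As a hedged sketch of how a momentum balance can enter such a loss: the 1D elastostatic bar equation, the tiny NumPy network, the boundary conditions, and the finite-difference derivatives below are all assumptions made for this illustration; a real PINN would use automatic differentiation.
```python
# Minimal sketch of a PINN-style loss for 1D linear elasticity (a bar with
# modulus E and body force f): the momentum balance E u''(x) + f(x) = 0 enters
# the loss as a residual penalty, plus boundary terms. The tiny network and
# finite-difference derivatives stand in for an autodiff-based PINN.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(32, 1)), rng.normal(size=(32, 1))
W2, b2 = rng.normal(size=(1, 32)) / np.sqrt(32), np.zeros((1, 1))
E = 1.0                                          # assumed elastic modulus
f = lambda x: np.sin(np.pi * x)                  # assumed body force

def u(x):                                        # network displacement, x: (1, n)
    return W2 @ np.tanh(W1 @ x + b1) + b2

def u_xx(x, h=1e-3):                             # second derivative, central diff
    return (u(x + h) - 2 * u(x) + u(x - h)) / h ** 2

def pinn_loss():
    xc = np.linspace(0.05, 0.95, 64).reshape(1, -1)     # interior collocation points
    residual = E * u_xx(xc) + f(xc)                     # momentum balance residual
    bc = u(np.array([[0.0]])) ** 2 + u(np.array([[1.0]])) ** 2  # assumed u(0)=u(1)=0
    return np.mean(residual ** 2) + float(bc.sum())

print("PINN loss at random init:", pinn_loss())
# Training would minimize pinn_loss over (W1, b1, W2, b2) with any optimizer.
```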
arXiv Detail & Related papers (2020-02-14T08:24:53Z) - Rethinking Generalization of Neural Models: A Named Entity Recognition
Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that provides a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.