Generalization Error of GAN from the Discriminator's Perspective
- URL: http://arxiv.org/abs/2107.03633v1
- Date: Thu, 8 Jul 2021 06:58:43 GMT
- Title: Generalization Error of GAN from the Discriminator's Perspective
- Authors: Hongkang Yang and Weinan E
- Abstract summary: We consider a simplified GAN model with the generator replaced by a density, and analyze how the discriminator contributes to generalization.
We show that with early stopping, the generalization error, measured in Wasserstein metric, escapes the curse of dimensionality, even though memorization is inevitable in the long run.
- Score: 9.975163460952045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generative adversarial network (GAN) is a well-known model for learning
high-dimensional distributions, but the mechanism for its generalization
ability is not understood. In particular, GAN is vulnerable to the memorization
phenomenon, the eventual convergence to the empirical distribution. We consider
a simplified GAN model with the generator replaced by a density, and analyze
how the discriminator contributes to generalization. We show that with early
stopping, the generalization error, measured in Wasserstein metric, escapes
the curse of dimensionality, even though memorization is inevitable in the
long run. In addition, we present a hardness-of-learning result for WGAN.
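The following is an illustrative numerical sketch, not taken from the paper, of why memorization is costly when error is measured in Wasserstein metric: the distance between a distribution and its n-sample empirical version decays only roughly like n^(-1/d), so a model that converges to the empirical distribution inherits the curse of dimensionality. The script assumes only numpy and scipy and estimates this by exact optimal transport between two independent point clouds drawn from the unit cube.

```python
# Illustrative sketch (not from the paper): how the 1-Wasserstein distance
# between n-sample empirical measures behaves as the dimension d grows,
# exposing the curse of dimensionality that memorization incurs.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def w1_empirical(x, y):
    """Exact 1-Wasserstein distance between two equal-size point clouds."""
    cost = cdist(x, y)                        # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)  # optimal matching (a permutation)
    return cost[rows, cols].mean()

rng = np.random.default_rng(0)
n = 200
for d in (1, 2, 5, 10, 20):
    # W1 between two independent n-sample draws from Uniform([0,1]^d is a
    # proxy for W1(empirical, true); it decays only roughly like n**(-1/d).
    x = rng.uniform(size=(n, d))
    y = rng.uniform(size=(n, d))
    print(f"d = {d:2d}   W1 ~ {w1_empirical(x, y):.3f}")
```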
Related papers
- Low-Dimension-to-High-Dimension Generalization And Its Implications for Length Generalization [61.51372812489661]
We show that LDHD generalization is generally unattainable without exploiting prior knowledge to provide appropriate inductive bias.
Applying the insights from LDHD generalization to length generalization, we explain the effectiveness of CoT as changing the structure of the latent space.
We also propose a principle for position embedding design to handle both the inherent LDHD generalization and nuisances such as the data format.
arXiv Detail & Related papers (2024-10-11T15:18:43Z) - Error analysis of generative adversarial network [0.0]
We study the error convergence rate of the GAN model based on a class of functions encompassing the discriminator and generator neural networks.
By employing the Talagrand inequality and Borel-Cantelli lemma, we establish a tight convergence rate for the error of GAN.
arXiv Detail & Related papers (2023-10-23T22:39:28Z) - Learning Linear Causal Representations from Interventions under General
Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z) - Learning Distributions by Generative Adversarial Networks: Approximation
and Generalization [0.6768558752130311]
We study how well generative adversarial networks learn from finite samples by analyzing the convergence rates of these models.
Our analysis is based on a new oracle inequality that decomposes the estimation error of GANs into discriminator and generator approximation errors (a generic schematic is given below).
For the generator approximation error, we show that neural networks can approximately transform a low-dimensional source distribution into a high-dimensional target distribution.
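The exact bound is not stated here, so the display below is only a generic schematic of the typical shape of such an oracle-type decomposition, under assumed notation: d_F is the integral probability metric induced by the discriminator class F, G is the generator class, g_# nu is the push-forward of the latent distribution nu by a generator g, hat-mu_n is the empirical distribution of the n samples, and eps_disc(F) abstracts how well F approximates the evaluation class.

```latex
% Generic schematic of an oracle-type error decomposition for GANs
% (illustrative form only, not the paper's exact statement).
d_{\mathcal{F}}\bigl(g_{\hat\theta\,\#}\nu,\ \mu\bigr)
\;\lesssim\;
\underbrace{\inf_{g\in\mathcal{G}} d_{\mathcal{F}}\bigl(g_{\#}\nu,\ \mu\bigr)}_{\text{generator approximation}}
\;+\;
\underbrace{\varepsilon_{\mathrm{disc}}(\mathcal{F})}_{\text{discriminator approximation}}
\;+\;
\underbrace{d_{\mathcal{F}}\bigl(\mu,\ \hat\mu_n\bigr)}_{\text{statistical error}}
```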
arXiv Detail & Related papers (2022-05-25T09:26:17Z) - Towards the Semantic Weak Generalization Problem in Generative Zero-Shot
Learning: Ante-hoc and Post-hoc [89.68803484284408]
We present a simple and effective strategy to mitigate previously unexplored factors that limit the performance ceiling of generative Zero-Shot Learning (ZSL).
We begin by formally defining semantic generalization, then look into approaches for reducing the semantic weak generalization problem.
In the ante-hoc phase, we augment the generator's semantic input, as well as relax the fitting target of the generator.
arXiv Detail & Related papers (2022-04-24T13:54:42Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these results to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
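As a rough, one-dimensional toy version of that idea (the function below is an assumption for illustration, not the loss actually proposed in the paper): the hard indicator 1{x <= t} of the classical Kolmogorov-Smirnov statistic is replaced by a sigmoid of adjustable temperature, and the resulting statistic stays informative under contamination while being differentiable.

```python
# Toy sketch of a smoothed Kolmogorov-Smirnov-style distance (illustrative,
# not the exact GAN loss proposed in the paper): the hard indicator 1{x <= t}
# is replaced by a sigmoid with a temperature parameter.
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

def smoothed_ks(x, y, temperature=0.1, num_thresholds=200):
    """sup over thresholds t of |mean expit((t - x)/h) - mean expit((t - y)/h)|."""
    ts = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), num_thresholds)
    fx = expit((ts[:, None] - x[None, :]) / temperature).mean(axis=1)  # smoothed CDF of x
    fy = expit((ts[:, None] - y[None, :]) / temperature).mean(axis=1)  # smoothed CDF of y
    return np.abs(fx - fy).max()

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=1000)
clean2 = rng.normal(0.0, 1.0, size=1000)
contaminated = np.concatenate([rng.normal(0.0, 1.0, 900), rng.normal(8.0, 1.0, 100)])
print(smoothed_ks(clean, clean2))        # small: same underlying distribution
print(smoothed_ks(clean, contaminated))  # larger: 10% of samples are outliers
```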
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Predicting Unreliable Predictions by Shattering a Neural Network [145.3823991041987]
Piecewise linear neural networks can be split into subfunctions.
Subfunctions have their own activation pattern, domain, and empirical error.
Empirical error for the full network can be written as an expectation over subfunctions.
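A minimal sketch of that decomposition for a toy ReLU network, with all names hypothetical and not taken from the paper: each input's pattern of active hidden units indexes one linear subfunction, and the per-pattern mean squared errors, weighted by how often each pattern occurs, recover the full network's empirical error.

```python
# Minimal sketch (generic, not the paper's code): group inputs of a tiny ReLU
# network by activation pattern; each pattern indexes one linear subfunction.
from collections import defaultdict
import numpy as np

rng = np.random.default_rng(0)
W_h, b_h = rng.normal(size=(8, 2)), rng.normal(size=8)  # hidden layer: 8 ReLU units
w_out, b_out = rng.normal(size=8), 0.0                   # linear output layer

def forward(x):
    pre = W_h @ x + b_h
    pattern = tuple((pre > 0).astype(int))  # which ReLUs are active for this input
    return w_out @ np.maximum(pre, 0.0) + b_out, pattern

X = rng.normal(size=(500, 2))
y = np.sin(X[:, 0]) + X[:, 1]               # toy regression target

errors = defaultdict(list)
for x_i, y_i in zip(X, y):
    pred, pattern = forward(x_i)
    errors[pattern].append((pred - y_i) ** 2)

# Full-network empirical error = average over subfunctions, weighted by how
# often each activation pattern occurs in the data.
total = sum(len(e) * np.mean(e) for e in errors.values()) / len(X)
print(f"{len(errors)} activation patterns observed; full-network MSE = {total:.3f}")
```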
arXiv Detail & Related papers (2021-06-15T18:34:41Z) - Double Descent and Other Interpolation Phenomena in GANs [2.7007335372861974]
We study the generalization error as a function of latent space dimension in generative adversarial networks (GANs).
We develop a novel pseudo-supervised learning approach for GANs where the training utilizes pairs of fabricated (noise) inputs in conjunction with real output samples.
While our analysis focuses mostly on linear models, we also derive insights for improving the generalization of nonlinear, multilayer GANs.
arXiv Detail & Related papers (2021-06-07T23:07:57Z) - Forward Super-Resolution: How Can GANs Learn Hierarchical Generative
Models for Real-World Distributions [66.05472746340142]
Generative adversarial networks (GANs) are among the most successful models for learning high-complexity, real-world distributions.
In this paper we show how GANs can efficiently learn the distribution of real-life images.
arXiv Detail & Related papers (2021-06-04T17:33:29Z) - An error analysis of generative adversarial networks for learning
distributions [11.842861158282265]
Generative adversarial networks (GANs) learn probability distributions from finite samples.
GANs are able to adaptively learn data distributions with low-dimensional structure or Hölder densities.
Our analysis is based on a new oracle inequality decomposing the estimation error into generator and discriminator approximation error and statistical error.
arXiv Detail & Related papers (2021-05-27T08:55:19Z) - Generalization and Memorization: The Bias Potential Model [9.975163460952045]
Generative models and density estimators behave quite differently from models for learning functions.
For the bias potential model, we show that dimension-independent generalization accuracy is achievable if early stopping is adopted.
In the long term, the model either memorizes the samples or diverges.
arXiv Detail & Related papers (2020-11-29T04:04:54Z)