A Unifying Generator Loss Function for Generative Adversarial Networks
- URL: http://arxiv.org/abs/2308.07233v3
- Date: Mon, 18 Mar 2024 02:03:52 GMT
- Title: A Unifying Generator Loss Function for Generative Adversarial Networks
- Authors: Justin Veiner, Fady Alajaji, Bahman Gharesifard
- Abstract summary: A unifying $\alpha$-parametrized generator loss function is introduced for a dual-objective generative adversarial network (GAN).
The generator loss function is based on a symmetric class probability estimation type function, $\mathcal{L}_\alpha$, and the resulting GAN system is termed $\mathcal{L}_\alpha$-GAN.
- Score: 5.5575224613422725
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A unifying $\alpha$-parametrized generator loss function is introduced for a dual-objective generative adversarial network (GAN), which uses a canonical (or classical) discriminator loss function such as the one in the original GAN (VanillaGAN) system. The generator loss function is based on a symmetric class probability estimation type function, $\mathcal{L}_\alpha$, and the resulting GAN system is termed $\mathcal{L}_\alpha$-GAN. Under an optimal discriminator, it is shown that the generator's optimization problem consists of minimizing a Jensen-$f_\alpha$-divergence, a natural generalization of the Jensen-Shannon divergence, where $f_\alpha$ is a convex function expressed in terms of the loss function $\mathcal{L}_\alpha$. It is also demonstrated that this $\mathcal{L}_\alpha$-GAN problem recovers as special cases a number of GAN problems in the literature, including VanillaGAN, Least Squares GAN (LSGAN), Least $k$th order GAN (L$k$GAN) and the recently introduced $(\alpha_D,\alpha_G)$-GAN with $\alpha_D=1$. Finally, experiments are conducted on three datasets, MNIST, CIFAR-10, and Stacked MNIST, to illustrate the performance of various examples of the $\mathcal{L}_\alpha$-GAN system.
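The abstract describes the Jensen-$f_\alpha$-divergence as a generalization of the Jensen-Shannon divergence. The sketch below illustrates that relationship for discrete distributions, assuming the usual construction $\mathrm{JD}_f(p\|q) = \tfrac{1}{2} D_f(p\|m) + \tfrac{1}{2} D_f(q\|m)$ with $m = (p+q)/2$ and $D_f(p\|q) = \sum_x q(x) f(p(x)/q(x))$. The function names are hypothetical, and the paper's specific $f_\alpha$ is not reproduced here; only the choice $f(u) = u \log u$, which should give back the Jensen-Shannon divergence, is shown.

```python
import math

def f_divergence(p, q, f):
    # D_f(p || q) = sum_x q(x) * f(p(x) / q(x)); points with q(x) = 0 are skipped.
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q) if qi > 0)

def jensen_f_divergence(p, q, f):
    # Assumed form: average of the f-divergences from p and q to their mixture m.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * f_divergence(p, m, f) + 0.5 * f_divergence(q, m, f)

def f_kl(u):
    # f(u) = u log u, with the convention 0 log 0 = 0; D_f is then the KL divergence.
    return u * math.log(u) if u > 0 else 0.0

def jensen_shannon(p, q):
    # Direct Jensen-Shannon divergence (natural log), used as a reference value.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Two illustrative distributions on three points (made-up numbers).
p = [0.5, 0.3, 0.2]
q = [0.1, 0.6, 0.3]
print(jensen_f_divergence(p, q, f_kl), jensen_shannon(p, q))
```

The two printed values should agree up to floating-point error. Swapping `f_kl` for another convex function with $f(1) = 0$ yields a different Jensen-$f$-divergence under the same assumed construction.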
Related papers
- Addressing GAN Training Instabilities via Tunable Classification Losses [8.151943266391493]
Generative adversarial networks (GANs) allow generating synthetic data with formal guarantees.
We show that all symmetric $f$-divergences are equivalent in convergence.
We also highlight the value of tuning $(\alpha_D,\alpha_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring.
arXiv Detail & Related papers (2023-10-27T17:29:07Z)
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - $(\alpha_D,\alpha_G)$-GANs: Addressing GAN Training Instabilities via
Dual Objectives [7.493779672689531]
We introduce a class of dual-objective GANs with different value functions (objectives) for the generator (G) and discriminator (D)
We show that the resulting non-zero sum game simplifies to minimize an $f$-divergence under appropriate conditions on $(alpha_D,alpha_G)$.
We highlight the value of tuning $(alpha_D,alpha_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring and the Stacked MNIST datasets.
arXiv Detail & Related papers (2023-02-28T05:22:54Z) - On the Identifiability and Estimation of Causal Location-Scale Noise
Models [122.65417012597754]
We study the class of location-scale or heteroscedastic noise models (LSNMs)
We show the causal direction is identifiable up to some pathological cases.
We propose two estimators for LSNMs: an estimator based on (non-linear) feature maps, and one based on neural networks.
arXiv Detail & Related papers (2022-10-13T17:18:59Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $mathbfxmapstosigma(mathbfwcdotmathbfx)$ for monotone activations.
The goal of the learner is to output a hypothesis vector $mathbfw$ that $F(mathbbw)=C, epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Generative Adversarial Neural Operators [59.21759531471597]
We propose the generative adversarial neural operator (GANO), a generative model paradigm for learning probabilities on infinite-dimensional function spaces.
GANO consists of two main components, a generator neural operator and a discriminator neural functional.
We empirically study GANOs in controlled cases where both input and output functions are samples from GRFs and compare its performance to the finite-dimensional counterpart GAN.
arXiv Detail & Related papers (2022-05-06T05:12:22Z) - Realizing GANs via a Tunable Loss Function [7.455546102930911]
We introduce a tunable GAN, called $alpha$-GAN, parameterized by $alpha in (0,infty]$.
We show that $alpha$-GAN is intimately related to the Arimoto divergence.
arXiv Detail & Related papers (2021-06-09T17:18:21Z) - Convergence and Sample Complexity of SGD in GANs [15.25030172685628]
We provide convergence guarantees on training Generative Adversarial Networks (GANs) via SGD.
We consider learning a target distribution modeled by a 1-layer Generator network with a non-linear activation function.
Our results apply to a broad class of non-linear activation functions $\phi$, including ReLUs, and are enabled by a connection with truncated statistics.
arXiv Detail & Related papers (2020-12-01T18:50:38Z)
- Least $k$th-Order and Rényi Generative Adversarial Networks [12.13405065406781]
Experimental results indicate that the proposed loss functions, applied to the MNIST and CelebA datasets, confer performance benefits by virtue of the extra degrees of freedom provided by the parameters $k$ and $\alpha$, respectively.
While it was applied to GANs in this study, the proposed approach is generic and can be used in other applications of information theory to deep learning, e.g., the issues of fairness or privacy in artificial intelligence.
arXiv Detail & Related papers (2020-06-03T18:44:05Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(mathsfOPT1/2)+epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z) - Gaussian Error Linear Units (GELUs) [58.195342948092964]
We propose a neural network activation function that weights inputs by their value, rather than gates by their sign.
We find performance improvements across all considered computer vision, natural language processing, and speech tasks.
arXiv Detail & Related papers (2016-06-27T19:20:40Z)