Least $k$th-Order and R\'{e}nyi Generative Adversarial Networks
- URL: http://arxiv.org/abs/2006.02479v3
- Date: Thu, 11 Mar 2021 23:37:47 GMT
- Title: Least $k$th-Order and R\'{e}nyi Generative Adversarial Networks
- Authors: Himesh Bhatia, William Paul, Fady Alajaji, Bahman Gharesifard,
Philippe Burlina
- Abstract summary: Experimental results indicate that the proposed loss functions, applied to the MNIST and CelebA datasets, confer performance benefits by virtue of the extra degrees of freedom provided by the parameters $k$ and $\alpha$, respectively.
While it was applied to GANs in this study, the proposed approach is generic and can be used in other applications of information theory to deep learning, e.g., the issues of fairness or privacy in artificial intelligence.
- Score: 12.13405065406781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the use of parametrized families of information-theoretic
measures to generalize the loss functions of generative adversarial networks
(GANs) with the objective of improving performance. A new generator loss
function, called least $k$th-order GAN (L$k$GAN), is first introduced,
generalizing the least squares GANs (LSGANs) by using a $k$th order absolute
error distortion measure with $k \geq 1$ (which recovers the LSGAN loss
function when $k=2$). It is shown that minimizing this generalized loss
function under an (unconstrained) optimal discriminator is equivalent to
minimizing the $k$th-order Pearson-Vajda divergence. Another novel GAN
generator loss function is next proposed in terms of R\'{e}nyi cross-entropy
functionals with order $\alpha >0$, $\alpha\neq 1$. It is demonstrated that
this R\'{e}nyi-centric generalized loss function, which provably reduces to the
original GAN loss function as $\alpha\to1$, preserves the equilibrium point
satisfied by the original GAN based on the Jensen-R\'{e}nyi divergence, a
natural extension of the Jensen-Shannon divergence.
Experimental results indicate that the proposed loss functions, applied to
the MNIST and CelebA datasets, under both DCGAN and StyleGAN architectures,
confer performance benefits by virtue of the extra degrees of freedom provided
by the parameters $k$ and $\alpha$, respectively. More specifically,
experiments show improvements with regard to the quality of the generated
images as measured by the Fr\'echet Inception Distance (FID) score and training
stability. While it was applied to GANs in this study, the proposed approach is
generic and can be used in other applications of information theory to deep
learning, e.g., the issues of fairness or privacy in artificial intelligence.
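To make the two parametrized loss families concrete, below is a minimal NumPy sketch of (i) a $k$th-order absolute-error generator loss, which reduces to the least-squares (LSGAN) generator loss at $k=2$, and (ii) a R\'{e}nyi cross-entropy of order $\alpha$ that collapses to the ordinary (Shannon) cross-entropy as $\alpha \to 1$, mirroring how the R\'{e}nyi-centric loss recovers the original GAN loss. The target label, constants, and function names below are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch under the assumptions stated above; not the paper's exact implementation.
import numpy as np

def lk_generator_loss(d_fake, k=2.0, target=1.0):
    """LkGAN-style generator loss: k-th order absolute error between the
    discriminator scores on generated samples and an assumed 'real' target.
    k = 2 recovers the familiar least-squares (LSGAN) generator loss."""
    return np.mean(np.abs(d_fake - target) ** k)

def renyi_cross_entropy(p, q, alpha, eps=1e-12):
    """Renyi cross-entropy of order alpha (alpha > 0, alpha != 1) between
    discrete distributions p and q; it tends to the Shannon cross-entropy
    as alpha -> 1, which is the limit in which the Renyi-centric GAN loss
    recovers the original GAN loss."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    if np.isclose(alpha, 1.0):
        return -np.sum(p * np.log(q + eps))  # Shannon limit
    return np.log(np.sum(p * (q + eps) ** (alpha - 1.0))) / (1.0 - alpha)

# Toy usage: discriminator scores on a batch of generated images.
d_fake = np.array([0.1, 0.4, 0.7, 0.2])
print(lk_generator_loss(d_fake, k=2.0))   # LSGAN special case (k = 2)
print(lk_generator_loss(d_fake, k=1.5))   # a fractional-order variant
p, q = np.array([0.5, 0.5]), np.array([0.3, 0.7])
print(renyi_cross_entropy(p, q, alpha=0.999))  # approx. equal to the alpha = 1 value
print(renyi_cross_entropy(p, q, alpha=1.0))    # Shannon cross-entropy
```

In an actual training loop, $k$ or $\alpha$ would simply act as an extra tunable hyperparameter alongside the usual learning-rate and architecture choices, which is the "extra degree of freedom" the abstract credits for the FID and stability gains.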
Related papers
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametricized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Addressing GAN Training Instabilities via Tunable Classification Losses [8.151943266391493]
Generative adversarial networks (GANs) allow generating synthetic data with formal guarantees.
We show that all symmetric $f$-divergences are equivalent in convergence.
We also highlight the value of tuning $(\alpha_D,\alpha_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring.
arXiv Detail & Related papers (2023-10-27T17:29:07Z) - A Unifying Generator Loss Function for Generative Adversarial Networks [5.5575224613422725]
A unifying $\alpha$-parametrized generator loss function is introduced for a dual-objective generative adversarial network (GAN).
The generator loss function is based on a symmetric class probability estimation type function, $\mathcal{L}_\alpha$, and the resulting GAN system is termed $\mathcal{L}_\alpha$-GAN.
arXiv Detail & Related papers (2023-08-14T16:16:31Z) - Provably Efficient Offline Reinforcement Learning with Trajectory-Wise
Reward [66.81579829897392]
We propose a novel offline reinforcement learning algorithm called Pessimistic vAlue iteRaTion with rEward Decomposition (PARTED)
PARTED decomposes the trajectory return into per-step proxy rewards via least-squares-based reward redistribution, and then performs pessimistic value iteration based on the learned proxy reward.
To the best of our knowledge, PARTED is the first offline RL algorithm that is provably efficient in general MDP with trajectory-wise reward.
arXiv Detail & Related papers (2022-06-13T19:11:22Z) - $\alpha$-GAN: Convergence and Estimation Guarantees [7.493779672689531]
We prove a correspondence between the min-max optimization of general CPE loss function GANs and the minimization of associated $f$-divergences.
We then focus on $\alpha$-GAN, defined via the $\alpha$-loss, which interpolates several GANs and corresponds to the minimization of the Arimoto divergence.
arXiv Detail & Related papers (2022-05-12T23:26:51Z) - Realizing GANs via a Tunable Loss Function [7.455546102930911]
We introduce a tunable GAN, called $\alpha$-GAN, parameterized by $\alpha \in (0,\infty]$.
We show that $\alpha$-GAN is intimately related to the Arimoto divergence.
arXiv Detail & Related papers (2021-06-09T17:18:21Z) - Self Sparse Generative Adversarial Networks [73.590634413751]
Generative Adversarial Networks (GANs) are an unsupervised generative model that learns the data distribution through adversarial training.
We propose a Self Sparse Generative Adversarial Network (Self-Sparse GAN) that reduces the parameter space and alleviates the zero gradient problem.
arXiv Detail & Related papers (2021-01-26T04:49:12Z) - Cumulant GAN [17.4556035872983]
We propose a novel loss function for training Generative Adversarial Networks (GANs)
We show that the corresponding optimization problem is equivalent to R\'{e}nyi divergence minimization.
We experimentally demonstrate that image generation is more robust than with the Wasserstein GAN.
arXiv Detail & Related papers (2020-06-11T17:23:02Z) - Discriminator Contrastive Divergence: Semi-Amortized Generative Modeling
by Exploring Energy of the Discriminator [85.68825725223873]
Generative Adversarial Networks (GANs) have shown great promise in modeling high dimensional data.
We introduce the Discriminator Contrastive Divergence, which is well motivated by the property of WGAN's discriminator.
We demonstrate the benefits of significantly improved generation on both synthetic data and several real-world image generation benchmarks.
arXiv Detail & Related papers (2020-04-05T01:50:16Z) - Gaussian Error Linear Units (GELUs) [58.195342948092964]
We propose a neural network activation function that weights inputs by their value rather than gating them by their sign.
We find performance improvements across all considered computer vision, natural language processing, and speech tasks.
arXiv Detail & Related papers (2016-06-27T19:20:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.