Least $k$th-Order and R\'{e}nyi Generative Adversarial Networks
- URL: http://arxiv.org/abs/2006.02479v3
- Date: Thu, 11 Mar 2021 23:37:47 GMT
- Title: Least $k$th-Order and R\'{e}nyi Generative Adversarial Networks
- Authors: Himesh Bhatia, William Paul, Fady Alajaji, Bahman Gharesifard,
Philippe Burlina
- Abstract summary: Experimental results indicate that the proposed loss functions, applied to the MNIST and CelebA datasets, confer performance benefits by virtue of the extra degrees of freedom provided by the parameters $k$ and $\alpha$, respectively.
While it was applied to GANs in this study, the proposed approach is generic and can be used in other applications of information theory to deep learning, e.g., the issues of fairness or privacy in artificial intelligence.
- Score: 12.13405065406781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the use of parametrized families of information-theoretic
measures to generalize the loss functions of generative adversarial networks
(GANs) with the objective of improving performance. A new generator loss
function, called least $k$th-order GAN (L$k$GAN), is first introduced,
generalizing the least squares GANs (LSGANs) by using a $k$th order absolute
error distortion measure with $k \geq 1$ (which recovers the LSGAN loss
function when $k=2$). It is shown that minimizing this generalized loss
function under an (unconstrained) optimal discriminator is equivalent to
minimizing the $k$th-order Pearson-Vajda divergence. Another novel GAN
generator loss function is next proposed in terms of R\'{e}nyi cross-entropy
functionals with order $\alpha >0$, $\alpha\neq 1$. It is demonstrated that
this R\'{e}nyi-centric generalized loss function, which provably reduces to the
original GAN loss function as $\alpha\to1$, preserves the equilibrium point
satisfied by the original GAN based on the Jensen-R\'{e}nyi divergence, a
natural extension of the Jensen-Shannon divergence.
Experimental results indicate that the proposed loss functions, applied to
the MNIST and CelebA datasets, under both DCGAN and StyleGAN architectures,
confer performance benefits by virtue of the extra degrees of freedom provided
by the parameters $k$ and $\alpha$, respectively. More specifically,
experiments show improvements with regard to the quality of the generated
images as measured by the Fr\'echet Inception Distance (FID) score and training
stability. While it was applied to GANs in this study, the proposed approach is
generic and can be used in other applications of information theory to deep
learning, e.g., the issues of fairness or privacy in artificial intelligence.
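To make the two parametrized loss families concrete, below is a minimal NumPy sketch of (i) a $k$th-order absolute-error generator loss, which reduces to the least-squares (LSGAN) generator loss at $k=2$, and (ii) a R\'{e}nyi cross-entropy of order $\alpha$ that collapses to the ordinary (Shannon) cross-entropy as $\alpha \to 1$, mirroring how the R\'{e}nyi-centric loss recovers the original GAN loss. The target label, constants, and function names below are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch under the assumptions stated above; not the paper's exact implementation.
import numpy as np

def lk_generator_loss(d_fake, k=2.0, target=1.0):
    """LkGAN-style generator loss: k-th order absolute error between the
    discriminator scores on generated samples and an assumed 'real' target.
    k = 2 recovers the familiar least-squares (LSGAN) generator loss."""
    return np.mean(np.abs(d_fake - target) ** k)

def renyi_cross_entropy(p, q, alpha, eps=1e-12):
    """Renyi cross-entropy of order alpha (alpha > 0, alpha != 1) between
    discrete distributions p and q; it tends to the Shannon cross-entropy
    as alpha -> 1, which is the limit in which the Renyi-centric GAN loss
    recovers the original GAN loss."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    if np.isclose(alpha, 1.0):
        return -np.sum(p * np.log(q + eps))  # Shannon limit
    return np.log(np.sum(p * (q + eps) ** (alpha - 1.0))) / (1.0 - alpha)

# Toy usage: discriminator scores on a batch of generated images.
d_fake = np.array([0.1, 0.4, 0.7, 0.2])
print(lk_generator_loss(d_fake, k=2.0))   # LSGAN special case (k = 2)
print(lk_generator_loss(d_fake, k=1.5))   # a fractional-order variant
p, q = np.array([0.5, 0.5]), np.array([0.3, 0.7])
print(renyi_cross_entropy(p, q, alpha=0.999))  # approx. equal to the alpha = 1 value
print(renyi_cross_entropy(p, q, alpha=1.0))    # Shannon cross-entropy
```

In an actual training loop, $k$ or $\alpha$ would simply act as an extra tunable hyperparameter alongside the usual learning-rate and architecture choices, which is the "extra degree of freedom" the abstract credits for the FID and stability gains.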
Related papers
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametricized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Addressing GAN Training Instabilities via Tunable Classification Losses [8.151943266391493]
Generative adversarial networks (GANs) allow generating synthetic data with formal guarantees.
We show that all symmetric $f$-divergences are equivalent in convergence.
We also highlight the value of tuning $(\alpha_D,\alpha_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring.
arXiv Detail & Related papers (2023-10-27T17:29:07Z) - A Unifying Generator Loss Function for Generative Adversarial Networks [5.5575224613422725]
A unifying $\alpha$-parametrized generator loss function is introduced for a dual-objective generative adversarial network (GAN).
The generator loss function is based on a symmetric class probability estimation type function, $\mathcal{L}_\alpha$, and the resulting GAN system is termed $\mathcal{L}_\alpha$-GAN.
arXiv Detail & Related papers (2023-08-14T16:16:31Z) - Provably Efficient Offline Reinforcement Learning with Trajectory-Wise
Reward [66.81579829897392]
We propose a novel offline reinforcement learning algorithm called Pessimistic vAlue iteRaTion with rEward Decomposition (PARTED)
PARTED decomposes the trajectory return into per-step proxy rewards via least-squares-based reward redistribution, and then performs pessimistic value iteration based on the learned proxy reward.
To the best of our knowledge, PARTED is the first offline RL algorithm that is provably efficient in general MDP with trajectory-wise reward.
arXiv Detail & Related papers (2022-06-13T19:11:22Z) - $\alpha$-GAN: Convergence and Estimation Guarantees [7.493779672689531]
We prove a correspondence between the min-max optimization of general CPE loss function GANs and the minimization of associated $f$-divergences.
We then focus on $\alpha$-GAN, defined via the $\alpha$-loss, which interpolates several GANs and corresponds to the minimization of the Arimoto divergence.
arXiv Detail & Related papers (2022-05-12T23:26:51Z) - Realizing GANs via a Tunable Loss Function [7.455546102930911]
We introduce a tunable GAN, called $\alpha$-GAN, parameterized by $\alpha \in (0,\infty]$.
We show that $\alpha$-GAN is intimately related to the Arimoto divergence.
arXiv Detail & Related papers (2021-06-09T17:18:21Z) - Self Sparse Generative Adversarial Networks [73.590634413751]
Generative Adversarial Networks (GANs) are an unsupervised generative model that learns the data distribution through adversarial training.
We propose a Self Sparse Generative Adversarial Network (Self-Sparse GAN) that reduces the parameter space and alleviates the zero gradient problem.
arXiv Detail & Related papers (2021-01-26T04:49:12Z) - Cumulant GAN [17.4556035872983]
We propose a novel loss function for training Generative Adversarial Networks (GANs)
We show that the corresponding optimization problem is equivalent to R\'{e}nyi divergence minimization.
We experimentally demonstrate that image generation is more robust than with the Wasserstein GAN.
arXiv Detail & Related papers (2020-06-11T17:23:02Z) - Discriminator Contrastive Divergence: Semi-Amortized Generative Modeling
by Exploring Energy of the Discriminator [85.68825725223873]
Generative Adversarial Networks (GANs) have shown great promise in modeling high dimensional data.
We introduce the Discriminator Contrastive Divergence, which is well motivated by the property of WGAN's discriminator.
We demonstrate the benefits of significantly improved generation on both synthetic data and several real-world image generation benchmarks.
arXiv Detail & Related papers (2020-04-05T01:50:16Z) - Gaussian Error Linear Units (GELUs) [58.195342948092964]
We propose a neural network activation function that weights inputs by their value rather than gating them by their sign.
We find performance improvements across all considered computer vision, natural language processing, and speech tasks.
arXiv Detail & Related papers (2016-06-27T19:20:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.