Function-space regularized Rényi divergences
- URL: http://arxiv.org/abs/2210.04974v1
- Date: Mon, 10 Oct 2022 19:18:04 GMT
- Title: Function-space regularized Rényi divergences
- Authors: Jeremiah Birrell, Yannis Pantazis, Paul Dupuis, Markos A. Katsoulakis,
Luc Rey-Bellet
- Abstract summary: We propose a new family of regularized Rényi divergences parametrized by a variational function space.
We prove several properties of these new divergences, showing that they interpolate between the classical Rényi divergences and IPMs.
We show that the proposed regularized Rényi divergences inherit features from IPMs such as the ability to compare distributions that are not absolutely continuous.
- Score: 6.221019624345409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new family of regularized Rényi divergences parametrized not
only by the order $\alpha$ but also by a variational function space. These new
objects are defined by taking the infimal convolution of the standard Rényi
divergence with the integral probability metric (IPM) associated with the
chosen function space. We derive a novel dual variational representation that
can be used to construct numerically tractable divergence estimators. This
representation avoids risk-sensitive terms and therefore exhibits lower
variance, making it well-behaved when $\alpha>1$; this addresses a notable
weakness of prior approaches. We prove several properties of these new
divergences, showing that they interpolate between the classical Rényi
divergences and IPMs. We also study the $\alpha\to\infty$ limit, which leads to
a regularized worst-case regret and a new variational representation in the
classical case. Moreover, we show that the proposed regularized Rényi
divergences inherit features from IPMs such as the ability to compare
distributions that are not absolutely continuous, e.g., empirical measures and
distributions with low-dimensional support. We present numerical results on
both synthetic and real datasets, showing the utility of these new divergences
in both estimation and GAN training applications; in particular, we demonstrate
significantly reduced variance and improved training performance.
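For orientation, here is a short notation sketch: the first line records the standard Rényi divergence and the IPM over a function space $\Gamma$; the second line is one plausible infimal-convolution combination, written by analogy with the $(f,\Gamma)$-divergences cited under related papers, and should be read as assumed notation rather than the paper's exact definition (the infimum runs over intermediate probability measures $\eta$).

\[
  R_\alpha(Q\|P) \;=\; \frac{1}{\alpha-1}\,\log \mathbb{E}_P\!\left[\left(\frac{dQ}{dP}\right)^{\alpha}\right],
  \qquad
  W^\Gamma(Q,P) \;=\; \sup_{g\in\Gamma}\bigl\{\mathbb{E}_Q[g]-\mathbb{E}_P[g]\bigr\},
\]
\[
  % assumed infimal-convolution form (notation hypothetical):
  R_\alpha^{\Gamma}(Q\|P) \;=\; \inf_{\eta}\bigl\{\,R_\alpha(\eta\|P) + W^\Gamma(Q,\eta)\,\bigr\}.
\]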
Related papers
- A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Learning to Transfer with von Neumann Conditional Divergence [14.926485055255942]
We introduce the recently proposed von Neumann conditional divergence to improve the transferability across multiple domains.
We design novel learning objectives assuming those source tasks are observed either simultaneously or sequentially.
In both scenarios, we obtain favorable performance against state-of-the-art methods in terms of smaller generalization error on new tasks and less catastrophic forgetting on source tasks (in the sequential setup).
arXiv Detail & Related papers (2021-08-07T22:18:23Z) - Rényi divergence inequalities via interpolation, with applications to
generalised entropic uncertainty relations [91.3755431537592]
We investigate quantum Rényi entropic quantities, specifically those derived from the 'sandwiched' divergence.
We present Rényi mutual information decomposition rules, a new approach to the Rényi conditional entropy tripartite chain rules, and a more general bipartite comparison.
arXiv Detail & Related papers (2021-06-19T04:06:23Z) - A unified view of likelihood ratio and reparameterization gradients [91.4645013545015]
We use a first principles approach to explain that LR and RP are alternative methods of keeping track of the movement of probability mass.
We show that the space of all possible estimators combining LR and RP can be completely parameterized by a flow field.
We prove that there cannot exist a single-sample estimator of this type outside our space, thus clarifying where we should be searching for better Monte Carlo gradient estimators.
arXiv Detail & Related papers (2021-05-31T11:53:08Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that of ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - $(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and
Integral Probability Metrics [6.221019624345409]
We develop a framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs).
We show that they can be expressed as a two-stage mass-redistribution/mass-transport process.
Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed sample distributions that are not absolutely continuous.
arXiv Detail & Related papers (2020-11-11T18:17:09Z) - Learning Invariant Representations and Risks for Semi-supervised Domain
Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z) - Variational Representations and Neural Network Estimation of Rényi
Divergences [4.2896536463351]
We derive a new variational formula for the Rényi family of divergences, $R_\alpha(Q\|P)$, between probability measures $Q$ and $P$.
By applying this theory to neural-network estimators, we show that if a neural network family satisfies one of several strengthened versions of the universal approximation property, then the corresponding Rényi divergence estimator is consistent. (A brief numerical sketch of the $\alpha>1$ variance issue noted in the abstract above appears after this list.)
arXiv Detail & Related papers (2020-07-07T22:34:30Z) - Cumulant GAN [17.4556035872983]
We propose a novel loss function for training Generative Adversarial Networks (GANs).
We show that the corresponding optimization problem is equivalent to Rényi divergence minimization.
We experimentally demonstrate that image generation is more robust than with Wasserstein GAN.
arXiv Detail & Related papers (2020-06-11T17:23:02Z) - Optimal Bounds between $f$-Divergences and Integral Probability Metrics [8.401473551081748]
Families of $f$-divergences and Integral Probability Metrics are widely used to quantify similarity between probability distributions.
We systematically study the relationship between these two families from the perspective of convex duality.
We obtain new bounds while also recovering, in a unified manner, well-known results such as Hoeffding's lemma.
arXiv Detail & Related papers (2020-06-10T17:39:11Z)
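The following is a minimal sketch, not taken from any of the papers above, of the variance issue referenced in the abstract: the naive Monte Carlo estimator of the classical Rényi divergence averages the risk-sensitive quantity $(dQ/dP)^\alpha$ under $P$, which becomes heavy-tailed for $\alpha>1$. The equal-variance Gaussian example, sample sizes, and all parameter choices are hypothetical; the closed form $\alpha\,(\mu_Q-\mu_P)^2/(2\sigma^2)$ serves only as a reference value.

# Minimal sketch (not from the paper): the naive likelihood-ratio Monte Carlo
# estimator of the classical Renyi divergence degrades for alpha > 1 because it
# averages the heavy-tailed quantity (dQ/dP)^alpha under P.
# The Gaussian example and all parameter choices are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def renyi_gaussian_exact(mu_q, mu_p, sigma, alpha):
    # Equal-variance Gaussians: D_alpha = alpha * (mu_q - mu_p)^2 / (2 * sigma^2)
    return alpha * (mu_q - mu_p) ** 2 / (2.0 * sigma ** 2)

def renyi_mc_naive(mu_q, mu_p, sigma, alpha, n):
    # D_alpha(Q||P) = 1/(alpha-1) * log E_P[(q(X)/p(X))^alpha], with X ~ P
    x = rng.normal(mu_p, sigma, size=n)
    log_ratio = ((x - mu_p) ** 2 - (x - mu_q) ** 2) / (2.0 * sigma ** 2)  # log q(x)/p(x)
    return np.log(np.mean(np.exp(alpha * log_ratio))) / (alpha - 1.0)

mu_q, mu_p, sigma, n, reps = 2.0, 0.0, 1.0, 10_000, 200
for alpha in (0.5, 2.0, 4.0):
    estimates = [renyi_mc_naive(mu_q, mu_p, sigma, alpha, n) for _ in range(reps)]
    print(f"alpha={alpha}: exact={renyi_gaussian_exact(mu_q, mu_p, sigma, alpha):.2f}  "
          f"MC mean={np.mean(estimates):.2f}  MC std={np.std(estimates):.2f}")

In typical runs the $\alpha=0.5$ estimates concentrate near the exact value, while the $\alpha=2$ and $\alpha=4$ estimates spread widely and sit well below it, since a finite sample rarely captures the tail of $(dQ/dP)^\alpha$. This is the kind of risk-sensitive behavior that the dual representation proposed in the main paper is designed to avoid.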
This list is automatically generated from the titles and abstracts of the papers on this site.