$(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and
Integral Probability Metrics
- URL: http://arxiv.org/abs/2011.05953v3
- Date: Wed, 15 Sep 2021 14:25:24 GMT
- Title: $(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and
Integral Probability Metrics
- Authors: Jeremiah Birrell, Paul Dupuis, Markos A. Katsoulakis, Yannis Pantazis,
Luc Rey-Bellet
- Abstract summary: We develop a framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs).
We show that they can be expressed as a two-stage mass-redistribution/mass-transport process.
Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed, not-absolutely continuous sample distributions.
- Score: 6.221019624345409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a rigorous and general framework for constructing
information-theoretic divergences that subsume both $f$-divergences and
integral probability metrics (IPMs), such as the $1$-Wasserstein distance. We
prove under which assumptions these divergences, hereafter referred to as
$(f,\Gamma)$-divergences, provide a notion of `distance' between probability
measures and show that they can be expressed as a two-stage
mass-redistribution/mass-transport process. The $(f,\Gamma)$-divergences
inherit features from IPMs, such as the ability to compare distributions which
are not absolutely continuous, as well as from $f$-divergences, namely the
strict concavity of their variational representations and the ability to
control heavy-tailed distributions for particular choices of $f$. When
combined, these features establish a divergence with improved properties for
estimation, statistical learning, and uncertainty quantification applications.
Using statistical learning as an example, we demonstrate their advantage in
training generative adversarial networks (GANs) for heavy-tailed,
not-absolutely continuous sample distributions. We also show improved
performance and stability over gradient-penalized Wasserstein GAN in image
generation.
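For orientation, the two-stage structure described in the abstract can be sketched as follows. The notation below is ours ($\Gamma$ denotes the IPM test-function class, $f$ the convex function defining the $f$-divergence, and $f^*$ its Legendre transform), and the precise hypotheses under which these identities hold are stated in the paper:
$$
D_f^\Gamma(Q\|P) \;=\; \inf_{\eta}\Big\{ D_f(\eta\|P) + W^\Gamma(Q,\eta) \Big\},
\qquad
W^\Gamma(Q,\eta) \;=\; \sup_{g\in\Gamma}\big\{ \mathbb{E}_Q[g]-\mathbb{E}_\eta[g] \big\},
$$
i.e., mass is first redistributed from $P$ to an intermediate measure $\eta$ at $f$-divergence cost, then transported from $\eta$ to $Q$ at IPM cost. The variational form used for estimation from samples is
$$
D_f^\Gamma(Q\|P) \;=\; \sup_{g\in\Gamma}\Big\{ \mathbb{E}_Q[g] \;-\; \inf_{\nu\in\mathbb{R}}\big\{ \nu + \mathbb{E}_P[f^*(g-\nu)] \big\} \Big\}.
$$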
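For the KL choice $f(x)=x\log x$, the inner infimum over $\nu$ evaluates to $\log\mathbb{E}_P[e^{g}]$, so the quantity maximized by the critic becomes $\mathbb{E}_Q[g]-\log\mathbb{E}_P[e^{g}]$ over $g\in\Gamma$. The snippet below is a minimal NumPy sketch of estimating this objective from samples; the function name and toy distributions are illustrative only, and the constraint $g\in\Gamma$ (e.g., a Lipschitz bound) would have to be enforced separately, for instance via spectral normalization or a gradient penalty on the critic network.

```python
import numpy as np

def kl_gamma_objective(g_q, g_p):
    """Sample estimate of E_Q[g] - log E_P[exp(g)], the KL-based
    (f, Gamma)-variational objective for f(x) = x log x.

    g_q : critic values g(y_i) on samples y_i ~ Q (e.g., real data)
    g_p : critic values g(x_j) on samples x_j ~ P (e.g., generator output)

    Membership of g in Gamma (e.g., a Lipschitz bound) is assumed to be
    enforced elsewhere, e.g. by spectral normalization of the critic.
    """
    term_q = np.mean(g_q)                              # E_Q[g]
    m = np.max(g_p)                                    # stabilize log-sum-exp
    term_p = m + np.log(np.mean(np.exp(g_p - m)))      # log E_P[exp(g)]
    return term_q - term_p

# Toy check with the identity critic g(x) = x: heavy-tailed Q vs. Gaussian P.
rng = np.random.default_rng(0)
g_q = rng.standard_t(df=3, size=10_000)
g_p = rng.standard_normal(10_000)
print(kl_gamma_objective(g_q, g_p))
```

In a GAN training loop, `g_q` and `g_p` would be the constrained critic's outputs on a real batch and a generated batch, with the generator trained to minimize the same quantity.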
Related papers
- Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-Uhlenbeck is hard to beat [49.1574468325115]
This paper presents explicit non-asymptotic bounds on the forward diffusion error in total variation (TV).
We parametrise multi-modal data distributions in terms of the distance $R$ to their furthest modes and consider forward diffusions with additive and multiplicative noise.
arXiv Detail & Related papers (2024-08-25T10:28:31Z)
- Learning heavy-tailed distributions with Wasserstein-proximal-regularized $\alpha$-divergences [12.19634962193403]
We propose Wasserstein proximals of $\alpha$-divergences as suitable objective functionals for learning heavy-tailed distributions.
Heuristically, $\alpha$-divergences handle the heavy tails and Wasserstein proximals allow non-absolute continuity between distributions.
arXiv Detail & Related papers (2024-05-22T19:58:13Z)
- Gaussian-Smoothed Sliced Probability Divergences [15.123608776470077]
We show that smoothing and slicing preserve the metric property and the weak topology.
We also derive other properties, including continuity, of different divergences with respect to the smoothing parameter.
arXiv Detail & Related papers (2024-04-04T07:55:46Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
- Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
- Ensemble Multi-Quantiles: Adaptively Flexible Distribution Prediction for Uncertainty Quantification [4.728311759896569]
We propose a novel, succinct, and effective approach for distribution prediction to quantify uncertainty in machine learning.
It incorporates adaptively flexible distribution prediction of $\mathbb{P}(\mathbf{y}|\mathbf{X}=x)$ in regression tasks.
On extensive regression tasks from UCI datasets, we show that EMQ achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-11-26T11:45:32Z)
- Function-space regularized Rényi divergences [6.221019624345409]
We propose a new family of regularized Rényi divergences parametrized by a variational function space.
We prove several properties of these new divergences, showing that they interpolate between the classical Rényi divergences and IPMs.
We show that the proposed regularized R'enyi divergences inherit features from IPMs such as the ability to compare distributions that are not absolutely continuous.
arXiv Detail & Related papers (2022-10-10T19:18:04Z)
- A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
- GFlowNet Foundations [66.69854262276391]
Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context.
We show a number of additional theoretical properties of GFlowNets.
arXiv Detail & Related papers (2021-11-17T17:59:54Z)
- Implicit Distributional Reinforcement Learning [61.166030238490634]
An implicit distributional actor-critic (IDAC) is built on two deep generator networks (DGNs) and a semi-implicit actor (SIA) powered by a flexible policy distribution.
We observe IDAC outperforms state-of-the-art algorithms on representative OpenAI Gym environments.
arXiv Detail & Related papers (2020-07-13T02:52:18Z)
- Optimal Bounds between $f$-Divergences and Integral Probability Metrics [8.401473551081748]
Families of $f$-divergences and Integral Probability Metrics are widely used to quantify similarity between probability distributions.
We systematically study the relationship between these two families from the perspective of convex duality.
We obtain new bounds while also recovering in a unified manner well-known results, such as Hoeffding's lemma.
arXiv Detail & Related papers (2020-06-10T17:39:11Z)