A note on the unique properties of the Kullback--Leibler divergence for sampling via gradient flows
- URL: http://arxiv.org/abs/2507.04330v1
- Date: Sun, 06 Jul 2025 10:34:38 GMT
- Title: A note on the unique properties of the Kullback--Leibler divergence for sampling via gradient flows
- Authors: Francesca Romana Crucinio,
- Abstract summary: We consider the problem of sampling from a probability distribution $\pi$. We show that the Kullback--Leibler divergence is the only divergence in the family of Bregman divergences whose gradient flow with respect to many popular metrics does not require knowledge of the normalising constant of $\pi$.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of sampling from a probability distribution $\pi$. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise a divergence from $\pi$. The optimisation problem is normally solved through gradient flows in the space of probability distributions with an appropriate metric. We show that the Kullback--Leibler divergence is the only divergence in the family of Bregman divergences whose gradient flow w.r.t. many popular metrics does not require knowledge of the normalising constant of $\pi$.
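To illustrate why the normalising constant drops out (a sketch of a standard fact, not an excerpt from the paper): the Wasserstein-2 gradient flow of $\mathrm{KL}(\mu \,\|\, \pi)$ is the Fokker--Planck equation, and writing $\pi = \tilde\pi / Z$ the constant $Z$ disappears under the gradient:

```latex
\partial_t \mu_t
  = \nabla \cdot \Big( \mu_t \, \nabla \log \tfrac{\mu_t}{\pi} \Big)
  = \nabla \cdot \big( \mu_t \, \nabla \log \mu_t \big)
    - \nabla \cdot \big( \mu_t \, \nabla \log \tilde\pi \big),
\qquad \pi = \tilde\pi / Z,
```

since $\nabla \log \pi = \nabla \log \tilde\pi$. The paper's contribution is to show that, within the Bregman family, this cancellation singles out the KL divergence.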
Related papers
- Sequential Monte Carlo approximations of Wasserstein--Fisher--Rao gradient flows [0.0]
We consider several partial differential equations whose solution is a minimiser of the Kullback--Leibler divergence from $\pi$. We propose a novel algorithm to approximate the Wasserstein--Fisher--Rao flow of the Kullback--Leibler divergence.
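For context (one common form of this flow, a hedged sketch rather than the paper's exact formulation): the Wasserstein--Fisher--Rao gradient flow of $\mathrm{KL}(\mu \,\|\, \pi)$ combines a transport term and a birth--death (reaction) term,

```latex
\partial_t \mu_t
  = \underbrace{\nabla \cdot \Big( \mu_t \, \nabla \log \tfrac{\mu_t}{\pi} \Big)}_{\text{Wasserstein part}}
  \;-\;
  \underbrace{\mu_t \Big( \log \tfrac{\mu_t}{\pi}
      - \mathbb{E}_{\mu_t}\big[ \log \tfrac{\mu_t}{\pi} \big] \Big)}_{\text{Fisher--Rao part}} .
```

Note that the centring by $\mathbb{E}_{\mu_t}[\cdot]$ removes any additive constant in $\log(\mu_t/\pi)$, so this flow too only needs $\pi$ up to its normalising constant.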
arXiv Detail & Related papers (2025-06-06T09:24:46Z) - A Connection Between Learning to Reject and Bhattacharyya Divergences [57.942664964198286]
We consider learning a joint ideal distribution over both inputs and labels. We develop a link between rejection and thresholding different statistical divergences. In general, we find that rejecting via a Bhattacharyya divergence is less aggressive than Chow's Rule.
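As a reference point for Chow's Rule mentioned above (a minimal sketch with hypothetical inputs, not the paper's Bhattacharyya-based rule): predict the most probable class unless its estimated posterior probability fails to exceed $1 - c$, where $c$ is the rejection cost.

```python
import numpy as np

def chow_rule(posteriors, rejection_cost):
    """Classification with a reject option via Chow's Rule.

    posteriors     : (n_samples, n_classes) array of estimated class
                     probabilities p(y | x); rows sum to one.
    rejection_cost : cost c in [0, 0.5] of abstaining; reject whenever the
                     largest posterior does not exceed 1 - c.
    Returns predicted class indices, with -1 marking rejections.
    """
    confidence = posteriors.max(axis=1)
    predictions = posteriors.argmax(axis=1)
    predictions[confidence <= 1.0 - rejection_cost] = -1
    return predictions

# Toy usage with made-up posterior estimates.
probs = np.array([[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]])
print(chow_rule(probs, rejection_cost=0.3))  # [0, -1, 1]: the uncertain middle example is rejected
```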
arXiv Detail & Related papers (2025-05-08T14:18:42Z) - Statistical and Geometrical properties of regularized Kernel Kullback-Leibler divergence [7.273481485032721]
We study the statistical and geometrical properties of the Kullback-Leibler divergence with kernel covariance operators (KKL) introduced by Bach [2022]. Unlike the classical Kullback-Leibler (KL) divergence that involves density ratios, the KKL compares probability distributions through covariance operators (embeddings) in a reproducing kernel Hilbert space (RKHS). This novel divergence hence shares parallel but different aspects with both the standard Kullback-Leibler divergence between probability distributions and kernel embedding metrics such as the maximum mean discrepancy.
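For orientation (a hedged sketch of the objects involved, as I understand Bach's construction; the regularised variant studied in the paper differs in detail): with RKHS feature map $\phi$, the covariance operator of $p$ is $\Sigma_p = \mathbb{E}_{x \sim p}[\phi(x) \otimes \phi(x)]$, and the kernel KL divergence is a von Neumann relative entropy between such operators,

```latex
\mathrm{KKL}(p \,\|\, q)
  = \operatorname{tr}\!\big[ \Sigma_p \big( \log \Sigma_p - \log \Sigma_q \big) \big],
\qquad
\Sigma_p = \mathbb{E}_{x \sim p}\big[ \phi(x) \otimes \phi(x) \big].
```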
arXiv Detail & Related papers (2024-08-29T14:01:30Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - A Dimensionality Reduction Method for Finding Least Favorable Priors
with a Focus on Bregman Divergence [108.28566246421742]
This paper develops a dimensionality reduction method that allows us to move the optimization to a finite-dimensional setting with an explicit bound on the dimension.
In order to make progress on the problem, we restrict ourselves to Bayesian risks induced by a relatively large class of loss functions, namely Bregman divergences.
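For reference, since both the entry above and the headline paper restrict attention to Bregman divergences: the Bregman divergence generated by a differentiable convex function $\phi$ is

```latex
D_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla \phi(y),\, x - y \rangle ,
```

and the choice $\phi(x) = \sum_i x_i \log x_i$ (negative entropy) yields the generalised Kullback--Leibler divergence $D_\phi(x, y) = \sum_i x_i \log \frac{x_i}{y_i} - \sum_i x_i + \sum_i y_i$.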
arXiv Detail & Related papers (2022-02-23T16:22:28Z) - Unadjusted Langevin algorithm for sampling a mixture of weakly smooth
potentials [0.0]
We prove convergence guarantees under a Poincaré inequality or non-strong convexity outside a ball.
We also provide convergence in the $L_\beta$-Wasserstein metric for the smoothing potential.
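For reference (a minimal sketch of the plain unadjusted Langevin algorithm, not of the paper's weakly smooth mixture setting): ULA discretises the Langevin diffusion and, like the gradient flows above, only needs $\nabla \log \pi$, i.e. $\pi$ up to its normalising constant.

```python
import numpy as np

def ula(grad_log_target, x0, step_size, n_steps, rng=None):
    """Unadjusted Langevin algorithm:
    x_{k+1} = x_k + h * grad log pi(x_k) + sqrt(2h) * N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = x + step_size * grad_log_target(x) \
              + np.sqrt(2.0 * step_size) * rng.standard_normal(x.size)
        samples[k] = x
    return samples

# Toy target: standard Gaussian, so grad log pi(x) = -x.
chain = ula(lambda x: -x, x0=np.zeros(2), step_size=0.1, n_steps=5000)
print(chain.mean(axis=0), chain.var(axis=0))  # roughly 0 and 1, up to O(h) bias
```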
arXiv Detail & Related papers (2021-12-17T04:10:09Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
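To make the "no Metropolis-Hastings" idea concrete (a minimal sketch under simplifying assumptions of my own, a standard normal base and a geometric annealing path, not the paper's differentiable mini-batch construction):

```python
import numpy as np

def ais_ula(log_target, grad_log_target, dim, n_particles=500,
            n_temps=100, step_size=0.05, rng=None):
    """Annealed importance sampling from a standard normal base to an
    unnormalised target, using unadjusted Langevin transitions (no
    Metropolis-Hastings correction, hence a small discretisation bias).
    log_target / grad_log_target act row-wise on (n_particles, dim) arrays."""
    rng = np.random.default_rng() if rng is None else rng
    log_base = lambda x: -0.5 * np.sum(x**2, axis=-1) - 0.5 * dim * np.log(2 * np.pi)
    grad_log_base = lambda x: -x
    betas = np.linspace(0.0, 1.0, n_temps + 1)

    x = rng.standard_normal((n_particles, dim))   # exact samples from the base
    logw = np.zeros(n_particles)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # AIS weight update: ratio of successive annealed densities at the current x
        logw += (b - b_prev) * (log_target(x) - log_base(x))
        # ULA move targeting pi_b proportional to base^(1-b) * target^b
        grad = (1.0 - b) * grad_log_base(x) + b * grad_log_target(x)
        x = x + step_size * grad + np.sqrt(2.0 * step_size) * rng.standard_normal(x.shape)
    return x, logw

# Toy usage: unnormalised N(0, 4) target in 1D; true log Z = 0.5 * log(8 * pi).
xs, logw = ais_ula(lambda x: -np.sum(x**2, axis=-1) / 8.0, lambda x: -x / 4.0, dim=1)
log_z = np.log(np.mean(np.exp(logw - logw.max()))) + logw.max()
print(log_z, 0.5 * np.log(8 * np.pi))  # the two should be close
```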
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Variational Refinement for Importance Sampling Using the Forward
Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
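As background for the combination described above (a minimal self-normalised importance sampling sketch with a hypothetical variational proposal, not the paper's forward-KL refinement):

```python
import numpy as np

def snis(log_target, sample_proposal, log_proposal, n_samples=10_000, rng=None):
    """Self-normalised importance sampling: draw from a proposal q (e.g. a
    fitted variational approximation), weight by unnormalised target / q,
    and normalise the weights so unknown constants cancel."""
    rng = np.random.default_rng() if rng is None else rng
    x = sample_proposal(n_samples, rng)
    logw = log_target(x) - log_proposal(x)
    logw -= logw.max()                 # numerical stability
    w = np.exp(logw)
    return x, w / w.sum()

# Toy usage: unnormalised N(0, 1) target, N(0, 2^2) proposal; E[x^2] should be ~1.
x, w = snis(lambda x: -0.5 * x**2,
            lambda n, rng: 2.0 * rng.standard_normal(n),
            lambda x: -0.125 * x**2 - np.log(2.0))
print(np.sum(w * x**2))
```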
arXiv Detail & Related papers (2021-06-30T11:00:24Z) - On the Robustness to Misspecification of $\alpha$-Posteriors and Their
Variational Approximations [12.52149409594807]
$\alpha$-posteriors and their variational approximations distort standard posterior inference by downweighting the likelihood and introducing variational approximation errors.
We show that such distortions, if tuned appropriately, reduce the Kullback-Leibler (KL) divergence from the true, but perhaps infeasible, posterior distribution when there is potential parametric model misspecification.
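For reference (the standard definition, stated here with notation of my own choosing): the $\alpha$-posterior tempers the likelihood,

```latex
\pi_\alpha(\theta \mid x_{1:n}) \;\propto\; p(x_{1:n} \mid \theta)^{\alpha} \, \pi(\theta),
\qquad \alpha \in (0, 1],
```

so $\alpha = 1$ recovers the usual posterior and $\alpha < 1$ downweights the likelihood, as described above.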
arXiv Detail & Related papers (2021-04-16T19:11:53Z) - Integrable Nonparametric Flows [5.9774834479750805]
We introduce a method for reconstructing an infinitesimal normalizing flow given only an infinitesimal change to a probability distribution.
This reverses the conventional task of normalizing flows.
We discuss potential applications to problems in quantum Monte Carlo and machine learning.
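As background for the reconstruction task above (a hedged sketch of the relevant identity, not the paper's construction): a time-dependent velocity field $v_t$ transporting $\mu_t$ satisfies the continuity equation, so recovering an infinitesimal normalizing flow from a prescribed change $\partial_t \mu_t$ amounts to solving

```latex
\partial_t \mu_t + \nabla \cdot ( \mu_t \, v_t ) = 0
```

for $v_t$ given $\mu_t$ and $\partial_t \mu_t$.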
arXiv Detail & Related papers (2020-12-03T16:19:52Z) - Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
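To make the contrast in the summary concrete (a hedged sketch with notation of my own choosing): standard BQO maximises the plug-in Monte Carlo estimate built from the i.i.d. samples $w_1, \dots, w_N$, whereas a distributionally robust variant maximises a worst case over an ambiguity set $\mathcal{U}$ of distributions consistent with those samples,

```latex
\max_x \; \frac{1}{N} \sum_{i=1}^{N} f(x, w_i)
\qquad \text{versus} \qquad
\max_x \; \inf_{Q \in \mathcal{U}} \mathbb{E}_{w \sim Q}\big[ f(x, w) \big].
```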
arXiv Detail & Related papers (2020-01-19T12:00:33Z)