A note on the unique properties of the Kullback--Leibler divergence for sampling via gradient flows
- URL: http://arxiv.org/abs/2507.04330v1
- Date: Sun, 06 Jul 2025 10:34:38 GMT
- Title: A note on the unique properties of the Kullback--Leibler divergence for sampling via gradient flows
- Authors: Francesca Romana Crucinio,
- Abstract summary: We consider the problem of sampling from a probability distribution $\pi$. We show that the Kullback--Leibler divergence is the only divergence in the family of Bregman divergences whose gradient flow with respect to many popular metrics does not require knowledge of the normalising constant of $\pi$.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of sampling from a probability distribution $\pi$. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise a divergence from $\pi$. The optimisation problem is normally solved through gradient flows in the space of probability distributions with an appropriate metric. We show that the Kullback--Leibler divergence is the only divergence in the family of Bregman divergences whose gradient flow w.r.t. many popular metrics does not require knowledge of the normalising constant of $\pi$.
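To illustrate why the normalising constant drops out (a sketch of a standard fact, not an excerpt from the paper): the Wasserstein-2 gradient flow of $\mathrm{KL}(\mu \,\|\, \pi)$ is the Fokker--Planck equation, and writing $\pi = \tilde\pi / Z$ the constant $Z$ disappears under the gradient:

```latex
\partial_t \mu_t
  = \nabla \cdot \Big( \mu_t \, \nabla \log \tfrac{\mu_t}{\pi} \Big)
  = \nabla \cdot \big( \mu_t \, \nabla \log \mu_t \big)
    - \nabla \cdot \big( \mu_t \, \nabla \log \tilde\pi \big),
\qquad \pi = \tilde\pi / Z,
```

since $\nabla \log \pi = \nabla \log \tilde\pi$. The paper's contribution is to show that, within the Bregman family, this cancellation singles out the KL divergence.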
Related papers
- Sequential Monte Carlo approximations of Wasserstein--Fisher--Rao gradient flows [0.0]
We consider several partial differential equations whose solution is a minimiser of the Kullback--Leibler divergence from $\pi$. We propose a novel algorithm to approximate the Wasserstein--Fisher--Rao flow of the Kullback--Leibler divergence.
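For context (one common form of this flow, a hedged sketch rather than the paper's exact formulation): the Wasserstein--Fisher--Rao gradient flow of $\mathrm{KL}(\mu \,\|\, \pi)$ combines a transport term and a birth--death (reaction) term,

```latex
\partial_t \mu_t
  = \underbrace{\nabla \cdot \Big( \mu_t \, \nabla \log \tfrac{\mu_t}{\pi} \Big)}_{\text{Wasserstein part}}
  \;-\;
  \underbrace{\mu_t \Big( \log \tfrac{\mu_t}{\pi}
      - \mathbb{E}_{\mu_t}\big[ \log \tfrac{\mu_t}{\pi} \big] \Big)}_{\text{Fisher--Rao part}} .
```

Note that the centring by $\mathbb{E}_{\mu_t}[\cdot]$ removes any additive constant in $\log(\mu_t/\pi)$, so this flow too only needs $\pi$ up to its normalising constant.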
arXiv Detail & Related papers (2025-06-06T09:24:46Z) - A Connection Between Learning to Reject and Bhattacharyya Divergences [57.942664964198286]
We consider learning a joint ideal distribution over both inputs and labels. We develop a link between rejection and thresholding different statistical divergences. In general, we find that rejecting via a Bhattacharyya divergence is less aggressive than Chow's Rule.
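As a reference point for Chow's Rule mentioned above (a minimal sketch with hypothetical inputs, not the paper's Bhattacharyya-based rule): predict the most probable class unless its estimated posterior probability fails to exceed $1 - c$, where $c$ is the rejection cost.

```python
import numpy as np

def chow_rule(posteriors, rejection_cost):
    """Classification with a reject option via Chow's Rule.

    posteriors     : (n_samples, n_classes) array of estimated class
                     probabilities p(y | x); rows sum to one.
    rejection_cost : cost c in [0, 0.5] of abstaining; reject whenever the
                     largest posterior does not exceed 1 - c.
    Returns predicted class indices, with -1 marking rejections.
    """
    confidence = posteriors.max(axis=1)
    predictions = posteriors.argmax(axis=1)
    predictions[confidence <= 1.0 - rejection_cost] = -1
    return predictions

# Toy usage with made-up posterior estimates.
probs = np.array([[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]])
print(chow_rule(probs, rejection_cost=0.3))  # [0, -1, 1]: the uncertain middle example is rejected
```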
arXiv Detail & Related papers (2025-05-08T14:18:42Z) - Statistical and Geometrical properties of regularized Kernel Kullback-Leibler divergence [7.273481485032721]
We study the statistical and geometrical properties of the Kullback-Leibler divergence with kernel covariance operators (KKL) introduced by Bach [2022]. Unlike the classical Kullback-Leibler (KL) divergence that involves density ratios, the KKL compares probability distributions through covariance operators (embeddings) in a reproducing kernel Hilbert space (RKHS). This novel divergence hence shares parallel but different aspects with both the standard Kullback-Leibler divergence between probability distributions and kernel embedding metrics such as the maximum mean discrepancy.
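For orientation (a hedged sketch of the objects involved, as I understand Bach's construction; the regularised variant studied in the paper differs in detail): with RKHS feature map $\phi$, the covariance operator of $p$ is $\Sigma_p = \mathbb{E}_{x \sim p}[\phi(x) \otimes \phi(x)]$, and the kernel KL divergence is a von Neumann relative entropy between such operators,

```latex
\mathrm{KKL}(p \,\|\, q)
  = \operatorname{tr}\!\big[ \Sigma_p \big( \log \Sigma_p - \log \Sigma_q \big) \big],
\qquad
\Sigma_p = \mathbb{E}_{x \sim p}\big[ \phi(x) \otimes \phi(x) \big].
```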
arXiv Detail & Related papers (2024-08-29T14:01:30Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - A Dimensionality Reduction Method for Finding Least Favorable Priors
with a Focus on Bregman Divergence [108.28566246421742]
This paper develops a dimensionality reduction method that allows us to move the optimization to a finite-dimensional setting with an explicit bound on the dimension.
In order to make progress on the problem, we restrict ourselves to Bayesian risks induced by a relatively large class of loss functions, namely Bregman divergences.
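For reference, since both the entry above and the headline paper restrict attention to Bregman divergences: the Bregman divergence generated by a differentiable convex function $\phi$ is

```latex
D_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla \phi(y),\, x - y \rangle ,
```

and the choice $\phi(x) = \sum_i x_i \log x_i$ (negative entropy) yields the generalised Kullback--Leibler divergence $D_\phi(x, y) = \sum_i x_i \log \frac{x_i}{y_i} - \sum_i x_i + \sum_i y_i$.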
arXiv Detail & Related papers (2022-02-23T16:22:28Z) - Unadjusted Langevin algorithm for sampling a mixture of weakly smooth
potentials [0.0]
We prove convergence guarantees under a Poincaré inequality or non-strong convexity outside a ball.
We also provide convergence in the $L_\beta$-Wasserstein metric for the smoothing potential.
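For reference (a minimal sketch of the plain unadjusted Langevin algorithm, not of the paper's weakly smooth mixture setting): ULA discretises the Langevin diffusion and, like the gradient flows above, only needs $\nabla \log \pi$, i.e. $\pi$ up to its normalising constant.

```python
import numpy as np

def ula(grad_log_target, x0, step_size, n_steps, rng=None):
    """Unadjusted Langevin algorithm:
    x_{k+1} = x_k + h * grad log pi(x_k) + sqrt(2h) * N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = x + step_size * grad_log_target(x) \
              + np.sqrt(2.0 * step_size) * rng.standard_normal(x.size)
        samples[k] = x
    return samples

# Toy target: standard Gaussian, so grad log pi(x) = -x.
chain = ula(lambda x: -x, x0=np.zeros(2), step_size=0.1, n_steps=5000)
print(chain.mean(axis=0), chain.var(axis=0))  # roughly 0 and 1, up to O(h) bias
```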
arXiv Detail & Related papers (2021-12-17T04:10:09Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
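To make the "no Metropolis-Hastings" idea concrete (a minimal sketch under simplifying assumptions of my own, a standard normal base and a geometric annealing path, not the paper's differentiable mini-batch construction):

```python
import numpy as np

def ais_ula(log_target, grad_log_target, dim, n_particles=500,
            n_temps=100, step_size=0.05, rng=None):
    """Annealed importance sampling from a standard normal base to an
    unnormalised target, using unadjusted Langevin transitions (no
    Metropolis-Hastings correction, hence a small discretisation bias).
    log_target / grad_log_target act row-wise on (n_particles, dim) arrays."""
    rng = np.random.default_rng() if rng is None else rng
    log_base = lambda x: -0.5 * np.sum(x**2, axis=-1) - 0.5 * dim * np.log(2 * np.pi)
    grad_log_base = lambda x: -x
    betas = np.linspace(0.0, 1.0, n_temps + 1)

    x = rng.standard_normal((n_particles, dim))   # exact samples from the base
    logw = np.zeros(n_particles)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # AIS weight update: ratio of successive annealed densities at the current x
        logw += (b - b_prev) * (log_target(x) - log_base(x))
        # ULA move targeting pi_b proportional to base^(1-b) * target^b
        grad = (1.0 - b) * grad_log_base(x) + b * grad_log_target(x)
        x = x + step_size * grad + np.sqrt(2.0 * step_size) * rng.standard_normal(x.shape)
    return x, logw

# Toy usage: unnormalised N(0, 4) target in 1D; true log Z = 0.5 * log(8 * pi).
xs, logw = ais_ula(lambda x: -np.sum(x**2, axis=-1) / 8.0, lambda x: -x / 4.0, dim=1)
log_z = np.log(np.mean(np.exp(logw - logw.max()))) + logw.max()
print(log_z, 0.5 * np.log(8 * np.pi))  # the two should be close
```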
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Variational Refinement for Importance Sampling Using the Forward
Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
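As background for the combination described above (a minimal self-normalised importance sampling sketch with a hypothetical variational proposal, not the paper's forward-KL refinement):

```python
import numpy as np

def snis(log_target, sample_proposal, log_proposal, n_samples=10_000, rng=None):
    """Self-normalised importance sampling: draw from a proposal q (e.g. a
    fitted variational approximation), weight by unnormalised target / q,
    and normalise the weights so unknown constants cancel."""
    rng = np.random.default_rng() if rng is None else rng
    x = sample_proposal(n_samples, rng)
    logw = log_target(x) - log_proposal(x)
    logw -= logw.max()                 # numerical stability
    w = np.exp(logw)
    return x, w / w.sum()

# Toy usage: unnormalised N(0, 1) target, N(0, 2^2) proposal; E[x^2] should be ~1.
x, w = snis(lambda x: -0.5 * x**2,
            lambda n, rng: 2.0 * rng.standard_normal(n),
            lambda x: -0.125 * x**2 - np.log(2.0))
print(np.sum(w * x**2))
```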
arXiv Detail & Related papers (2021-06-30T11:00:24Z) - On the Robustness to Misspecification of $\alpha$-Posteriors and Their
Variational Approximations [12.52149409594807]
$\alpha$-posteriors and their variational approximations distort standard posterior inference by downweighting the likelihood and introducing variational approximation errors.
We show that such distortions, if tuned appropriately, reduce the Kullback-Leibler (KL) divergence from the true, but perhaps infeasible, posterior distribution when there is potential parametric model misspecification.
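For reference (the standard definition, stated here with notation of my own choosing): the $\alpha$-posterior tempers the likelihood,

```latex
\pi_\alpha(\theta \mid x_{1:n}) \;\propto\; p(x_{1:n} \mid \theta)^{\alpha} \, \pi(\theta),
\qquad \alpha \in (0, 1],
```

so $\alpha = 1$ recovers the usual posterior and $\alpha < 1$ downweights the likelihood, as described above.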
arXiv Detail & Related papers (2021-04-16T19:11:53Z) - Integrable Nonparametric Flows [5.9774834479750805]
We introduce a method for reconstructing an infinitesimal normalizing flow given only an infinitesimal change to a probability distribution.
This reverses the conventional task of normalizing flows.
We discuss potential applications to problems in quantum Monte Carlo and machine learning.
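As background for the reconstruction task above (a hedged sketch of the relevant identity, not the paper's construction): a time-dependent velocity field $v_t$ transporting $\mu_t$ satisfies the continuity equation, so recovering an infinitesimal normalizing flow from a prescribed change $\partial_t \mu_t$ amounts to solving

```latex
\partial_t \mu_t + \nabla \cdot ( \mu_t \, v_t ) = 0
```

for $v_t$ given $\mu_t$ and $\partial_t \mu_t$.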
arXiv Detail & Related papers (2020-12-03T16:19:52Z) - Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
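To make the contrast in the summary concrete (a hedged sketch with notation of my own choosing): standard BQO maximises the plug-in Monte Carlo estimate built from the i.i.d. samples $w_1, \dots, w_N$, whereas a distributionally robust variant maximises a worst case over an ambiguity set $\mathcal{U}$ of distributions consistent with those samples,

```latex
\max_x \; \frac{1}{N} \sum_{i=1}^{N} f(x, w_i)
\qquad \text{versus} \qquad
\max_x \; \inf_{Q \in \mathcal{U}} \mathbb{E}_{w \sim Q}\big[ f(x, w) \big].
```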
arXiv Detail & Related papers (2020-01-19T12:00:33Z)