KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint
Support
- URL: http://arxiv.org/abs/2106.08929v1
- Date: Wed, 16 Jun 2021 16:37:43 GMT
- Title: KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint
Support
- Authors: Pierre Glaser, Michael Arbel, Arthur Gretton
- Abstract summary: We study the gradient flow for a relaxed approximation to the Kullback-Leibler divergence between a moving source and a fixed target distribution.
This approximation, termed the KALE (KL approximate lower-bound estimator), solves a regularized version of the Fenchel dual problem defining the KL over a restricted class of functions.
- Score: 27.165565512841656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the gradient flow for a relaxed approximation to the
Kullback-Leibler (KL) divergence between a moving source and a fixed target
distribution. This approximation, termed the KALE (KL approximate lower-bound
estimator), solves a regularized version of the Fenchel dual problem defining
the KL over a restricted class of functions. When using a Reproducing Kernel
Hilbert Space (RKHS) to define the function class, we show that the KALE
continuously interpolates between the KL and the Maximum Mean Discrepancy
(MMD). Like the MMD and other Integral Probability Metrics, the KALE remains
well defined for mutually singular distributions. Nonetheless, the KALE
inherits from the limiting KL a greater sensitivity to mismatch in the support
of the distributions, compared with the MMD. These two properties make the KALE
gradient flow particularly well suited when the target distribution is
supported on a low-dimensional manifold. Under an assumption of sufficient
smoothness of the trajectories, we show the global convergence of the KALE
flow. We propose a particle implementation of the flow given initial samples
from the source and the target distribution, which we use to empirically
confirm the KALE's properties.
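The abstract describes a particle implementation of the KALE gradient flow: a regularized Fenchel dual problem defines a witness function in an RKHS, and the source particles are transported along the witness's gradient. The following is a minimal NumPy sketch of one such scheme, for illustration only. Everything beyond what the abstract states is an assumption of this sketch: the dual form KL(P||Q) = sup_f E_P[f] - E_Q[e^f - 1], the Gaussian kernel, the representer-theorem parametrization of the witness, the gradient-ascent inner solver, the (1 + lambda) scaling of the particle velocity, and all hyperparameter values.

```python
# Minimal sketch of a KALE-style particle flow (illustrative; not the authors' reference code).
# Assumed here: Fenchel dual KL(P||Q) = sup_f E_P[f] - E_Q[exp(f) - 1], an RKHS penalty
# (lam/2)||f||_H^2 with a Gaussian kernel, a representer parametrization
# f(x) = sum_i alpha_i k(z_i, x) over the pooled samples, and a gradient-ascent inner solver.
import numpy as np


def gaussian_kernel(A, B, sigma=2.0):
    """Pairwise Gaussian kernel k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def fit_witness(X, Y, lam=1.0, sigma=2.0, steps=300, lr=0.01):
    """Gradient ascent on the regularized dual objective
    J(alpha) = mean_X f - mean_Y (exp(f) - 1) - (lam/2) alpha^T K alpha,
    with f = K alpha evaluated at the pooled samples Z = X union Y."""
    Z = np.concatenate([X, Y], axis=0)
    K = gaussian_kernel(Z, Z, sigma)
    n, m = len(X), len(Y)
    alpha = np.zeros(len(Z))
    for _ in range(steps):
        f = K @ alpha
        # dJ/df: +1/n at source points, -exp(f)/m at target points
        grad_f = np.concatenate([np.ones(n) / n, -np.exp(f[n:]) / m])
        alpha += lr * (K @ grad_f - lam * (K @ alpha))
    return Z, alpha


def witness_gradients(X, Z, alpha, sigma=2.0):
    """grad_x f(x) for f(x) = sum_i alpha_i k(z_i, x), evaluated at every row x of X."""
    Kxz = gaussian_kernel(X, Z, sigma)                      # shape (len(X), len(Z))
    diff = Z[None, :, :] - X[:, None, :]                    # z_i - x
    return ((alpha * Kxz)[:, :, None] * diff).sum(1) / sigma ** 2


def kale_flow(X0, Y, n_outer=200, step=0.2, lam=1.0, sigma=2.0):
    """Move source particles along -(1 + lam) * grad f*, refitting the witness each step.
    (The (1 + lam) velocity scaling and all hyperparameters are assumptions of this sketch.)"""
    X = X0.copy()
    for _ in range(n_outer):
        Z, alpha = fit_witness(X, Y, lam=lam, sigma=sigma)
        X -= step * (1.0 + lam) * witness_gradients(X, Z, alpha, sigma)
    return X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = rng.normal(loc=2.0, size=(100, 2))     # fixed target samples
    X0 = rng.normal(loc=-2.0, size=(100, 2))   # initial source particles
    X = kale_flow(X0, Y)
    print("mean of transported particles:", X.mean(axis=0))
```

Because the kernel witness is defined everywhere in space, the particle update remains well defined even when the source particles and target samples have disjoint support, which is exactly the regime the abstract highlights.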
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependence for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
- Theoretical guarantees in KL for Diffusion Flow Matching [9.618473763561418]
Flow Matching (FM) aims to bridge in finite time the target distribution $\nu^\star$ with an auxiliary distribution $\mu$.
We obtain non-asymptotics guarantees for Diffusion Flow Matching (DFM) models using as bridge the conditional distribution associated with the Brownian motion.
arXiv Detail & Related papers (2024-09-12T15:19:00Z)
- Statistical and Geometrical properties of regularized Kernel Kullback-Leibler divergence [7.273481485032721]
We study the statistical and geometrical properties of the Kullback-Leibler divergence with kernel covariance operators (KKL) introduced by Bach [2022].
Unlike the classical Kullback-Leibler (KL) divergence, which involves density ratios, the KKL compares probability distributions through covariance operators (embeddings) in a reproducing kernel Hilbert space (RKHS).
This novel divergence hence shares parallel but distinct aspects with both the standard KL divergence between probability distributions and kernel embedding metrics such as the maximum mean discrepancy.
arXiv Detail & Related papers (2024-08-29T14:01:30Z)
- Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective [18.331374727331077]
We provide a theoretical understanding of the Lipschitz continuity and second moment properties of the diffusion process.
Our results provide deeper theoretical insights into the dynamics of the diffusion process under common data distributions.
arXiv Detail & Related papers (2024-05-26T03:32:27Z)
- Differentiable Annealed Importance Sampling Minimizes The Symmetrized Kullback-Leibler Divergence Between Initial and Target Distribution [10.067421338825545]
We show that DAIS minimizes the symmetrized Kullback-Leibler divergence between the initial and target distribution.
DAIS can be seen as a form of variational inference (VI) as its initial distribution is a parametric fit to an intractable target distribution.
arXiv Detail & Related papers (2024-05-23T17:55:09Z)
- Soft-constrained Schrödinger Bridge: a Stochastic Control Approach [4.922305511803267]
Schr"odinger bridge can be viewed as a continuous-time control problem where the goal is to find an optimally controlled diffusion process.
We propose to generalize this problem by allowing the terminal distribution to differ from the target but penalizing the Kullback-Leibler divergence between the two distributions.
One application is the development of robust generative diffusion models.
arXiv Detail & Related papers (2024-03-04T04:10:24Z)
- Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach [49.97755400231656]
We show that a novel accelerated DDPM sampler achieves accelerated performance for three broad distribution classes not considered before.
Our results show an improved dependency on the data dimension $d$ among accelerated DDPM type samplers.
arXiv Detail & Related papers (2024-02-21T16:11:47Z)
- Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
- Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
- Federated Functional Gradient Boosting [75.06942944563572]
We study functional minimization in Federated Learning.
For both FFGB.C and FFGB.L, the radii of convergence shrink to zero as the feature distributions become more homogeneous.
arXiv Detail & Related papers (2021-03-11T21:49:19Z)
- Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)