Stein Variational Gradient Descent: many-particle and long-time
asymptotics
- URL: http://arxiv.org/abs/2102.12956v1
- Date: Thu, 25 Feb 2021 16:03:04 GMT
- Title: Stein Variational Gradient Descent: many-particle and long-time
asymptotics
- Authors: Nikolas Nüsken, D.R. Michiel Renger
- Abstract summary: Stein variational gradient descent (SVGD) refers to a class of methods for Bayesian inference based on interacting particle systems.
We develop the cotangent space construction for the Stein geometry, prove its basic properties, and determine the large-deviation functional governing the many-particle limit.
We identify the Stein-Fisher information as its leading order contribution in the long-time and many-particle regime.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stein variational gradient descent (SVGD) refers to a class of methods for
Bayesian inference based on interacting particle systems. In this paper, we
consider the originally proposed deterministic dynamics as well as a stochastic
variant, each of which represents one of the two main paradigms in Bayesian
computational statistics: variational inference and Markov chain Monte Carlo.
As it turns out, these are tightly linked through a correspondence between
gradient flow structures and large-deviation principles rooted in statistical
physics. To expose this relationship, we develop the cotangent space
construction for the Stein geometry, prove its basic properties, and determine
the large-deviation functional governing the many-particle limit for the
empirical measure. Moreover, we identify the Stein-Fisher information (or
kernelised Stein discrepancy) as its leading order contribution in the
long-time and many-particle regime in the sense of $\Gamma$-convergence,
shedding some light on the finite-particle properties of SVGD. Finally, we
establish a comparison principle between the Stein-Fisher information and
RKHS-norms that might be of independent interest.
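To make the two central objects concrete, here is a minimal NumPy sketch (not taken from the paper) of the deterministic SVGD particle update and of a plug-in estimate of the kernelised Stein discrepancy, i.e. the Stein-Fisher information of the empirical measure. It assumes a Gaussian (RBF) kernel with fixed bandwidth $h$ and a user-supplied score function $\nabla \log p$; the names rbf_kernel, svgd_step and ksd_squared are illustrative, not part of any library.

```python
import numpy as np


def rbf_kernel(X, h=1.0):
    """Pairwise RBF kernel k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 h^2)),
    its gradient in the first argument, and pairwise differences/distances."""
    diffs = X[:, None, :] - X[None, :, :]          # (n, n, d): x_i - x_j
    sq_dists = np.sum(diffs ** 2, axis=-1)         # (n, n)
    K = np.exp(-sq_dists / (2.0 * h ** 2))
    grad_K = -diffs * K[:, :, None] / h ** 2       # grad_{x_i} k(x_i, x_j)
    return K, grad_K, diffs, sq_dists


def svgd_step(X, score, h=1.0, step_size=1e-2):
    """One deterministic SVGD update for particles X of shape (n, d);
    score(X) returns grad log p at each particle, also of shape (n, d)."""
    n = X.shape[0]
    K, grad_K, _, _ = rbf_kernel(X, h)
    S = score(X)
    # phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (K @ S + grad_K.sum(axis=0)) / n
    return X + step_size * phi


def ksd_squared(X, score, h=1.0):
    """V-statistic estimate of the squared kernelised Stein discrepancy between
    the empirical measure of X and the target p (its Stein-Fisher information)."""
    n, d = X.shape
    S = score(X)
    K, _, diffs, sq_dists = rbf_kernel(X, h)
    # Stein kernel u_p(x_i, x_j) for the RBF kernel, evaluated on all pairs:
    term1 = (S @ S.T) * K                                     # s(x_i)^T s(x_j) k
    term2 = np.einsum('id,ijd->ij', S, diffs) / h ** 2 * K    # s(x_i)^T grad_{x_j} k
    term3 = -np.einsum('jd,ijd->ij', S, diffs) / h ** 2 * K   # s(x_j)^T grad_{x_i} k
    term4 = (d / h ** 2 - sq_dists / h ** 4) * K              # tr grad_{x_i} grad_{x_j} k
    return float(np.mean(term1 + term2 + term3 + term4))
```

As a sanity check (again an illustration, not a result from the paper), taking score = lambda X: -X for a standard Gaussian target and iterating svgd_step should drive ksd_squared towards zero as the number of particles and iterations grows, which is the long-time, many-particle regime analysed above.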
Related papers
- Stein transport for Bayesian inference [3.009591302286514]
We introduce $\textit{Stein transport}$, a novel methodology for Bayesian inference designed to efficiently push an ensemble of particles along a curve of tempered probability distributions.
The driving vector field is chosen from a reproducing kernel Hilbert space and can be derived either through a suitable kernel ridge regression formulation or as an infinitesimal optimal transport map in the Stein geometry.
We show that in comparison to SVGD, Stein transport not only often reaches more accurate posterior approximations with a significantly reduced computational budget, but that it also effectively mitigates the variance collapse phenomenon commonly observed in SVGD.
arXiv Detail & Related papers (2024-09-02T21:03:38Z) - Particle-based Variational Inference with Generalized Wasserstein
Gradient Flow [32.37056212527921]
We propose a particle-based variational inference (ParVI) framework, called generalized Wasserstein gradient descent (GWG).
We show that GWG exhibits strong convergence guarantees.
We also provide an adaptive version that automatically chooses the Wasserstein metric to accelerate convergence.
arXiv Detail & Related papers (2023-10-25T10:05:42Z) - A Finite-Particle Convergence Rate for Stein Variational Gradient
Descent [47.6818454221125]
We provide the first finite-particle convergence rate for Stein variational gradient descent (SVGD).
Our explicit, non-asymptotic proof strategy will serve as a template for future refinements.
arXiv Detail & Related papers (2022-11-17T17:50:39Z) - De-randomizing MCMC dynamics with the diffusion Stein operator [21.815713258703575]
Approximate Bayesian inference estimates descriptors of an intractable target distribution.
We propose de-randomized, kernel-based particle samplers corresponding to all diffusion-based samplers known as MCMC dynamics.
arXiv Detail & Related papers (2021-10-07T19:59:46Z) - Entropy Production and the Role of Correlations in Quantum Brownian
Motion [77.34726150561087]
We perform a study on quantum entropy production, different kinds of correlations, and their interplay in the driven Caldeira-Leggett model of quantum Brownian motion.
arXiv Detail & Related papers (2021-08-05T13:11:05Z) - Learning Equivariant Energy Based Models with Equivariant Stein
Variational Gradient Descent [80.73580820014242]
We focus on the problem of efficient sampling and learning of probability densities by incorporating symmetries in probabilistic models.
We first introduce the Equivariant Stein Variational Gradient Descent algorithm -- an equivariant sampling method based on Stein's identity for sampling from densities with symmetries.
We propose new ways of improving and scaling up training of energy based models.
arXiv Detail & Related papers (2021-06-15T01:35:17Z) - Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization [106.70006655990176]
A distributional optimization problem arises widely in machine learning and statistics.
We propose a novel particle-based algorithm, dubbed variational transport, which approximately performs Wasserstein gradient descent.
We prove that when the objective function satisfies a functional version of the Polyak-Lojasiewicz (PL) condition (Polyak, 1963) and smoothness conditions, variational transport converges linearly.
arXiv Detail & Related papers (2020-12-21T18:33:13Z) - The role of feature space in atomistic learning [62.997667081978825]
Physically-inspired descriptors play a key role in the application of machine-learning techniques to atomistic simulations.
We introduce a framework to compare different sets of descriptors, and different ways of transforming them by means of metrics and kernels.
We compare representations built in terms of n-body correlations of the atom density, quantitatively assessing the information loss associated with the use of low-order features.
arXiv Detail & Related papers (2020-09-06T14:12:09Z) - Sliced Kernelized Stein Discrepancy [17.159499204595527]
Kernelized Stein discrepancy (KSD) is extensively used in goodness-of-fit tests and model learning.
We propose the sliced Stein discrepancy and its scalable and kernelized variants, which employ kernel-based test functions defined on the optimal one-dimensional projections.
For model learning, we show its advantages over existing Stein discrepancy baselines by training independent component analysis models with different discrepancies.
arXiv Detail & Related papers (2020-06-30T04:58:55Z) - A diffusion approach to Stein's method on Riemannian manifolds [65.36007959755302]
We exploit the relationship between the generator of a diffusion on $\mathbf{M}$ with target invariant measure and its characterising Stein operator.
We derive Stein factors, which bound the solution to the Stein equation and its derivatives.
These imply that the bounds for $\mathbb{R}^m$ remain valid when $\mathbf{M}$ is a flat manifold.
arXiv Detail & Related papers (2020-03-25T17:03:58Z)