Variational inference via Wasserstein gradient flows
- URL: http://arxiv.org/abs/2205.15902v3
- Date: Fri, 21 Apr 2023 16:33:14 GMT
- Title: Variational inference via Wasserstein gradient flows
- Authors: Marc Lambert, Sinho Chewi, Francis Bach, Silvère Bonnabel, Philippe Rigollet
- Abstract summary: Variational inference (VI) has emerged as a central computational approach to large-scale Bayesian inference.
We propose principled methods for VI, in which $\hat{\pi}$ is taken to be a Gaussian or a mixture of Gaussians.
- Score: 10.039378223592013
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Along with Markov chain Monte Carlo (MCMC) methods, variational inference
(VI) has emerged as a central computational approach to large-scale Bayesian
inference. Rather than sampling from the true posterior $\pi$, VI aims at
producing a simple but effective approximation $\hat \pi$ to $\pi$ for which
summary statistics are easy to compute. However, unlike the well-studied MCMC
methodology, algorithmic guarantees for VI are still relatively less
well-understood. In this work, we propose principled methods for VI, in which
$\hat \pi$ is taken to be a Gaussian or a mixture of Gaussians, which rest upon
the theory of gradient flows on the Bures--Wasserstein space of Gaussian
measures. Akin to MCMC, it comes with strong theoretical guarantees when $\pi$
is log-concave.
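The Bures-Wasserstein gradient-descent idea behind the abstract can be sketched in the simplest setting, a Gaussian target, where the expectations of $\nabla V$ and $\nabla^2 V$ under the current iterate are available in closed form. The target parameters, step size, and iteration count below are illustrative assumptions, and the update rule is a minimal sketch of a BW-type scheme rather than the paper's exact algorithm.

```python
import numpy as np

# Target pi = N(mu, S), i.e. potential V(x) = 0.5 (x - mu)^T S^{-1} (x - mu).
mu = np.array([1.0, -2.0])
S = np.array([[2.0, 0.5], [0.5, 1.0]])
S_inv = np.linalg.inv(S)

# Variational iterate N(m, Sigma), initialised at a standard Gaussian.
m = np.zeros(2)
Sigma = np.eye(2)
h = 0.1  # step size; should not exceed 1 / lambda_max(S^{-1})

for _ in range(500):
    # For a Gaussian target, the two expectations driving the BW gradient of
    # KL(N(m, Sigma) || pi) are closed-form (X ~ N(m, Sigma)):
    #   E[grad V(X)] = S^{-1} (m - mu)
    #   E[hess V(X)] = S^{-1}
    m = m - h * S_inv @ (m - mu)
    M = S_inv - np.linalg.inv(Sigma)   # BW gradient acting on the covariance
    A = np.eye(2) - h * M
    Sigma = A @ Sigma @ A.T            # update staying in the PSD cone

# The iterate converges to the target: m -> mu, Sigma -> S.
```

Because the target here is itself Gaussian, the fixed point of the iteration is exactly $(\mu, S)$; for a general log-concave target the expectations would be estimated, e.g. by quadrature or sampling.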
Related papers
- Revisiting Weighted Strategy for Non-stationary Parametric Bandits and MDPs [56.246783503873225]
This paper revisits the weighted strategy for non-stationary parametric bandits. We propose a simpler weight-based algorithm that is as efficient as window/restart-based algorithms. Our framework can be used to improve regret bounds of other parametric bandits.
arXiv Detail & Related papers (2026-01-03T04:50:21Z) - Complexity of Markov Chain Monte Carlo for Generalized Linear Models [1.4466802614938334]
We show that for $n \gtrsim d$, MCMC attains the same complexity scaling in $n$, $d$ as first-order optimization algorithms, up to sub-polynomial factors. Our complexities apply to appropriately scaled priors that are not necessarily Gaussian-tailed, including Student-$t$ and flat priors, with log-posteriors that are not necessarily globally concave or gradient-Lipschitz.
arXiv Detail & Related papers (2025-12-14T16:04:27Z) - Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling [28.931489333515618]
We establish an oracle complexity of $\widetilde{O}\left(\frac{d\beta^2 \mathcal{A}^2}{\varepsilon^6}\right)$ for the simple annealed Langevin Monte Carlo algorithm.
We show that $\mathcal{A}$ represents the action of a curve of probability measures interpolating the target distribution $\pi$ and a readily sampleable distribution.
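A minimal sketch of what an annealed Langevin scheme looks like: run unadjusted Langevin steps while interpolating from an easily sampled Gaussian to the target along a curve of distributions. The double-well potential, annealing schedule, and step size below are hypothetical stand-ins, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_V(x, beta):
    # Interpolating potential V_beta(x) = beta * (x^2 - 1)^2 + 0.5 * (1 - beta) * x^2:
    # a standard Gaussian at beta = 0, a non-log-concave double well at beta = 1.
    return beta * 4.0 * x * (x ** 2 - 1.0) + (1.0 - beta) * x

h = 0.01          # Langevin step size
n_chains = 2000
x = rng.standard_normal(n_chains)  # exact samples from the beta = 0 distribution

for beta in np.linspace(0.0, 1.0, 50):   # annealing schedule
    for _ in range(100):                  # unadjusted Langevin steps per level
        x = x - h * grad_V(x, beta) + np.sqrt(2.0 * h) * rng.standard_normal(n_chains)

# Chains end up spread over both wells near x = -1 and x = +1, whereas
# plain Langevin started in one well would struggle to cross to the other.
```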
arXiv Detail & Related papers (2024-07-24T02:15:48Z) - Online non-parametric likelihood-ratio estimation by Pearson-divergence
functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of i.i.d. observations $(x_t \sim p, x_t' \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z) - Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy [23.12198546384976]
Posterior sampling provides $\varepsilon$-pure differential privacy guarantees.
It does not suffer from the potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP.
In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo.
arXiv Detail & Related papers (2023-10-23T07:54:39Z) - Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo [104.9535542833054]
We present a scalable and effective exploration strategy based on Thompson sampling for reinforcement learning (RL).
We instead directly sample the Q function from its posterior distribution, by using Langevin Monte Carlo.
Our approach achieves better or similar results compared with state-of-the-art deep RL algorithms on several challenging exploration tasks from the Atari57 suite.
arXiv Detail & Related papers (2023-05-29T17:11:28Z) - Forward-backward Gaussian variational inference via JKO in the
Bures-Wasserstein Space [19.19325201882727]
Variational inference (VI) seeks to approximate a target distribution $\pi$ by an element of a tractable family of distributions.
We develop the Forward-Backward Gaussian Variational Inference (FB-GVI) algorithm to solve Gaussian VI.
For our proposed algorithm, we obtain state-of-the-art convergence guarantees when $\pi$ is log-smooth and log-concave.
arXiv Detail & Related papers (2023-04-10T19:49:50Z) - Convergence Analysis of Stochastic Gradient Descent with MCMC Estimators [8.493584276672971]
Stochastic gradient descent (SGD) and its variants are essential for machine learning.
In this paper, we consider the SGD algorithm that employs a Markov chain Monte Carlo (MCMC) estimator to compute the gradient.
It is shown that MCMC-SGD escapes from saddle points and reaches $(\epsilon, \epsilon^{1/4})$-approximate second-order stationary points.
arXiv Detail & Related papers (2023-03-19T08:29:49Z) - Revisiting Weighted Strategy for Non-stationary Parametric Bandits [82.1942459195896]
This paper revisits the weighted strategy for non-stationary parametric bandits.
We propose a refined analysis framework, which produces a simpler weight-based algorithm.
Our new framework can be used to improve regret bounds of other parametric bandits.
arXiv Detail & Related papers (2023-03-05T15:11:14Z) - Solving Constrained Variational Inequalities via an Interior Point
Method [88.39091990656107]
We develop an interior-point approach to solve constrained variational inequality (cVI) problems.
We provide convergence guarantees for ACVI in two general classes of problems.
Unlike previous work in this setting, ACVI provides a means to solve cVIs when the constraints are nontrivial.
arXiv Detail & Related papers (2022-06-21T17:55:13Z) - Sampling in Combinatorial Spaces with SurVAE Flow Augmented MCMC [83.48593305367523]
Hybrid Monte Carlo is a powerful Markov Chain Monte Carlo method for sampling from complex continuous distributions.
We introduce a new approach based on augmenting Monte Carlo methods with SurVAE Flows to sample from discrete distributions.
We demonstrate the efficacy of our algorithm on a range of examples from statistics, computational physics and machine learning, and observe improvements compared to alternative algorithms.
arXiv Detail & Related papers (2021-02-04T02:21:08Z) - Stein Variational Gaussian Processes [1.6114012813668934]
We show how to use Stein variational gradient descent (SVGD) to carry out inference in Gaussian process (GP) models with non-Gaussian likelihoods and large data volumes.
Our method is demonstrated on benchmark problems in both regression and classification, a multimodal posterior, and an air quality example with 550,134 observations.
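As a concrete illustration of the SVGD update (attraction along the score plus kernel repulsion), here is a toy sketch on a stand-in target; a standard 2-D Gaussian replaces the GP posterior, and the step size and bandwidth heuristic are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in target: standard 2-D Gaussian, so grad log p(x) = -x.
# (A GP posterior with a non-Gaussian likelihood would supply its own score.)
def grad_log_p(x):
    return -x

n, d = 100, 2
x = rng.standard_normal((n, d)) * 3.0 + 5.0   # deliberately poor initialisation
eps = 0.1                                     # step size (an assumption)

for _ in range(500):
    diff = x[:, None, :] - x[None, :, :]      # diff[i, j] = x_i - x_j, shape (n, n, d)
    sq = (diff ** 2).sum(-1)                  # squared pairwise distances
    bw2 = np.median(sq) / 2.0 + 1e-12         # median-distance bandwidth heuristic
    k = np.exp(-sq / (2.0 * bw2))             # RBF kernel matrix
    # SVGD direction: kernel-weighted scores (attraction towards high density)
    # plus kernel gradients (repulsion keeping the particles spread out).
    phi = (k @ grad_log_p(x) + (diff * k[..., None]).sum(axis=1) / bw2) / n
    x = x + eps * phi

# Particles migrate from the bad initialisation to a spread around the origin.
```

The repulsion term is what distinguishes SVGD from running many parallel gradient ascents on $\log p$: without it, all particles would collapse onto the mode.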
arXiv Detail & Related papers (2020-09-25T11:47:44Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP)
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.