Practical and Matching Gradient Variance Bounds for Black-Box
Variational Bayesian Inference
- URL: http://arxiv.org/abs/2303.10472v4
- Date: Sun, 4 Jun 2023 00:26:01 GMT
- Title: Practical and Matching Gradient Variance Bounds for Black-Box
Variational Bayesian Inference
- Authors: Kyurae Kim, Kaiwen Wu, Jisu Oh, Jacob R. Gardner
- Abstract summary: We show that BBVI satisfies a matching bound corresponding to the $ABC$ condition used in the gradient descent literature.
We also show that the variance of the mean-field parameterization has provably superior dimensional dependence.
- Score: 8.934639058735812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the gradient variance of black-box variational inference (BBVI)
is a crucial step for establishing its convergence and developing algorithmic
improvements. However, existing studies have yet to show that the gradient
variance of BBVI satisfies the conditions used to study the convergence of
stochastic gradient descent (SGD), the workhorse of BBVI. In this work, we show
that BBVI satisfies a matching bound corresponding to the $ABC$ condition used
in the SGD literature when applied to smooth and quadratically-growing
log-likelihoods. Our results generalize to nonlinear covariance
parameterizations widely used in the practice of BBVI. Furthermore, we show
that the variance of the mean-field parameterization has provably superior
dimensional dependence.
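For context, the $ABC$ condition referenced above can be stated as follows. In the form standard in the SGD literature (the symbols $A$, $B$, $C$, $F$, and $\hat g$ below follow that literature and are not taken from this abstract), it bounds the second moment of the stochastic gradient $\hat g(\lambda)$ at a point $\lambda$ by

```latex
\mathbb{E}\,\bigl\|\hat g(\lambda)\bigr\|^2
  \;\le\;
  2A\,\bigl(F(\lambda) - F^{\star}\bigr)
  \;+\;
  B\,\bigl\|\nabla F(\lambda)\bigr\|^2
  \;+\;
  C,
```

where $F$ is the objective, $F^{\star}$ its infimum, and $A, B, C \ge 0$ are constants. A "matching" bound means the gradient variance of BBVI is controlled by an inequality of exactly this form, so SGD convergence results that assume the $ABC$ condition apply.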
Related papers
- Batch and match: black-box variational inference with a score-based divergence [26.873037094654826]
We propose batch and match (BaM) as an alternative approach to black-box variational inference (BBVI) based on a score-based divergence.
We show that BaM converges in fewer evaluations than leading ELBO-based implementations of BBVI.
arXiv Detail & Related papers (2024-02-22T18:20:22Z)
- Linear Convergence of Black-Box Variational Inference: Should We Stick the Landing? [14.2377621491791]
Black-box variational inference converges at a geometric (traditionally called "linear") rate under perfect variational family specification.
We also improve the existing analysis of the regular closed-form entropy gradient estimators.
arXiv Detail & Related papers (2023-07-27T06:32:43Z)
- On the Convergence of Black-Box Variational Inference [16.895490556279647]
We provide the first convergence guarantee for full black-box variational inference (BBVI).
Our results hold for log-smooth posterior densities with and without strong log-concavity and the location-scale variational family.
arXiv Detail & Related papers (2023-05-24T16:59:50Z)
- A Stochastic Variance Reduced Gradient using Barzilai-Borwein Techniques as Second Order Information [0.0]
We improve the stochastic variance reduced gradient (SVRG) method by incorporating curvature information of the objective function.
We reduce the variance of gradients by incorporating the computationally efficient Barzilai-Borwein (BB) method into SVRG.
We prove a linear convergence theorem that applies not only to the proposed method but also to other existing variants of SVRG with second-order information.
arXiv Detail & Related papers (2022-08-23T16:38:40Z)
- Quasi Black-Box Variational Inference with Natural Gradients for Bayesian Learning [84.90242084523565]
We develop an optimization algorithm suitable for Bayesian learning in complex models.
Our approach relies on natural gradient updates within a general black-box framework for efficient training with limited model-specific derivations.
arXiv Detail & Related papers (2022-05-23T18:54:27Z)
- BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
arXiv Detail & Related papers (2021-12-06T03:35:21Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Efficient Semi-Implicit Variational Inference [65.07058307271329]
We propose an efficient and scalable method for semi-implicit variational inference (SIVI).
Our method maps SIVI's evidence lower bound to a rigorous inference objective with lower gradient variance.
arXiv Detail & Related papers (2021-01-15T11:39:09Z)
- Statistical Guarantees for Transformation Based Models with Applications to Implicit Variational Inference [8.333191406788423]
We provide theoretical justification for the use of non-linear latent variable models (NL-LVMs) in non-parametric inference.
We use the NL-LVMs to construct an implicit family of variational distributions, termed GP-IVI.
To the best of our knowledge, this is the first work on providing theoretical guarantees for implicit variational inference.
arXiv Detail & Related papers (2020-10-23T21:06:29Z)
- Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under a sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
- GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values [75.17074235764757]
We present GradientDICE for estimating the density ratio between the state distribution of the target policy and the sampling distribution.
GenDICE is the state-of-the-art for estimating such density ratios.
arXiv Detail & Related papers (2020-01-29T22:10:11Z)
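As a minimal illustration of the quantity the main paper studies, the sketch below empirically measures the variance of the reparameterization gradient estimator for BBVI with a mean-field Gaussian family. The target density, variational family, and all variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative sketch (not the paper's method): empirical gradient variance of
# the reparameterization estimator for BBVI with a mean-field Gaussian family
# q(z) = N(m, diag(exp(2s))) and a standard-normal target log p(z) = -||z||^2/2.

rng = np.random.default_rng(0)
d = 10                            # dimension of the latent variable
m = np.ones(d)                    # variational mean
s = np.zeros(d)                   # variational log-standard-deviations

def grad_estimate(m, s, rng):
    # Reparameterize: z = m + exp(s) * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(d)
    z = m + np.exp(s) * eps
    grad_logp = -z                # gradient of log p(z) for the standard normal
    g_m = grad_logp                              # d/dm of log p(m + exp(s) eps)
    g_s = grad_logp * np.exp(s) * eps + 1.0      # chain rule + entropy gradient
    return np.concatenate([g_m, g_s])

# Total gradient variance, summed over all 2d variational parameters.
grads = np.stack([grad_estimate(m, s, rng) for _ in range(5000)])
var = float(grads.var(axis=0).sum())
print(var)
```

The sum of per-coordinate variances is the scalar that bounds like the $ABC$ condition control; repeating the experiment with larger `d` shows how the variance of the mean-field parameterization scales with dimension.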
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.