Extending Mean-Field Variational Inference via Entropic Regularization: Theory and Computation
- URL: http://arxiv.org/abs/2404.09113v2
- Date: Tue, 16 Apr 2024 18:43:22 GMT
- Title: Extending Mean-Field Variational Inference via Entropic Regularization: Theory and Computation
- Authors: Bohan Wu, David Blei,
- Abstract summary: Variational inference (VI) has emerged as a popular method for approximate inference for high-dimensional Bayesian models.
We propose a novel VI method that extends the naive mean field via entropic regularization.
We show that $Xi$-variational posteriors effectively recover the true posterior dependency.
- Score: 2.2656885622116394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Variational inference (VI) has emerged as a popular method for approximate inference for high-dimensional Bayesian models. In this paper, we propose a novel VI method that extends the naive mean field via entropic regularization, referred to as $\Xi$-variational inference ($\Xi$-VI). $\Xi$-VI has a close connection to the entropic optimal transport problem and benefits from the computationally efficient Sinkhorn algorithm. We show that $\Xi$-variational posteriors effectively recover the true posterior dependency, where the dependence is downweighted by the regularization parameter. We analyze the role of dimensionality of the parameter space on the accuracy of $\Xi$-variational approximation and how it affects computational considerations, providing a rough characterization of the statistical-computational trade-off in $\Xi$-VI. We also investigate the frequentist properties of $\Xi$-VI and establish results on consistency, asymptotic normality, high-dimensional asymptotics, and algorithmic stability. We provide sufficient criteria for achieving polynomial-time approximate inference using the method. Finally, we demonstrate the practical advantage of $\Xi$-VI over mean-field variational inference on simulated and real data.
Related papers
- Online non-parametric likelihood-ratio estimation by Pearson-divergence
functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of iid observations $(x_t sim p, x'_t sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z) - On the Convergence of Coordinate Ascent Variational Inference [11.166959724276337]
We consider the common coordinate ascent variational inference (CAVI) algorithm for implementing the mean-field (MF) VI.
We provide general conditions for certifying global or local exponential convergence of CAVI.
New notion of generalized correlation for characterizing the interaction between the constituting blocks in influencing the VI objective functional is introduced.
arXiv Detail & Related papers (2023-06-01T20:19:30Z) - VI-DGP: A variational inference method with deep generative prior for
solving high-dimensional inverse problems [0.7734726150561089]
We propose a novel approximation method for estimating the high-dimensional posterior distribution.
This approach leverages a deep generative model to learn a prior model capable of generating spatially-varying parameters.
The proposed method can be fully implemented in an automatic differentiation manner.
arXiv Detail & Related papers (2023-02-22T06:48:10Z) - Manifold Gaussian Variational Bayes on the Precision Matrix [70.44024861252554]
We propose an optimization algorithm for Variational Inference (VI) in complex models.
We develop an efficient algorithm for Gaussian Variational Inference whose updates satisfy the positive definite constraint on the variational covariance matrix.
Due to its black-box nature, MGVBP stands as a ready-to-use solution for VI in complex models.
arXiv Detail & Related papers (2022-10-26T10:12:31Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Efficient Semi-Implicit Variational Inference [65.07058307271329]
We propose an efficient and scalable semi-implicit extrapolational (SIVI)
Our method maps SIVI's evidence to a rigorous inference of lower gradient values.
arXiv Detail & Related papers (2021-01-15T11:39:09Z) - Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z) - Statistical Guarantees for Transformation Based Models with Applications
to Implicit Variational Inference [8.333191406788423]
We provide theoretical justification for the use of non-linear latent variable models (NL-LVMs) in non-parametric inference.
We use the NL-LVMs to construct an implicit family of variational distributions, deemed GP-IVI.
To the best of our knowledge, this is the first work on providing theoretical guarantees for implicit variational inference.
arXiv Detail & Related papers (2020-10-23T21:06:29Z) - f-Divergence Variational Inference [9.172478956440216]
The $f$-VI framework unifies a number of existing VI methods.
A general $f$-variational bound is derived and provides a sandwich estimate of marginal likelihood (or evidence)
A mean-field approximation scheme that generalizes the well-known coordinate ascent variational inference is also proposed for $f$-VI.
arXiv Detail & Related papers (2020-09-28T06:22:05Z) - Tight Nonparametric Convergence Rates for Stochastic Gradient Descent
under the Noiseless Linear Model [0.0]
We analyze the convergence of single-pass, fixed step-size gradient descent on the least-square risk under this model.
As a special case, we analyze an online algorithm for estimating a real function on the unit interval from the noiseless observation of its value at randomly sampled points.
arXiv Detail & Related papers (2020-06-15T08:25:50Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.