Spike and slab variational Bayes for high dimensional logistic
regression
- URL: http://arxiv.org/abs/2010.11665v2
- Date: Mon, 6 Sep 2021 14:00:47 GMT
- Title: Spike and slab variational Bayes for high dimensional logistic
regression
- Authors: Kolyan Ray, Botond Szabo, Gabriel Clara
- Abstract summary: Variational Bayes (VB) is a popular scalable alternative to Markov chain Monte Carlo for Bayesian inference.
We provide non-asymptotic theoretical guarantees for the VB posterior in both $\ell_2$ and prediction loss for a sparse truth.
We confirm the improved performance of our VB algorithm over common sparse VB approaches in a numerical study.
- Score: 5.371337604556311
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational Bayes (VB) is a popular scalable alternative to Markov chain
Monte Carlo for Bayesian inference. We study a mean-field spike and slab VB
approximation of widely used Bayesian model selection priors in sparse
high-dimensional logistic regression. We provide non-asymptotic theoretical
guarantees for the VB posterior in both $\ell_2$ and prediction loss for a
sparse truth, giving optimal (minimax) convergence rates. Since the VB
algorithm does not depend on the unknown truth to achieve optimality, our
results shed light on effective prior choices. We confirm the improved
performance of our VB algorithm over common sparse VB approaches in a numerical
study.
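To fix ideas, here is the standard mean-field spike and slab variational family for a coefficient vector $\theta \in \mathbb{R}^p$; this is a generic sketch, and the paper's exact parametrization may differ:
```latex
% Mean-field spike and slab variational family (generic form).
\[
  q(\theta) \;=\; \prod_{j=1}^{p}
    \Big[ \gamma_j \, \mathcal{N}\!\big(\theta_j \mid \mu_j, \sigma_j^2\big)
          \;+\; (1 - \gamma_j) \, \delta_0(\theta_j) \Big],
\]
% Here $\gamma_j \in [0,1]$ is the variational inclusion probability of
% coefficient $\theta_j$ and $\delta_0$ is a point mass at zero. The
% parameters $(\mu_j, \sigma_j, \gamma_j)$ are fit by maximizing the
% evidence lower bound (ELBO) under the logistic likelihood.
```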
Related papers
- Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Prior assumptions on the parameters are kept minimal through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- SimLDA: A tool for topic model evaluation [2.6397379133308214]
We present ALBU, a novel variational message passing algorithm, as applied to Latent Dirichlet Allocation (LDA).
We compare it with the gold-standard VB and collapsed Gibbs sampling algorithms.
Using coherence measures, we show that ALBU learns latent distributions more accurately than VB does, especially for smaller data sets.
arXiv Detail & Related papers (2022-08-19T12:25:53Z)
- Variational Inference for Bayesian Bridge Regression [0.0]
We study the implementation of Automatic Differentiation Variational inference (ADVI) for Bayesian inference on regression models with bridge penalization.
The bridge approach uses the $\ell_\alpha$ norm, with $\alpha \in (0, +\infty)$, to define a penalization on large values of the regression coefficients.
We illustrate the approach on non-parametric regression models with B-splines, although the method works seamlessly for other choices of basis functions.
arXiv Detail & Related papers (2022-05-19T12:29:09Z)
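For reference, a sketch of the bridge-penalized objective described in the entry above (generic form; the notation is an assumption, not taken from the paper):
```latex
% Bridge-penalized least squares (generic form).
\[
  \hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^p}
    \sum_{i=1}^{n} \big( y_i - x_i^\top \beta \big)^2
    \;+\; \lambda \sum_{j=1}^{p} |\beta_j|^{\alpha},
  \qquad \alpha \in (0, +\infty),
\]
% which recovers the lasso at $\alpha = 1$ and ridge regression at
% $\alpha = 2$; the Bayesian version places the corresponding
% exponential-power prior on $\beta$.
```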
- Adapting to Misspecification in Contextual Bandits [82.55565343668246]
We introduce a new family of oracle-efficient algorithms for $\varepsilon$-misspecified contextual bandits.
We obtain the first algorithm that achieves the optimal $O(d\sqrt{T} + \varepsilon\sqrt{d}\,T)$ regret bound for unknown misspecification level.
arXiv Detail & Related papers (2021-07-12T21:30:41Z)
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
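As a rough usage illustration for the entry above, a minimal sketch with the open-source `ngboost` Python package; the univariate Normal target is shown because the multivariate interface is not assumed here:
```python
# Minimal NGBoost sketch: gradient boosting the parameters of a
# predictive distribution via natural gradients. Univariate Normal
# target shown; the paper above concerns the multivariate case.
from ngboost import NGBRegressor
from ngboost.distns import Normal
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=5.0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2)

# Each boosting stage fits a base learner to the natural gradient of
# the scoring rule (log score by default) with respect to the
# distribution parameters, here (loc, scale) of a Normal.
ngb = NGBRegressor(Dist=Normal, n_estimators=300).fit(X_tr, y_tr)

dist = ngb.pred_dist(X_te)       # full predictive distribution
print(dist.params["loc"][:5])    # predicted means
print(dist.params["scale"][:5])  # predicted standard deviations
```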
- Nearly Optimal Variational Inference for High Dimensional Regression with Shrinkage Priors [20.294908538266867]
We propose a variational Bayes (VB) procedure for high-dimensional linear model inference with heavy-tailed shrinkage priors.
We prove that under the proper choice of prior specifications, the contraction rate of the VB posterior is nearly optimal.
This justifies the validity of VB inference as an alternative to Markov chain Monte Carlo sampling.
arXiv Detail & Related papers (2020-10-24T12:10:27Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
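Concretely, the local linearization behind the GLM predictive in the entry above can be written as follows (a generic sketch; notation assumed):
```latex
% Linearization of the network f around the MAP estimate (generic form).
\[
  f_{\mathrm{lin}}(x; \theta)
    \;=\; f(x; \theta_{\mathrm{MAP}})
      \;+\; J_{\theta_{\mathrm{MAP}}}(x)\,(\theta - \theta_{\mathrm{MAP}}),
  \qquad
  J_{\theta}(x) \;=\; \nabla_{\theta} f(x; \theta),
\]
% so predictions average $f_{\mathrm{lin}}$, rather than the original
% network $f$, over the Gaussian Laplace approximation to the
% posterior on $\theta$.
```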
- The FMRIB Variational Bayesian Inference Tutorial II: Stochastic Variational Bayes [1.827510863075184]
This tutorial revisits the original FMRIB Variational Bayes tutorial, this time using stochastic variational Bayes.
This new approach bears a lot of similarity to, and has benefited from, computational methods used in machine learning.
arXiv Detail & Related papers (2020-07-03T11:31:52Z)
- Statistical Foundation of Variational Bayes Neural Networks [0.456877715768796]
Variational Bayes (VB) provides a useful alternative to circumvent the computational cost and time complexity associated with the generation of samples from the true posterior.
This paper establishes the fundamental result of posterior consistency for the mean-field variational posterior (VP) for a feed-forward artificial neural network model.
arXiv Detail & Related papers (2020-06-29T03:04:18Z)
- On the Convergence Rate of Projected Gradient Descent for a Back-Projection based Objective [58.33065918353532]
We consider a back-projection (BP) based fidelity term as an alternative to the common least squares (LS) term.
We show that using the BP term, rather than the LS term, requires fewer iterations of optimization algorithms.
arXiv Detail & Related papers (2020-05-03T00:58:23Z)
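For orientation, the two fidelity terms contrasted in the entry above, for a linear observation model $y = Ax + e$ (a generic sketch; the paper's exact scaling may differ):
```latex
% Least-squares vs. back-projection fidelity terms for y = Ax + e,
% where A^{\dagger} is the Moore-Penrose pseudoinverse of A.
\[
  \ell_{\mathrm{LS}}(x) \;=\; \tfrac{1}{2}\,\| A x - y \|_2^2,
  \qquad
  \ell_{\mathrm{BP}}(x) \;=\; \tfrac{1}{2}\,\| A^{\dagger} (A x - y) \|_2^2.
\]
```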
- Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for calibrated uncertainty on a ReLU network is to be "a bit Bayesian".
We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
arXiv Detail & Related papers (2020-02-24T08:52:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.