Nearly Optimal Variational Inference for High Dimensional Regression with Shrinkage Priors
- URL: http://arxiv.org/abs/2010.12887v1
- Date: Sat, 24 Oct 2020 12:10:27 GMT
- Title: Nearly Optimal Variational Inference for High Dimensional Regression with Shrinkage Priors
- Authors: Jincheng Bai, Qifan Song, Guang Cheng
- Abstract summary: We propose a variational Bayesian (VB) procedure for high-dimensional linear model inference with heavy-tailed shrinkage priors.
We prove that, under a proper choice of prior specifications, the contraction rate of the VB posterior is nearly optimal.
This justifies the validity of VB inference as an alternative to Markov chain Monte Carlo sampling.
- Score: 20.294908538266867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a variational Bayesian (VB) procedure for high-dimensional linear
model inference with heavy-tailed shrinkage priors, such as the Student-t prior.
Theoretically, we establish the consistency of the proposed VB method and prove
that, under a proper choice of prior specifications, the contraction rate of
the VB posterior is nearly optimal. This justifies the validity of VB inference
as an alternative to Markov chain Monte Carlo (MCMC) sampling. Meanwhile,
compared to conventional MCMC methods, the VB procedure achieves much higher
computational efficiency, which greatly alleviates the computing burden in
modern machine learning applications such as massive data analysis. Through
numerical studies, we demonstrate that the proposed VB method leads to shorter
computing time, higher estimation accuracy, and lower variable selection error
than competing sparse Bayesian methods.
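To make the flavor of the procedure concrete, here is a minimal sketch, not the authors' exact algorithm, of mean-field coordinate-ascent variational inference for linear regression with a Student-t shrinkage prior, written via its normal/inverse-gamma scale-mixture representation. The known noise variance, the hyperparameters a0 and b0, and the function name cavi_student_t are all illustrative assumptions; the paper's actual prior specification and theoretical conditions are more delicate.

    # Hedged sketch: CAVI for y = X beta + eps with the scale mixture
    #   beta_j | lam_j ~ N(0, lam_j),  lam_j ~ InvGamma(a0, b0),
    # whose marginal on beta_j is a (scaled) Student-t prior.
    # Assumes a known noise variance sigma2; not the paper's exact method.
    import numpy as np

    def cavi_student_t(X, y, sigma2=1.0, a0=0.5, b0=0.5, n_iter=100):
        n, p = X.shape
        XtX, Xty = X.T @ X, X.T @ y
        e_inv_lam = np.ones(p)                # E_q[1/lam_j], initialized at 1
        for _ in range(n_iter):
            # q(beta) = N(mu, Sigma): a ridge-like update given E_q[1/lam_j]
            Sigma = np.linalg.inv(XtX / sigma2 + np.diag(e_inv_lam))
            mu = Sigma @ Xty / sigma2
            # q(lam_j) = InvGamma(a0 + 1/2, b0 + E_q[beta_j^2]/2), and
            # E_q[1/lam_j] is the ratio of its shape to its scale
            e_beta_sq = mu**2 + np.diag(Sigma)
            e_inv_lam = (a0 + 0.5) / (b0 + 0.5 * e_beta_sq)
        return mu, Sigma

    # Toy p > n example with a sparse truth
    rng = np.random.default_rng(0)
    n, p, s = 100, 200, 5
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:s] = 3.0
    y = X @ beta + rng.standard_normal(n)
    mu, _ = cavi_student_t(X, y)
    print("top-|mu| indices:", np.argsort(-np.abs(mu))[:s])  # should recover 0..4

Each iteration is a deterministic closed-form update, which is where the speedup over MCMC sampling comes from; the p-by-p inverse is the per-iteration bottleneck and would be restructured for very large p.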
Related papers
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Low-rank extended Kalman filtering for online learning of neural networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
arXiv Detail & Related papers (2023-05-31T03:48:49Z)
- Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes [8.198195852439946]
We study preferential Bayesian optimization (BO), where reliable feedback is limited to pairwise comparisons called duels.
An important challenge in preferential BO, which uses the preferential Gaussian process (GP) model to represent flexible preference structure, is that the posterior distribution is a computationally intractable skew GP.
We develop a new method that achieves both high computational efficiency and low sample complexity, and then demonstrate its effectiveness through extensive numerical experiments.
arXiv Detail & Related papers (2023-02-03T03:02:38Z)
- Numerical Optimizations for Weighted Low-rank Estimation on Language Model [73.12941276331316]
Singular value decomposition (SVD) is one of the most popular compression methods that approximates a target matrix with smaller matrices.
Standard SVD treats the parameters within the matrix with equal importance, which is a simple but unrealistic assumption.
We show that our method can perform better than current SOTA methods in neural-based language models; a minimal weighted-SVD illustration appears after this list.
arXiv Detail & Related papers (2022-11-02T00:58:02Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are imposed through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- A Framework for Improving the Reliability of Black-box Variational Inference [9.621959865172549]
Black-box variational inference (BBVI) now sees widespread use in machine learning and statistics.
We propose Robust and Automated Black-box VI (RABVI), a framework for improving the reliability of BBVI optimization.
RABVI is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation.
arXiv Detail & Related papers (2022-03-29T23:05:40Z)
- Spike and slab variational Bayes for high dimensional logistic regression [5.371337604556311]
Variational Bayes (VB) is a popular scalable alternative to Markov chain Monte Carlo for Bayesian inference.
We provide non-asymptotic theoretical guarantees for the VB posterior in both $\ell_2$ and prediction loss for a sparse truth.
We confirm the improved performance of our VB algorithm over common sparse VB approaches in a numerical study.
arXiv Detail & Related papers (2020-10-22T12:49:58Z)
- Fast Bayesian Estimation of Spatial Count Data Models [0.0]
We introduce Variational Bayes (VB) as an optimisation problem instead of a simulation problem.
A VB method is derived for posterior inference in negative binomial models with unobserved parameter heterogeneity and spatial dependence.
The VB approach is around 45 to 50 times faster than MCMC on a regular eight-core processor in a simulation and an empirical study.
arXiv Detail & Related papers (2020-07-07T10:24:45Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT); a minimal sketch-and-solve illustration appears after this list.
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- Adaptive Learning of the Optimal Batch Size of SGD [52.50880550357175]
We propose a method capable of learning the optimal batch size adaptively throughout its iterations for strongly convex and smooth functions.
Our method does this provably, and in our experiments with synthetic and real data robustly exhibits nearly optimal behaviour.
We generalize our method to several new batch strategies not considered in the literature before, including a sampling suitable for distributed implementations.
arXiv Detail & Related papers (2020-05-03T14:28:32Z)
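As promised above for the weighted low-rank estimation entry, here is a minimal sketch of the contrast it draws: standard truncated SVD treats all parameters with equal importance, whereas a row-weighted variant (scale, factor, unscale) reconstructs important rows more faithfully. The row-importance weights and the function names are hypothetical; the paper's own weighting scheme and numerical optimizations are more involved.

    # Hedged sketch: truncated SVD vs. a row-weighted variant. Minimizing
    # ||D (W - A @ B)||_F for a diagonal weight matrix D reduces to
    # factoring D @ W and then unscaling the left factor.
    import numpy as np

    def truncated_svd(W, k):
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        return U[:, :k] * s[:k], Vt[:k]       # W ~ A @ B with rank k

    def weighted_truncated_svd(W, row_weights, k):
        d = np.sqrt(row_weights)[:, None]     # hypothetical importance scores
        A, B = truncated_svd(d * W, k)
        return A / d, B                       # undo the row scaling

    rng = np.random.default_rng(1)
    W = rng.standard_normal((64, 32))
    w = np.linspace(0.1, 10.0, 64)            # later rows matter more here
    A, B = weighted_truncated_svd(W, w, k=8)
    print("weighted error:", np.linalg.norm(np.sqrt(w)[:, None] * (W - A @ B)))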
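Likewise, for the sketching entry above, a minimal sketch-and-solve baseline for L2-regularized least squares using a Gaussian embedding. This is the simplest variant, not the paper's effective-dimension-adaptive algorithm; the sketch size m is an illustrative assumption, and an SRHT would replace the dense Gaussian matrix for speed.

    # Hedged sketch: solve min_x ||S (A x - b)||^2 + lam ||x||^2 with a
    # Gaussian random embedding S of m rows (m << n). Plain sketch-and-solve,
    # not the paper's adaptive method.
    import numpy as np

    def sketched_ridge(A, b, lam, m, seed=0):
        n, d = A.shape
        S = np.random.default_rng(seed).standard_normal((m, n)) / np.sqrt(m)
        SA, Sb = S @ A, S @ b
        return np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)

    rng = np.random.default_rng(2)
    n, d = 5000, 50
    A = rng.standard_normal((n, d))
    x_true = rng.standard_normal(d)
    b = A @ x_true + 0.1 * rng.standard_normal(n)
    x_hat = sketched_ridge(A, b, lam=1.0, m=500)
    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))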