Nearly Optimal Variational Inference for High Dimensional Regression with Shrinkage Priors
- URL: http://arxiv.org/abs/2010.12887v1
- Date: Sat, 24 Oct 2020 12:10:27 GMT
- Title: Nearly Optimal Variational Inference for High Dimensional Regression with Shrinkage Priors
- Authors: Jincheng Bai, Qifan Song, Guang Cheng
- Abstract summary: We propose a variational Bayesian (VB) procedure for high-dimensional linear model inference with heavy-tailed shrinkage priors.
We prove that, under a proper choice of prior specifications, the contraction rate of the VB posterior is nearly optimal.
This justifies the validity of VB inference as an alternative to Markov chain Monte Carlo sampling.
- Score: 20.294908538266867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a variational Bayesian (VB) procedure for high-dimensional linear
model inference with heavy-tailed shrinkage priors, such as the Student-t prior.
Theoretically, we establish the consistency of the proposed VB method and prove
that, under a proper choice of prior specifications, the contraction rate of
the VB posterior is nearly optimal. This justifies the validity of VB inference
as an alternative to Markov chain Monte Carlo (MCMC) sampling. Meanwhile,
compared to conventional MCMC methods, the VB procedure achieves much higher
computational efficiency, which greatly alleviates the computing burden in
modern machine learning applications such as massive data analysis. Through
numerical studies, we demonstrate that the proposed VB method leads to shorter
computing time, higher estimation accuracy, and lower variable selection error
than competing sparse Bayesian methods.
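To make the flavor of the procedure concrete, here is a minimal sketch, not the authors' exact algorithm, of mean-field coordinate-ascent variational inference for linear regression with a Student-t shrinkage prior, written via its normal/inverse-gamma scale-mixture representation. The known noise variance, the hyperparameters a0 and b0, and the function name cavi_student_t are all illustrative assumptions; the paper's actual prior specification and theoretical conditions are more delicate.

    # Hedged sketch: CAVI for y = X beta + eps with the scale mixture
    #   beta_j | lam_j ~ N(0, lam_j),  lam_j ~ InvGamma(a0, b0),
    # whose marginal on beta_j is a (scaled) Student-t prior.
    # Assumes a known noise variance sigma2; not the paper's exact method.
    import numpy as np

    def cavi_student_t(X, y, sigma2=1.0, a0=0.5, b0=0.5, n_iter=100):
        n, p = X.shape
        XtX, Xty = X.T @ X, X.T @ y
        e_inv_lam = np.ones(p)                # E_q[1/lam_j], initialized at 1
        for _ in range(n_iter):
            # q(beta) = N(mu, Sigma): a ridge-like update given E_q[1/lam_j]
            Sigma = np.linalg.inv(XtX / sigma2 + np.diag(e_inv_lam))
            mu = Sigma @ Xty / sigma2
            # q(lam_j) = InvGamma(a0 + 1/2, b0 + E_q[beta_j^2]/2), and
            # E_q[1/lam_j] is the ratio of its shape to its scale
            e_beta_sq = mu**2 + np.diag(Sigma)
            e_inv_lam = (a0 + 0.5) / (b0 + 0.5 * e_beta_sq)
        return mu, Sigma

    # Toy p > n example with a sparse truth
    rng = np.random.default_rng(0)
    n, p, s = 100, 200, 5
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:s] = 3.0
    y = X @ beta + rng.standard_normal(n)
    mu, _ = cavi_student_t(X, y)
    print("top-|mu| indices:", np.argsort(-np.abs(mu))[:s])  # should recover 0..4

Each iteration is a deterministic closed-form update, which is where the speedup over MCMC sampling comes from; the p-by-p inverse is the per-iteration bottleneck and would be restructured for very large p.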
Related papers
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Low-rank extended Kalman filtering for online learning of neural networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
arXiv Detail & Related papers (2023-05-31T03:48:49Z)
- Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes [8.198195852439946]
We study preferential Bayesian optimization (BO), where reliable feedback is limited to pairwise comparisons called duels.
An important challenge in preferential BO, which uses the preferential Gaussian process (GP) model to represent flexible preference structure, is that the posterior distribution is a computationally intractable skew GP.
We develop a new method that achieves both high computational efficiency and low sample complexity, and then demonstrate its effectiveness through extensive numerical experiments.
arXiv Detail & Related papers (2023-02-03T03:02:38Z)
- Numerical Optimizations for Weighted Low-rank Estimation on Language Model [73.12941276331316]
Singular value decomposition (SVD) is one of the most popular compression methods that approximates a target matrix with smaller matrices.
Standard SVD treats the parameters within the matrix with equal importance, which is a simple but unrealistic assumption.
We show that our method can perform better than current SOTA methods in neural-based language models; a minimal weighted-SVD illustration appears after this list.
arXiv Detail & Related papers (2022-11-02T00:58:02Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are imposed through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- A Framework for Improving the Reliability of Black-box Variational Inference [9.621959865172549]
Black-box variational inference (BBVI) now sees widespread use in machine learning and statistics.
We propose Robust and Automated Black-box VI (RABVI), a framework for improving the reliability of BBVI optimization.
RABVI is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation.
arXiv Detail & Related papers (2022-03-29T23:05:40Z)
- Spike and slab variational Bayes for high dimensional logistic regression [5.371337604556311]
Variational Bayes (VB) is a popular scalable alternative to Markov chain Monte Carlo for Bayesian inference.
We provide non-asymptotic theoretical guarantees for the VB posterior in both $\ell_2$ and prediction loss for a sparse truth.
We confirm the improved performance of our VB algorithm over common sparse VB approaches in a numerical study.
arXiv Detail & Related papers (2020-10-22T12:49:58Z)
- Fast Bayesian Estimation of Spatial Count Data Models [0.0]
We introduce Variational Bayes (VB) as an optimisation problem instead of a simulation problem.
A VB method is derived for posterior inference in negative binomial models with unobserved parameter heterogeneity and spatial dependence.
The VB approach is around 45 to 50 times faster than MCMC on a regular eight-core processor in a simulation and an empirical study.
arXiv Detail & Related papers (2020-07-07T10:24:45Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT); a minimal sketch-and-solve illustration appears after this list.
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- Adaptive Learning of the Optimal Batch Size of SGD [52.50880550357175]
We propose a method capable of learning the optimal batch size adaptively throughout its iterations for strongly convex and smooth functions.
Our method does this provably, and in our experiments with synthetic and real data robustly exhibits nearly optimal behaviour.
We generalize our method to several new batch strategies not considered in the literature before, including a sampling suitable for distributed implementations.
arXiv Detail & Related papers (2020-05-03T14:28:32Z)
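As promised above for the weighted low-rank estimation entry, here is a minimal sketch of the contrast it draws: standard truncated SVD treats all parameters with equal importance, whereas a row-weighted variant (scale, factor, unscale) reconstructs important rows more faithfully. The row-importance weights and the function names are hypothetical; the paper's own weighting scheme and numerical optimizations are more involved.

    # Hedged sketch: truncated SVD vs. a row-weighted variant. Minimizing
    # ||D (W - A @ B)||_F for a diagonal weight matrix D reduces to
    # factoring D @ W and then unscaling the left factor.
    import numpy as np

    def truncated_svd(W, k):
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        return U[:, :k] * s[:k], Vt[:k]       # W ~ A @ B with rank k

    def weighted_truncated_svd(W, row_weights, k):
        d = np.sqrt(row_weights)[:, None]     # hypothetical importance scores
        A, B = truncated_svd(d * W, k)
        return A / d, B                       # undo the row scaling

    rng = np.random.default_rng(1)
    W = rng.standard_normal((64, 32))
    w = np.linspace(0.1, 10.0, 64)            # later rows matter more here
    A, B = weighted_truncated_svd(W, w, k=8)
    print("weighted error:", np.linalg.norm(np.sqrt(w)[:, None] * (W - A @ B)))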
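Likewise, for the sketching entry above, a minimal sketch-and-solve baseline for L2-regularized least squares using a Gaussian embedding. This is the simplest variant, not the paper's effective-dimension-adaptive algorithm; the sketch size m is an illustrative assumption, and an SRHT would replace the dense Gaussian matrix for speed.

    # Hedged sketch: solve min_x ||S (A x - b)||^2 + lam ||x||^2 with a
    # Gaussian random embedding S of m rows (m << n). Plain sketch-and-solve,
    # not the paper's adaptive method.
    import numpy as np

    def sketched_ridge(A, b, lam, m, seed=0):
        n, d = A.shape
        S = np.random.default_rng(seed).standard_normal((m, n)) / np.sqrt(m)
        SA, Sb = S @ A, S @ b
        return np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)

    rng = np.random.default_rng(2)
    n, d = 5000, 50
    A = rng.standard_normal((n, d))
    x_true = rng.standard_normal(d)
    b = A @ x_true + 0.1 * rng.standard_normal(n)
    x_hat = sketched_ridge(A, b, lam=1.0, m=500)
    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))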