Variational empirical Bayes variable selection in high-dimensional logistic regression
- URL: http://arxiv.org/abs/2502.10532v1
- Date: Fri, 14 Feb 2025 19:57:13 GMT
- Title: Variational empirical Bayes variable selection in high-dimensional logistic regression
- Authors: Yiqi Tang, Ryan Martin
- Abstract summary: Starting from a recently proposed empirical Bayes solution with strong theoretical convergence properties, we develop a novel and computationally efficient variational approximation of it.
One such novelty is that we develop this approximation directly for the marginal distribution on the model space, rather than on the regression coefficients themselves.
We demonstrate the method's strong performance in simulations, and prove that our variational approximation inherits the strong selection consistency property satisfied by the posterior distribution that it is approximating.
- Score: 2.4032899110671955
- License:
- Abstract: Logistic regression involving high-dimensional covariates is a practically important problem. Often the goal is variable selection, i.e., determining which few of the many covariates are associated with the binary response. Unfortunately, the usual Bayesian computations can be quite challenging and expensive. Here we start with a recently proposed empirical Bayes solution, with strong theoretical convergence properties, and develop a novel and computationally efficient variational approximation thereof. One such novelty is that we develop this approximation directly for the marginal distribution on the model space, rather than on the regression coefficients themselves. We demonstrate the method's strong performance in simulations, and prove that our variational approximation inherits the strong selection consistency property satisfied by the posterior distribution that it is approximating.
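To give a concrete feel for this style of approximation, the sketch below implements a generic mean-field spike-and-slab variational scheme for logistic regression, using the Jaakkola-Jordan quadratic bound on the logistic likelihood and coordinate-ascent updates of posterior inclusion probabilities. It is a minimal illustration under assumed prior settings (slab variance, prior inclusion probability) and hypothetical function names; note that the paper's own approximation is placed directly on the marginal distribution over models rather than on the coefficients, so this is the more familiar coefficient-level variant, not the authors' algorithm.

```python
# Minimal sketch: mean-field spike-and-slab variational inference for
# logistic regression variable selection via the Jaakkola-Jordan bound.
# Prior settings and function names are illustrative assumptions, not
# the paper's algorithm (which approximates the model-space posterior).
import numpy as np

def jj_lambda(xi):
    """lambda(xi) = tanh(xi/2) / (4*xi), with the xi -> 0 limit of 1/8."""
    xi = np.asarray(xi, dtype=float)
    out = np.full_like(xi, 0.125)
    nz = np.abs(xi) > 1e-8
    out[nz] = np.tanh(xi[nz] / 2.0) / (4.0 * xi[nz])
    return out

def cavi_logistic_ss(X, y, slab_var=1.0, prior_incl=0.1, n_iter=50):
    """Coordinate-ascent updates for q(gamma_j) = Bern(alpha_j) and
    q(beta_j | gamma_j = 1) = N(mu_j, s2_j)."""
    n, p = X.shape
    alpha = np.full(p, prior_incl)      # posterior inclusion probabilities
    mu = np.zeros(p)                    # slab means
    s2 = np.full(p, slab_var)           # slab variances
    xi = np.ones(n)                     # Jaakkola-Jordan variational parameters
    logit_pi = np.log(prior_incl) - np.log1p(-prior_incl)
    yc = y - 0.5                        # pseudo-response from the quadratic bound
    for _ in range(n_iter):
        A = 2.0 * jj_lambda(xi)         # per-observation working precisions
        eta = X @ (alpha * mu)          # mean linear predictor under q
        for j in range(p):
            r = eta - X[:, j] * (alpha[j] * mu[j])   # predictor excluding j
            s2[j] = 1.0 / (A @ (X[:, j] ** 2) + 1.0 / slab_var)
            mu[j] = s2[j] * (X[:, j] @ (yc - A * r))
            logit_a = (logit_pi + 0.5 * np.log(s2[j] / slab_var)
                       + 0.5 * mu[j] ** 2 / s2[j])
            alpha[j] = 1.0 / (1.0 + np.exp(-logit_a))
            eta = r + X[:, j] * (alpha[j] * mu[j])
        # Update xi_i^2 = E_q[(x_i^T beta)^2] under the mean-field posterior.
        m = alpha * mu
        v = alpha * (s2 + mu ** 2) - m ** 2
        xi = np.sqrt(np.maximum((X @ m) ** 2 + (X ** 2) @ v, 1e-12))
    return alpha, mu

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 500
    X = rng.standard_normal((n, p))
    beta = np.zeros(p); beta[:5] = 2.0
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta))).astype(float)
    alpha, mu = cavi_logistic_ss(X, y)
    print("top variables by inclusion probability:", np.argsort(-alpha)[:10])
```

In this coefficient-level variant, selection amounts to thresholding the fitted inclusion probabilities alpha_j; the attraction of variational schemes of this kind is that each sweep costs only O(np), in contrast to stochastic search over the model space.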
Related papers
- Heteroscedastic Double Bayesian Elastic Net [1.1240642213359266]
We propose the Heteroscedastic Double Bayesian Elastic Net (HDBEN), a novel framework that jointly models the mean and log-variance.
Our approach simultaneously induces sparsity and grouping in the regression coefficients and variance parameters, capturing complex variance structures in the data.
arXiv Detail & Related papers (2025-02-04T05:44:19Z)
- Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression.
We transform the original quantile regression into a least-squares optimization problem.
An efficient algorithm is developed that enjoys high computational and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z)
- Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z)
- Multi-Response Heteroscedastic Gaussian Process Models and Their Inference [1.52292571922932]
We propose a novel framework for the modeling of heteroscedastic covariance functions.
We employ variational inference to approximate the posterior and facilitate posterior predictive modeling.
We show that our proposed framework offers a robust and versatile tool for a wide array of applications.
arXiv Detail & Related papers (2023-08-29T15:06:47Z)
- A flexible empirical Bayes approach to multiple linear regression and connections with penalized regression [8.663322701649454]
We introduce a new empirical Bayes approach for large-scale multiple linear regression.
Our approach combines two key ideas: the use of flexible "adaptive shrinkage" priors and variational approximations.
We show that the posterior mean from our method solves a penalized regression problem.
arXiv Detail & Related papers (2022-08-23T12:42:57Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given a stream of $p$-dimensional samples.
We design a clipped gradient descent method and provide an improved analysis under a more nuanced condition on the noise of the gradients (a minimal sketch of gradient clipping appears after this list).
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
- Mixtures of Gaussian Processes for regression under multiple prior distributions [0.0]
We extend the idea of mixture models for Gaussian Process regression to work with multiple prior beliefs at once.
We consider the usage of our approach to additionally account for the problem of prior misspecification in functional regression problems.
arXiv Detail & Related papers (2021-04-19T10:19:14Z)
- Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z)
- Naive Feature Selection: a Nearly Tight Convex Relaxation for Sparse Naive Bayes [51.55826927508311]
We propose a sparse version of naive Bayes, which can be used for feature selection.
We prove that our convex relaxation bound becomes tight as the marginal contribution of additional features decreases.
Both binary and multinomial sparse models are solvable in time almost linear in problem size.
arXiv Detail & Related papers (2019-05-23T19:30:51Z)
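The clipped-gradient idea in the heavy-tailed streaming estimation entry above can be illustrated with a short sketch: streaming mean estimation by stochastic gradient descent with per-step gradient clipping. The clipping threshold, step-size schedule, and function name below are assumptions for illustration, not the authors' exact procedure or guarantees.

```python
# Minimal sketch (referenced from the heavy-tailed streaming entry above):
# streaming mean estimation with clipped stochastic gradients.  The clipping
# threshold and step-size schedule are illustrative assumptions.
import numpy as np

def clipped_streaming_mean(samples, clip=5.0):
    """Estimate a p-dimensional mean from a stream of heavy-tailed samples
    by SGD on the squared loss with per-step gradient clipping."""
    theta = np.zeros(samples.shape[1])
    for t, x in enumerate(samples, start=1):
        grad = theta - x                  # gradient of 0.5 * ||theta - x||^2
        norm = np.linalg.norm(grad)
        if norm > clip:                   # clip to limit heavy-tailed noise
            grad *= clip / norm
        theta -= grad / t                 # Robbins-Monro step size 1/t
    return theta

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.standard_t(df=2.5, size=(10000, 3)) + np.array([1.0, -2.0, 0.5])
    print(clipped_streaming_mean(data))
```

Clipping caps the influence of any single heavy-tailed sample on the update, which is what allows sharper concentration than plain SGD in this setting.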
This list is automatically generated from the titles and abstracts of the papers on this site.