Sparse Bayesian Lasso via a Variable-Coefficient $\ell_1$ Penalty
- URL: http://arxiv.org/abs/2211.05089v3
- Date: Fri, 12 May 2023 17:04:05 GMT
- Title: Sparse Bayesian Lasso via a Variable-Coefficient $\ell_1$ Penalty
- Authors: Nathan Wycoff, Ali Arab, Katharine M. Donato and Lisa O. Singh
- Abstract summary: We define learnable penalty weights $\lambda_p$ endowed with hyperpriors.
We study the theoretical properties of this variable-coefficient $\ell_1$ penalty in the context of penalized likelihood.
We develop a model we call the Sparse Bayesian Lasso, which allows behavior qualitatively like Lasso regression to be applied to arbitrary variational models.
- Score: 0.9176056742068814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern statistical learning algorithms are capable of amazing flexibility,
but struggle with interpretability. One possible solution is sparsity: making
inference such that many of the parameters are estimated as being identically
0, which may be imposed through the use of nonsmooth penalties such as the
$\ell_1$ penalty. However, the $\ell_1$ penalty introduces significant bias
when high sparsity is desired. In this article, we retain the $\ell_1$ penalty,
but define learnable penalty weights $\lambda_p$ endowed with hyperpriors. We
start the article by investigating the optimization problem this poses,
developing a proximal operator associated with the $\ell_1$ norm. We then study
the theoretical properties of this variable-coefficient $\ell_1$ penalty in the
context of penalized likelihood. Next, we investigate application of this
penalty to Variational Bayes, developing a model we call the Sparse Bayesian
Lasso which allows for behavior qualitatively like Lasso regression to be
applied to arbitrary variational models. In simulation studies, this gives us
the uncertainty quantification and low-bias properties of simulation-based
approaches with an order of magnitude less computation. Finally, we apply our
methodology to a Bayesian lagged spatiotemporal regression model of internal
displacement that occurred during the Iraqi Civil War of 2013-2017.
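
For fixed per-coefficient weights $\lambda_p$, the $\ell_1$ proximal operator the abstract refers to reduces to coordinate-wise soft-thresholding. Below is a minimal NumPy sketch of that fixed-weight case; the paper's variable-coefficient prox, in which the $\lambda_p$ are themselves optimized under a hyperprior, generalizes it, and the numerical values here are illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np

def prox_weighted_l1(beta, lam, step):
    """Prox of the weighted l1 penalty sum_p lam[p] * |beta[p]|.

    Solves argmin_z 0.5 * ||z - beta||^2 / step + sum_p lam[p] * |z[p]|,
    which decouples into coordinate-wise soft-thresholding at step * lam[p].
    """
    return np.sign(beta) * np.maximum(np.abs(beta) - step * lam, 0.0)

# Hypothetical coefficients and per-coordinate penalty weights lambda_p.
beta = np.array([2.0, -0.3, 0.05, -1.5])
lam = np.array([0.1, 0.5, 1.0, 0.2])
# The two small coefficients are zeroed out; the large ones shrink by step * lam.
print(prox_weighted_l1(beta, lam, step=1.0))
```

Coordinates whose magnitude falls below their own threshold $\lambda_p \cdot \text{step}$ are set exactly to zero, which is the mechanism that produces sparsity while letting large coefficients escape heavy shrinkage.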
Related papers
- Computational-Statistical Tradeoffs at the Next-Token Prediction Barrier: Autoregressive and Imitation Learning under Misspecification [50.717692060500696]
Next-token prediction with the logarithmic loss is a cornerstone of autoregressive sequence modeling.
Next-token prediction can be made robust so as to achieve $C=\tilde{O}(H)$, representing moderate error amplification.
No computationally efficient algorithm can achieve a sub-polynomial approximation factor $C=e^{(\log H)^{1-\Omega(1)}}$.
arXiv Detail & Related papers (2025-02-18T02:52:00Z) - Proximal Iteration for Nonlinear Adaptive Lasso [1.866597543169743]
We study the approach of treating the penalty coefficients as additional decision variables to be learned in a Maximum a Posteriori manner.
We develop a proximal gradient approach to joint optimization of these together with the parameters of any differentiable cost function (a fixed-weight variant is sketched after this list).
arXiv Detail & Related papers (2024-12-07T19:19:55Z) - Iterative Reweighted Framework Based Algorithms for Sparse Linear Regression with Generalized Elastic Net Penalty [0.3124884279860061]
The elastic net penalty is frequently employed in high-dimensional statistics for parameter regression and variable selection.
Empirical evidence has shown that the $\ell_q$-norm penalty often provides better regression compared to the $\ell_r$-norm penalty.
We develop two efficient algorithms based on the locally Lipschitz continuous $\epsilon$-approximation to the $\ell_q$-norm.
arXiv Detail & Related papers (2024-11-22T11:55:37Z) - Beyond Closure Models: Learning Chaotic-Systems via Physics-Informed Neural Operators [78.64101336150419]
Predicting the long-term behavior of chaotic systems is crucial for various applications such as climate modeling.
An alternative approach to such a fully resolved simulation is to use a coarse grid and then correct its errors through a temporal model.
We propose an alternative end-to-end learning approach using a physics-informed neural operator (PINO) that overcomes this limitation.
arXiv Detail & Related papers (2024-08-09T17:05:45Z) - Fast Rates for Bandit PAC Multiclass Classification [73.17969992976501]
We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted labels are correct.
Our main contribution is in designing a novel learning algorithm for the agnostic $(\varepsilon,\delta)$-PAC version of the problem.
arXiv Detail & Related papers (2024-06-18T08:54:04Z) - Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making [58.06306331390586]
We introduce the notion of a margin complement, which measures how much a prediction score $S$ changes due to a thresholding operation.
We show that under suitable causal assumptions, the influences of $X$ on the prediction score $S$ are equal to the influences of $X$ on the true outcome $Y$.
arXiv Detail & Related papers (2024-05-24T11:22:19Z) - Untangling Lariats: Subgradient Following of Variationally Penalized Objectives [10.043139484808949]
We derive, as special cases of our apparatus, known algorithms for the fused lasso and isotonic regression.
Last but not least, we derive lattice-based subgradient solvers for variational penalties characterized through the output of arbitrary convolutional filters.
arXiv Detail & Related papers (2024-05-07T23:08:24Z) - Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLMs).
We study a model within this domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary.
We propose an algorithm, robust contextual dueling bandits (RCDB), based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z) - More Optimal Simulation of Universal Quantum Computers [0.0]
Worst-case sampling cost has plateaued at $\le (2+\sqrt{2})\,\xi_t\,\delta^{-1}$ in the limit $t \rightarrow \infty$.
We reduce this prefactor 68-fold by a leading-order reduction in $t$ through correlated sampling.
arXiv Detail & Related papers (2022-02-02T19:00:03Z) - ReLU Regression with Massart Noise [52.10842036932169]
We study the fundamental problem of ReLU regression, where the goal is to fit Rectified Linear Units (ReLUs) to data.
We focus on ReLU regression in the Massart noise model, a natural and well-studied semi-random noise model.
We develop an efficient algorithm that achieves exact parameter recovery in this model.
arXiv Detail & Related papers (2021-09-10T02:13:22Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p\geq 2$, $\beta>\frac{d}{p}$.
The upper bounds are supported by a minimax regret analysis, which reveals that in the cases $\beta>\frac{d}{2}$ or $p=\infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Outlier-robust sparse/low-rank least-squares regression and robust
matrix completion [1.0878040851637998]
We study high-dimensional least-squares regression within a subgaussian statistical learning framework with heterogeneous noise.
We also present a novel theory of trace-regression with matrix decomposition based on a new application of the product process.
arXiv Detail & Related papers (2020-12-12T07:42:47Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - The Trimmed Lasso: Sparse Recovery Guarantees and Practical Optimization by the Generalized Soft-Min Penalty [14.85926834924458]
We present a new approach to the sparse approximation or best subset selection problem, via a penalty that interpolates between the classical lasso and the trimmed lasso.
We derive a polynomial-time algorithm to compute the generalized soft-min penalty.
arXiv Detail & Related papers (2020-05-18T18:43:06Z)
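
As a concrete reference point for the proximal gradient approach described in the Proximal Iteration for Nonlinear Adaptive Lasso entry above, here is a bare-bones ISTA-style loop for a weighted lasso with fixed penalty weights. It is a sketch under stated assumptions (least-squares loss, fixed weights, a standard step size), not the cited paper's joint update over coefficients and penalty weights.

```python
import numpy as np

def soft_threshold(z, thresh):
    """Coordinate-wise soft-thresholding: the prox of a weighted l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

def ista_weighted_lasso(X, y, lam, iters=200):
    """ISTA with fixed penalty weights lam: alternate a gradient step on the
    smooth least-squares loss with the prox of the weighted l1 term."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ beta - y) / n  # gradient of 0.5 * ||X @ beta - y||^2 / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

Learning the weights jointly, as the cited paper does, would add an update for `lam` (driven by its own penalty or prior) inside the loop; this fixed-weight version is only the scaffold.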