Sparse Bayesian Learning via Stepwise Regression
- URL: http://arxiv.org/abs/2106.06095v1
- Date: Fri, 11 Jun 2021 00:20:27 GMT
- Title: Sparse Bayesian Learning via Stepwise Regression
- Authors: Sebastian Ament and Carla Gomes
- Abstract summary: We propose a coordinate ascent algorithm for SBL termed Relevance Matching Pursuit (RMP).
As its noise variance parameter goes to zero, RMP exhibits a surprising connection to Stepwise Regression.
We derive novel guarantees for Stepwise Regression algorithms, which also shed light on RMP.
- Score: 1.2691047660244335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse Bayesian Learning (SBL) is a powerful framework for attaining sparsity
in probabilistic models. Herein, we propose a coordinate ascent algorithm for
SBL termed Relevance Matching Pursuit (RMP) and show that, as its noise
variance parameter goes to zero, RMP exhibits a surprising connection to
Stepwise Regression. Further, we derive novel guarantees for Stepwise
Regression algorithms, which also shed light on RMP. Our guarantees for Forward
Regression improve on deterministic and probabilistic results for Orthogonal
Matching Pursuit with noise. Our analysis of Backward Regression on determined
systems culminates in a bound on the residual of the optimal solution to the
subset selection problem that, if satisfied, guarantees the optimality of the
result. To our knowledge, this bound is the first that can be computed in
polynomial time and depends chiefly on the smallest singular value of the
matrix. We report numerical experiments using a variety of feature selection
algorithms. Notably, RMP and its limiting variant are both efficient and
maintain strong performance with correlated features.
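The abstract refers to Forward Regression without spelling out the procedure. As a rough illustration only, the sketch below implements the textbook greedy variant: at each step it adds the feature that most reduces the squared least-squares residual. It is not the authors' code and not RMP; the function names `dot` and `forward_regression`, the list-of-columns input format, and the tolerance `1e-12` are assumptions made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def forward_regression(columns, y, k):
    """Greedy forward selection (hypothetical helper, for illustration).

    columns: list of feature columns, each a list of floats
    y:       target vector as a list of floats
    k:       maximum number of features to select
    Returns (selected column indices in order of selection, final residual).
    """
    residual = list(y)
    # Working copies, kept orthogonal to the span of the selected columns.
    work = [list(c) for c in columns]
    selected = []
    for _ in range(k):
        best_j, best_gain = None, 0.0
        for j, q in enumerate(work):
            if j in selected:
                continue
            norm_sq = dot(q, q)
            if norm_sq < 1e-12:  # column is numerically inside the current span
                continue
            # Reduction in squared residual norm if column j were added.
            gain = dot(residual, q) ** 2 / norm_sq
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:  # no remaining column reduces the residual
            break
        q = work[best_j]
        norm_sq = dot(q, q)
        # Project the chosen direction out of the residual ...
        coef = dot(residual, q) / norm_sq
        residual = [r - coef * qi for r, qi in zip(residual, q)]
        # ... and out of every column that is still a candidate.
        for j, c in enumerate(work):
            if j != best_j and j not in selected:
                cj = dot(c, q) / norm_sq
                work[j] = [ci - cj * qi for ci, qi in zip(c, q)]
        selected.append(best_j)
    return selected, residual


# Small example with correlated columns: y = column 0 + column 1.
cols = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [2.0, 1.0, 0.0]
chosen, res = forward_regression(cols, target, 2)
print(chosen, res)
```

The selection score here divides by the norm of the candidate column after it has been orthogonalized against the already selected columns. Orthogonal Matching Pursuit, by contrast, scores candidates by their correlation with the residual using the original columns, which is one way to see how the two methods can differ on correlated features.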
Related papers
- Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics [39.07258580928359]
We study computationally and statistically efficient Reinforcement Learning algorithms for the linear Bellman Complete setting.
This setting uses linear function approximation to capture value functions and unifies existing models like linear Markov Decision Processes (MDP) and Linear Quadratic Regulators (LQR).
Our work provides a computationally efficient algorithm for the linear Bellman complete setting that works for MDPs with large action spaces, random initial states, and random rewards but relies on the underlying dynamics to be deterministic.
arXiv Detail & Related papers (2024-06-17T17:52:38Z)
- A randomized algorithm to solve reduced rank operator regression [27.513149895229837]
We present and analyze an algorithm designed for addressing vector-valued regression problems involving possibly infinite-dimensional input and output spaces.
The algorithm is a randomized adaptation of reduced rank regression, a technique to optimally learn a low-rank vector-valued function.
arXiv Detail & Related papers (2023-12-28T20:29:59Z)
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- OKRidge: Scalable Optimal k-Sparse Ridge Regression [21.17964202317435]
We propose a fast algorithm, OKRidge, for sparse ridge regression.
We also propose a method to warm-start our solver, which leverages a beam search.
arXiv Detail & Related papers (2023-04-13T17:34:44Z)
- Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments [48.96971760679639]
We study variance-dependent regret bounds for Markov decision processes (MDPs).
We propose two new environment norms to characterize the fine-grained variance properties of the environment.
For model-based methods, we design a variant of the MVP algorithm.
In particular, this bound is simultaneously minimax optimal for both stochastic and deterministic MDPs.
arXiv Detail & Related papers (2023-01-31T06:54:06Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Only minimal prior assumptions on the parameters are needed, because plug-in empirical Bayes estimates are used.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- Selection of the Most Probable Best [2.1095005405219815]
We consider an expected-value ranking and selection (R&S) problem where all k solutions' simulation outputs depend on a common parameter whose uncertainty can be modeled by a distribution.
We define the most probable best (MPB) to be the solution that has the largest probability of being optimal with respect to the distribution.
We devise a series of algorithms that replace the unknown means in the optimality conditions with their estimates and prove the algorithms' sampling ratios achieve the conditions as the simulation budget increases.
arXiv Detail & Related papers (2022-07-15T15:27:27Z)
- Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation [92.3161051419884]
We study reinforcement learning with linear function approximation.
Existing algorithms only have high-probability regret and/or Probably Approximately Correct (PAC) sample complexity guarantees.
We propose a new algorithm called FLUTE, which enjoys uniform-PAC convergence to the optimal policy with high probability.
arXiv Detail & Related papers (2021-06-22T08:48:56Z)
- High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds that depend on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.