The Strong Screening Rule for SLOPE
- URL: http://arxiv.org/abs/2005.03730v3
- Date: Fri, 22 Apr 2022 09:38:58 GMT
- Title: The Strong Screening Rule for SLOPE
- Authors: Johan Larsson, Małgorzata Bogdan, Jonas Wallin
- Abstract summary: We develop a screening rule for SLOPE by examining its subdifferential and show that this rule is a generalization of the strong rule for the lasso.
Our numerical experiments show that the rule performs well in practice, leading to improvements by orders of magnitude for data in the $p \gg n$ domain.
- Score: 5.156484100374058
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extracting relevant features from data sets where the number of observations
($n$) is much smaller than the number of predictors ($p$) is a major challenge
in modern statistics. Sorted L-One Penalized Estimation (SLOPE), a
generalization of the lasso, is a promising method within this setting. Current
numerical procedures for SLOPE, however, lack the efficiency that respective
tools for the lasso enjoy, particularly in the context of estimating a complete
regularization path. A key component in the efficiency of the lasso is
predictor screening rules: rules that allow predictors to be discarded before
estimating the model. This is the first paper to establish such a rule for
SLOPE. We develop a screening rule for SLOPE by examining its subdifferential
and show that this rule is a generalization of the strong rule for the lasso.
Our rule is heuristic, which means that it may discard predictors erroneously.
We present conditions under which this may happen and show that such situations
are rare and easily safeguarded against by a simple check of the optimality
conditions. Our numerical experiments show that the rule performs well in
practice, leading to improvements by orders of magnitude for data in the $p \gg
n$ domain, as well as incurring no additional computational overhead when $n
\gg p$. We also examine the effect of correlation structures in the design
matrix on the rule and discuss algorithmic strategies for employing the rule.
Finally, we provide an efficient implementation of the rule in our R package
SLOPE.
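To make the screening idea concrete: the SLOPE penalty $\sum_j \lambda_j |\beta|_{(j)}$ (with $\lambda$ non-increasing and $|\beta|_{(j)}$ the $j$-th largest absolute coefficient) couples predictors through sorting, so screening cannot be done coordinate-wise as in the lasso. The sketch below is a minimal NumPy illustration, not the authors' implementation: it contrasts the classical strong rule for the lasso with the kind of cumulative, sorted comparison a SLOPE-type rule requires. The function names are hypothetical, the gradient-magnitude estimates `c` are left as an input rather than reproducing the paper's exact bound, and the reference implementation lives in the authors' R package SLOPE.
```python
import numpy as np


def strong_rule_lasso(X, residual, lam_prev, lam_new):
    """Strong rule for the lasso: keep predictor j at lam_new only if
    |x_j' r(lam_prev)| >= 2 * lam_new - lam_prev (Tibshirani et al., 2012)."""
    c = np.abs(X.T @ residual)          # gradient magnitudes at the previous solution
    return c >= 2 * lam_new - lam_prev  # boolean mask of predictors to keep


def slope_strong_set_sketch(c, lam):
    """Hedged sketch of the cumulative screening logic for SLOPE.

    c   : estimates of the gradient magnitudes at the new path point
    lam : non-increasing SLOPE sequence, lam[0] >= ... >= lam[p-1]

    The sorted estimates are compared to lam through running sums, and the
    kept set is a prefix of the predictors ordered by decreasing c; the exact
    rule in the paper is derived from the SLOPE subdifferential.
    """
    order = np.argsort(-c)   # predictors sorted by decreasing estimate
    kept = 0                 # size of the kept prefix
    partial = 0.0            # running sum of (c - lam) since the last kept position
    for i, j in enumerate(order):
        partial += c[j] - lam[i]
        if partial >= 0:     # enough accumulated "signal" to keep everything so far
            kept = i + 1
            partial = 0.0
    keep = np.zeros(len(c), dtype=bool)
    keep[order[:kept]] = True
    return keep
```
Predictors with `keep == False` would be excluded from the fit at the new path point and, because the rule is heuristic, re-admitted whenever the subsequent check of the optimality conditions flags a violation, as described in the abstract.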
Related papers
- Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering [55.15192437680943]
Generative models lack rigorous statistical guarantees for their outputs.
We propose a sequential conformal prediction method producing prediction sets that satisfy a rigorous statistical guarantee.
This guarantee states that with high probability, the prediction sets contain at least one admissible (or valid) example.
arXiv Detail & Related papers (2024-10-02T15:26:52Z) - Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making [58.06306331390586]
We introduce the notion of a margin complement, which measures how much a prediction score $S$ changes due to a thresholding operation.
We show that under suitable causal assumptions, the influences of $X$ on the prediction score $S$ are equal to the influences of $X$ on the true outcome $Y$.
arXiv Detail & Related papers (2024-05-24T11:22:19Z) - Orthogonal Gradient Boosting for Simpler Additive Rule Ensembles [10.40809014729148]
Gradient boosting of prediction rules is an efficient approach to learn potentially interpretable yet accurate probabilistic models.
We show how a new objective function measures the angle between the risk gradient vector and the projection of the condition output vector onto the complement of the already selected conditions.
This approach correctly approximates the ideal update of adding the risk gradient itself to the model and favours the inclusion of more general and thus shorter rules.
arXiv Detail & Related papers (2024-02-24T02:29:10Z) - A Screening Strategy for Structured Optimization Involving Nonconvex
$\ell_{q,p}$ Regularization [5.522202658637732]
We develop a screening rule strategy to improve the computational efficiency in solving structured optimization involving nonconvex $\ell_{q,p}$ regularization.
We prove that our rule can remove all inactive variables in a finite number of iterations of the IRL1 method.
Numerical experiments illustrate the efficiency of our screening rule strategy compared with several state-of-the-art algorithms.
arXiv Detail & Related papers (2022-08-02T10:01:49Z) - Efficient and Differentiable Conformal Prediction with General Function
Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - The Hessian Screening Rule [5.076419064097734]
The Hessian Screening Rule uses second-order information from the model to provide more accurate screening.
We show that the rule outperforms all other alternatives in simulated experiments with high correlation.
arXiv Detail & Related papers (2021-04-27T07:55:29Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Efficient Conformal Prediction via Cascaded Inference with Expanded
Admission [43.596058175459746]
We present a novel approach for conformal prediction (CP).
We aim to identify a set of promising prediction candidates -- in place of a single prediction.
This set is guaranteed to contain a correct answer with high probability.
arXiv Detail & Related papers (2020-07-06T23:13:07Z) - Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a regression method for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
arXiv Detail & Related papers (2020-06-29T23:35:53Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first optimal minimax guarantees on the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - Interpretable Random Forests via Rule Extraction [0.0]
We introduce SIRUS, a stable rule learning algorithm which takes the form of a short and simple list of rules.
Our R/C++ software implementation sirus is available from CRAN.
arXiv Detail & Related papers (2020-04-29T08:13:35Z) - Safe RuleFit: Learning Optimal Sparse Rule Model by Meta Safe Screening [19.524479577246336]
We consider the problem of learning a sparse rule model, a prediction model in the form of a sparse linear combination of rules.
Since the number of all possible such rules is extremely large, it has been computationally intractable to select the optimal set of active rules.
We propose Safe RuleFit (SRF), which is a non-trivial extension of well-known safe screening techniques.
arXiv Detail & Related papers (2018-10-03T10:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.