The Hessian Screening Rule
- URL: http://arxiv.org/abs/2104.13026v1
- Date: Tue, 27 Apr 2021 07:55:29 GMT
- Title: The Hessian Screening Rule
- Authors: Johan Larsson, Jonas Wallin
- Abstract summary: The Hessian Screening Rule uses second-order information from the model to provide more accurate screening.
We show that the rule outperforms all other alternatives in simulated experiments with high correlation.
- Score: 5.076419064097734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predictor screening rules, which discard predictors from the design matrix
before fitting a model, have had sizable impacts on the speed with which
$\ell_1$-regularized regression problems, such as the lasso, can be solved.
Current state-of-the-art screening rules, however, have difficulties in dealing
with highly correlated predictors, often becoming too conservative. In this
paper, we present a new screening rule to deal with this issue: the Hessian
Screening Rule. The rule uses second-order information from the model in order
to provide more accurate screening as well as higher-quality warm starts. In
our experiments on $\ell_1$-regularized least-squares (the lasso) and logistic
regression, we show that the rule outperforms all other alternatives in
simulated experiments with high correlation, as well as in the majority of real
datasets that we study.
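For intuition, here is a minimal sketch of the classic sequential strong rule for the lasso (Tibshirani et al., 2012), the kind of heuristic rule the Hessian Screening Rule improves on; the Hessian rule's second-order correction and warm starts are not shown, and all names below are illustrative.

```python
import numpy as np

def strong_rule_screen(X, y, beta_prev, lam_prev, lam_next):
    """Sequential strong rule for the lasso: discard predictor j when
    |x_j' r| < 2*lam_next - lam_prev, where r is the residual at the
    previous penalty value. The rule is heuristic, so discarded
    predictors must still be checked against the KKT conditions after
    the fit."""
    r = y - X @ beta_prev                   # residual at the previous solution
    c = np.abs(X.T @ r)                     # inner products with the residual
    return c >= 2.0 * lam_next - lam_prev   # True = keep, False = screen out

# Toy usage: at the start of the path, beta = 0 and lam_prev = lam_max.
rng = np.random.default_rng(0)
X, y = rng.standard_normal((100, 1000)), rng.standard_normal(100)
lam_max = np.max(np.abs(X.T @ y))
kept = strong_rule_screen(X, y, np.zeros(1000), lam_max, 0.9 * lam_max)
print(f"kept {kept.sum()} of {kept.size} predictors")
```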
Related papers
- Differentially Private Iterative Screening Rules for Linear Regression [45.50668718813776]
In this paper, we develop the first private screening rule for linear regression.
We find that this rule is too strong: the private screening step causes it to discard too many coefficients.
However, a weakened implementation of private screening reduces overscreening and improves performance.
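As a rough, hypothetical illustration of why privatizing a screening step over-screens (this is not the paper's construction; `sigma` and the widened threshold are placeholder choices), one can perturb the screening statistic with Gaussian noise and enlarge the threshold to stay conservative:

```python
import numpy as np

def noisy_screen(X, y, lam, sigma):
    """Hypothetical private screening step: perturb the correlation
    statistic with Gaussian noise (as in the Gaussian mechanism) and
    widen the threshold by a noise-scale margin. The margin is what
    drives overscreening: the larger sigma must be for privacy, the
    more coefficients get discarded."""
    c = np.abs(X.T @ y)                            # non-private statistic
    c_noisy = c + np.random.default_rng(0).normal(0.0, sigma, size=c.shape)
    return c_noisy >= lam + 2.0 * sigma            # True = keep
```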
arXiv Detail & Related papers (2025-02-25T19:06:19Z)
- Towards Optimal Statistical Watermarking [95.46650092476372]
We study statistical watermarking by formulating it as a hypothesis testing problem.
Key to our formulation is a coupling of the output tokens and the rejection region.
We characterize the Uniformly Most Powerful (UMP) watermark in the general hypothesis testing setting.
arXiv Detail & Related papers (2023-12-13T06:57:00Z)
- Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures.
We demonstrate the power of our approaches on both simulated and real data.
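For background, here is a minimal batch HSIC statistic with Gaussian kernels, the kind of kernelized dependence measure such tests build on (the sequential, anytime-valid testing machinery from the paper is not shown):

```python
import numpy as np

def hsic(x, y, gamma=1.0):
    """Biased batch HSIC estimate with Gaussian kernels for 1-D samples:
    trace(K H L H) / (n - 1)^2, where H centers the Gram matrices.
    Values near zero suggest independence; large values suggest
    dependence."""
    n = len(x)
    gram = lambda a: np.exp(-gamma * (a[:, None] - a[None, :]) ** 2)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    K, L = gram(x), gram(y)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```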
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Deep Hierarchy in Bandits [51.22833900944146]
Mean rewards of actions are often correlated.
To maximize statistical efficiency, it is important to leverage these correlations when learning.
We formulate a bandit variant of this problem where the correlations of mean action rewards are represented by a hierarchical Bayesian model.
arXiv Detail & Related papers (2022-02-03T08:15:53Z)
- Safe Screening for Logistic Regression with $\ell_0$-$\ell_2$ Regularization [0.360692933501681]
We present screening rules that safely remove features from logistic regression before solving the problem.
A high percentage of the features can be effectively and safely removed a priori, leading to a substantial speed-up in the computations.
arXiv Detail & Related papers (2022-02-01T15:25:54Z)
- Look-Ahead Screening Rules for the Lasso [2.538209532048867]
The lasso is a popular method to induce shrinkage and sparsity in the solution vector (coefficients) of regression problems.
We present a new screening strategy: look-ahead screening.
Our method uses safe screening rules to find a range of penalty values for which a given predictor cannot enter the model, thereby screening predictors along the remainder of the path.
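A minimal sketch of this idea built on the Gap Safe sphere test, a standard safe rule (the paper's precise bound may differ; all names here are ours): the primal/dual pair obtained at the current penalty can be tested at every remaining penalty on the path.

```python
import numpy as np

def lookahead_screen(X, y, beta, lam_path):
    """Gap Safe sphere test applied 'ahead' along the penalty path: given
    a solution `beta` at the current penalty, predictor j is provably
    zero at penalty lam whenever |x_j' theta| + radius * ||x_j|| < 1,
    where theta is a dual-feasible point and radius comes from the
    duality gap. Returns a (len(lam_path), p) mask of safe discards."""
    r = y - X @ beta                          # residual at current solution
    norms = np.linalg.norm(X, axis=0)         # column norms ||x_j||
    corr = np.max(np.abs(X.T @ r))            # ||X' r||_inf
    out = np.zeros((len(lam_path), X.shape[1]), dtype=bool)
    for k, lam in enumerate(lam_path):
        theta = r / max(lam, corr)            # rescale to dual feasibility
        gap = (0.5 * r @ r + lam * np.abs(beta).sum()
               - 0.5 * y @ y + 0.5 * np.sum((y - lam * theta) ** 2))
        radius = np.sqrt(2.0 * max(gap, 0.0)) / lam
        out[k] = np.abs(X.T @ theta) + radius * norms < 1.0
    return out
```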
arXiv Detail & Related papers (2021-05-12T13:27:40Z)
- Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing the distance between embeddings/representations of original samples and their augmented counterparts is a popular technique for improving the robustness of neural networks.
In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings.
We show that the generic approach we identified (squared $\ell_2$ regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
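A minimal PyTorch sketch of such a regularizer, assuming a generic encoder `embed` and a placeholder weight (not the paper's exact training setup):

```python
import torch

def squared_l2_consistency(embed, x, x_aug, weight=1.0):
    """Squared L2 consistency loss: penalize the squared distance between
    embeddings of each sample and its augmented counterpart. Added to
    the usual task loss, it pushes the encoder toward
    augmentation-invariant representations."""
    z, z_aug = embed(x), embed(x_aug)          # (batch, dim) embeddings
    return weight * (z - z_aug).pow(2).sum(dim=1).mean()
```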
arXiv Detail & Related papers (2020-11-25T22:40:09Z)
- Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a regression method for high-dimensional sparse learning.
Proximal gradient methods are the standard approach to solving OWL regression.
We propose the first safe screening rule for OWL regression, obtained by exploring the order of the primal solution even though its order structure is unknown.
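For reference, here is a sketch of the standard OWL/SLOPE proximal operator that such proximal gradient methods evaluate at each iteration (a well-known construction via isotonic regression, not code from the paper):

```python
import numpy as np
from sklearn.isotonic import isotonic_regression

def prox_owl(v, w):
    """Proximal operator of the OWL norm with non-increasing nonnegative
    weights w: sort |v| in decreasing order, subtract the weights,
    project onto the non-increasing nonnegative cone (isotonic
    regression / PAVA), then undo the sort and restore the signs."""
    order = np.argsort(np.abs(v))[::-1]        # indices sorting |v| descending
    z = np.abs(v)[order] - w                   # shift by the ordered weights
    z = np.clip(isotonic_regression(z, increasing=False), 0.0, None)
    out = np.empty_like(v)
    out[order] = z                             # undo the sort
    return np.sign(v) * out
```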
arXiv Detail & Related papers (2020-06-29T23:35:53Z)
- The Strong Screening Rule for SLOPE [5.156484100374058]
We develop a screening rule for SLOPE by examining its subdifferential and show that this rule is a generalization of the strong rule for the lasso.
Our numerical experiments show that the rule performs well in practice, leading to improvements by orders of magnitude for data in the $p \gg n$ domain.
arXiv Detail & Related papers (2020-05-07T20:14:20Z)
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
- A General Theory of the Stochastic Linear Bandit and Its Applications [8.071506311915398]
We introduce a general analysis framework and a family of algorithms for the linear bandit problem.
Our new notion of optimism in expectation gives rise to a new algorithm, called sieved greedy (SG), that reduces the over-exploration problem in OFUL.
We prove that SG is theoretically rate optimal, and our empirical simulations show that SG outperforms existing benchmarks such as greedy, OFUL, and TS.
arXiv Detail & Related papers (2020-02-12T18:54:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.