Fast OSCAR and OWL Regression via Safe Screening Rules
- URL: http://arxiv.org/abs/2006.16433v2
- Date: Tue, 19 Oct 2021 07:05:15 GMT
- Title: Fast OSCAR and OWL Regression via Safe Screening Rules
- Authors: Runxue Bao, Bin Gu, Heng Huang
- Abstract summary: Ordered Weighted $L_1$ (OWL) regularized regression is a regression analysis method for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
- Score: 97.28167655721766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ordered Weighted $L_{1}$ (OWL) regularized regression is a new regression
analysis for high-dimensional sparse learning. Proximal gradient methods are
used as standard approaches to solve OWL regression. However, solving OWL
regression remains challenging due to its considerable computational cost and
memory usage when the feature or sample size is large. In this paper, we
propose the first safe screening rule for OWL regression by iteratively
exploring the order of the primal solution despite its unknown order structure,
which overcomes the difficulty posed by the non-separable regularizer. The rule
avoids updating parameters whose coefficients must be zero during the learning
process. More importantly, the
proposed screening rule can be easily applied to standard and stochastic
proximal gradient methods. Moreover, we prove that the algorithms equipped
with our screening rule are guaranteed to produce results identical to those of the original
algorithms. Experimental results on a variety of datasets show that our
screening rule leads to a significant computational gain without any loss of
accuracy, compared to existing competitive algorithms.
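The OWL penalty applies nonincreasing weights to the sorted coefficient magnitudes, $\sum_i w_i |\beta|_{(i)}$, so a proximal gradient step needs the prox of this non-separable regularizer, which can be computed by sorting followed by an isotonic (pool-adjacent-violators) projection. Below is a minimal sketch of that general OWL/SLOPE prox, not of the paper's screening rule; the function name `prox_owl` is ours.

```python
import numpy as np

def prox_owl(v, w):
    """Prox of the OWL penalty sum_i w_i * |v|_(i) (w nonincreasing, >= 0):
    sort |v| descending, subtract w, project onto the nonincreasing
    nonnegative cone via pool-adjacent-violators, then undo sort and signs."""
    order = np.argsort(np.abs(v))[::-1]      # positions of |v| in descending order
    z = np.abs(v)[order] - w                 # shifted sorted magnitudes
    blocks = []                              # stack of (block mean, block size)
    for zi in z:
        mean, cnt = float(zi), 1
        # Merge blocks while the nonincreasing constraint is violated
        while blocks and blocks[-1][0] <= mean:
            pm, pc = blocks.pop()
            mean = (pm * pc + mean * cnt) / (pc + cnt)
            cnt += pc
        blocks.append((mean, cnt))
    x_sorted = np.concatenate([np.full(c, max(m, 0.0)) for m, c in blocks])
    x = np.empty_like(x_sorted)
    x[order] = x_sorted                      # undo the sort
    return np.sign(v) * x
```

With all-zero weights the prox is the identity, and with equal weights it reduces to ordinary soft-thresholding, which makes the sketch easy to sanity-check.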
Related papers
- A randomized algorithm to solve reduced rank operator regression [27.513149895229837]
We present and analyze an algorithm designed for addressing vector-valued regression problems involving possibly infinite-dimensional input and output spaces.
The algorithm is a randomized adaptation of reduced rank regression, a technique to optimally learn a low-rank vector-valued function.
arXiv Detail & Related papers (2023-12-28T20:29:59Z) - Iteratively Refined Behavior Regularization for Offline Reinforcement
Learning [57.10922880400715]
In this paper, we propose a new algorithm that substantially enhances behavior-regularization based on conservative policy iteration.
By iteratively refining the reference policy used for behavior regularization, the conservative policy update guarantees gradual improvement.
Experimental results on the D4RL benchmark indicate that our method outperforms previous state-of-the-art baselines in most tasks.
arXiv Detail & Related papers (2023-06-09T07:46:24Z) - Scalable Gaussian-process regression and variable selection using
Vecchia approximations [3.4163060063961255]
We propose Vecchia-based mini-batch subsampling, which provides unbiased gradient estimators.
arXiv Detail & Related papers (2022-02-25T21:22:38Z) - The Benefits of Implicit Regularization from SGD in Least Squares
Problems [116.85246178212616]
Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice.
We make comparisons of the implicit regularization afforded by (unregularized) average SGD with the explicit regularization of ridge regression.
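The comparison can be sketched on a toy least-squares problem: ridge applies explicit $\ell_2$ regularization in closed form, while unregularized constant-stepsize SGD with iterate averaging exhibits an implicit regularization effect. The setup below is a minimal illustration with constants of our choosing, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.1 * rng.normal(size=n)

# Explicit regularization: ridge regression in closed form.
lam = 0.1
theta_ridge = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)

# Implicit regularization: unregularized constant-stepsize SGD
# with running iterate averaging.
theta, avg, step = np.zeros(d), np.zeros(d), 0.01
for t in range(5000):
    i = rng.integers(n)                       # sample one row uniformly
    theta -= step * (X[i] @ theta - y[i]) * X[i]
    avg += (theta - avg) / (t + 1)            # running average of iterates
```

Both estimators land near the true parameter on this well-conditioned toy problem; the interesting regimes the paper studies are where their biases and variances diverge.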
arXiv Detail & Related papers (2021-08-10T09:56:47Z) - Robust Regression Revisited: Acceleration and Improved Estimation Rates [25.54653340884806]
We study fast algorithms for statistical regression problems under the strong contamination model.
The goal is to approximately optimize a generalized linear model (GLM) given adversarially corrupted samples.
We present nearly-linear time algorithms for robust regression problems with improved runtime or estimation guarantees.
arXiv Detail & Related papers (2021-06-22T17:21:56Z) - Risk Minimization from Adaptively Collected Data: Guarantees for
Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
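The reweighting idea can be sketched on a toy logged-bandit dataset: each sample is weighted by the ratio of the target policy's action probability to the logged propensity. This is a simplified blockwise-stationary illustration of importance-sampling weighting, not the paper's adaptive-data algorithm; all names and constants are ours.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy logged bandit data whose logging propensities change over time.
n, k = 1000, 3
p_first, p_second = np.array([0.6, 0.2, 0.2]), np.array([0.2, 0.6, 0.2])
propensities = np.vstack([np.tile(p_first, (n // 2, 1)),
                          np.tile(p_second, (n - n // 2, 1))])
actions = np.array([rng.choice(k, p=p) for p in propensities])
rewards = rng.normal(size=n) + (actions == 0)   # arm 0 is best on average

# Importance-sampling weighted estimate of a target policy's value:
# each sample is reweighted by pi(a|x) / e_t(a|x).
pi = np.array([1.0, 0.0, 0.0])                  # target: always play arm 0
weights = pi[actions] / propensities[np.arange(n), actions]
value_hat = np.mean(weights * rewards)
```

Because the weights cancel the logging distribution, the estimate is unbiased for the target policy's value (about 1.0 here) even though the logging policy changed midway.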
arXiv Detail & Related papers (2021-06-03T09:50:13Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting in practice.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - A spectral algorithm for robust regression with subgaussian rates [0.0]
We study a new algorithm, running in linear up to quadratic time, for linear regression in the absence of strong assumptions on the underlying distributions of samples.
The goal is to design a procedure which attains the optimal sub-gaussian error bound even though the data have only finite moments.
arXiv Detail & Related papers (2020-07-12T19:33:50Z) - Provable Training of a ReLU Gate with an Iterative Non-Gradient
Algorithm [0.7614628596146599]
We show provable guarantees on the training of a single ReLU gate in hitherto unexplored regimes.
We show a first-of-its-kind approximate recovery of the true label generating parameters under an (online) data-poisoning attack on the true labels.
Our guarantee is shown to be nearly optimal in the worst case and its accuracy of recovering the true weight degrades gracefully with increasing probability of attack and its magnitude.
arXiv Detail & Related papers (2020-05-08T17:59:23Z) - Beyond UCB: Optimal and Efficient Contextual Bandits with Regression
Oracles [112.89548995091182]
We provide the first universal and optimal reduction from contextual bandits to online regression.
Our algorithm requires no distributional assumptions beyond realizability, and works even when contexts are chosen adversarially.
arXiv Detail & Related papers (2020-02-12T11:33:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.