Globally-convergent Iteratively Reweighted Least Squares for Robust
Regression Problems
- URL: http://arxiv.org/abs/2006.14211v1
- Date: Thu, 25 Jun 2020 07:16:13 GMT
- Title: Globally-convergent Iteratively Reweighted Least Squares for Robust
Regression Problems
- Authors: Bhaskar Mukhoty and Govind Gopakumar and Prateek Jain and Purushottam
Kar
- Abstract summary: We provide the first global model recovery results for the IRLS (iteratively reweighted least squares) for robust regression problems.
We propose augmentations to the basic IRLS routine that not only offer guaranteed global recovery, but in practice also outperform state-of-the-art algorithms for robust regression.
- Score: 15.823258699608994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide the first global model recovery results for the IRLS (iteratively
reweighted least squares) heuristic for robust regression problems. IRLS is
known to offer excellent performance, despite bad initializations and data
corruption, for several parameter estimation problems. Existing analyses of
IRLS frequently require careful initialization, thus offering only local
convergence guarantees. We remedy this by proposing augmentations to the basic
IRLS routine that not only offer guaranteed global recovery, but in practice
also outperform state-of-the-art algorithms for robust regression. Our routines
are more immune to hyperparameter misspecification in basic regression tasks,
as well as applied tasks such as linear-armed bandit problems. Our theoretical
analyses rely on a novel extension of the notions of strong convexity and
smoothness to weighted strong convexity and smoothness, and establishing that
sub-Gaussian designs offer bounded weighted condition numbers. These notions
may be useful in analyzing other algorithms as well.
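To make the setting concrete, below is a minimal sketch of the classical IRLS heuristic for robust linear regression that the abstract refers to, using Huber-style residual weights. The weighting scheme, tolerances, and constants here are illustrative assumptions; the paper's augmented routines that achieve guaranteed global recovery are not reproduced.

```python
import numpy as np

def irls_robust_regression(X, y, loss_scale=1.0, n_iters=50, tol=1e-8):
    """Minimal IRLS sketch for robust linear regression with Huber-style weights.

    This is the classical heuristic only; the paper's augmentations that give
    global recovery guarantees are not reproduced here.
    """
    n, d = X.shape
    w = np.linalg.lstsq(X, y, rcond=None)[0]      # ordinary least-squares start
    for _ in range(n_iters):
        r = y - X @ w                             # residuals under current model
        # Huber-style weights: points with large residuals are down-weighted
        s = np.maximum(np.abs(r), 1e-12)
        weights = np.where(s <= loss_scale, 1.0, loss_scale / s)
        # weighted least-squares step: solve (X^T W X) w = X^T W y
        Xw = X * weights[:, None]
        w_new = np.linalg.solve(Xw.T @ X + 1e-10 * np.eye(d), Xw.T @ y)
        if np.linalg.norm(w_new - w) <= tol:
            break
        w = w_new
    return w
```

As the abstract notes, a plain loop like this is only known to converge locally from careful initializations; the paper's contribution is augmenting it so that global model recovery is guaranteed.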
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron [121.10338065441417]
- Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron [121.10338065441417]
We analyze a Perceptron-type algorithm called GLM-tron and provide dimension-free risk upper bounds for it in high-dimensional ReLU regression.
Our results suggest that GLM-tron might be preferable to SGD for high-dimensional ReLU regression.
arXiv Detail & Related papers (2023-03-03T23:02:23Z) - A Bayesian Robust Regression Method for Corrupted Data Reconstruction [5.298637115178182]
- A Bayesian Robust Regression Method for Corrupted Data Reconstruction [5.298637115178182]
We develop an effective robust regression method that can resist adaptive adversarial attacks.
First, we propose the novel TRIP (hard Thresholding approach to Robust regression with sImple Prior) algorithm.
We then use the idea of Bayesian reweighting to construct the more robust BRHT (robust Bayesian Reweighting regression via Hard Thresholding) algorithm.
arXiv Detail & Related papers (2022-12-24T17:25:53Z) - Stability and Generalization Analysis of Gradient Methods for Shallow
- Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks [59.142826407441106]
We study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability.
We consider gradient descent (GD) and stochastic gradient descent (SGD) to train SNNs, and for both we develop consistent excess risk bounds.
arXiv Detail & Related papers (2022-09-19T18:48:00Z) - SLURP: Side Learning Uncertainty for Regression Problems [3.5321916087562304]
We propose SLURP, a generic approach for regression uncertainty estimation via a side learner.
We test SLURP on two critical regression tasks in computer vision: monocular depth and optical flow estimation.
arXiv Detail & Related papers (2021-10-21T14:50:42Z) - The Benefits of Implicit Regularization from SGD in Least Squares
Problems [116.85246178212616]
Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice.
We make comparisons of the implicit regularization afforded by (unregularized) average SGD with the explicit regularization of ridge regression.
arXiv Detail & Related papers (2021-08-10T09:56:47Z) - A general sample complexity analysis of vanilla policy gradient [101.16957584135767]
- A general sample complexity analysis of vanilla policy gradient [101.16957584135767]
Policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems.
We provide a general sample complexity analysis of the "vanilla" PG method.
arXiv Detail & Related papers (2021-07-23T19:38:17Z) - Towards Understanding Generalization via Decomposing Excess Risk
Dynamics [13.4379473119565]
We analyze the generalization dynamics to derive algorithm-dependent bounds, e.g., stability.
Inspired by the observation that neural networks show a slow convergence rate when fitting noise, we propose decomposing the excess risk dynamics.
Under the decomposition framework, the new bound accords better with the theoretical and empirical evidence compared to the stability-based bound and uniform convergence bound.
arXiv Detail & Related papers (2021-06-11T03:42:45Z) - Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a new regression analysis for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
arXiv Detail & Related papers (2020-06-29T23:35:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.