Retire: Robust Expectile Regression in High Dimensions
- URL: http://arxiv.org/abs/2212.05562v2
- Date: Wed, 22 Mar 2023 15:06:30 GMT
- Title: Retire: Robust Expectile Regression in High Dimensions
- Authors: Rebeka Man, Kean Ming Tan, Zian Wang, and Wen-Xin Zhou
- Abstract summary: Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data.
We propose and study (penalized) robust expectile regression (retire).
We show that the proposed procedure can be efficiently solved by a semismooth Newton coordinate descent algorithm.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-dimensional data can often display heterogeneity due to heteroscedastic
variance or inhomogeneous covariate effects. Penalized quantile and expectile
regression methods offer useful tools to detect heteroscedasticity in
high-dimensional data. The former is computationally challenging due to the
non-smooth nature of the check loss, and the latter is sensitive to
heavy-tailed error distributions. In this paper, we propose and study
(penalized) robust expectile regression (retire), with a focus on iteratively
reweighted $\ell_1$-penalization which reduces the estimation bias from
$\ell_1$-penalization and leads to oracle properties. Theoretically, we
establish the statistical properties of the retire estimator under two regimes:
(i) low-dimensional regime in which $d \ll n$; (ii) high-dimensional regime in
which $s\ll n\ll d$ with $s$ denoting the number of significant predictors. In
the high-dimensional setting, we carefully characterize the solution path of
the iteratively reweighted $\ell_1$-penalized retire estimation, adapted from
the local linear approximation algorithm for folded-concave regularization.
Under a mild minimum signal strength condition, we show that after as many as
$\log(\log d)$ iterations the final iterate enjoys the oracle convergence rate.
At each iteration, the weighted $\ell_1$-penalized convex program can be
efficiently solved by a semismooth Newton coordinate descent algorithm.
Numerical studies demonstrate the competitive performance of the proposed
procedure compared with either non-robust or quantile regression based
alternatives.
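
As a concrete illustration of the procedure described above, the sketch below pairs a Huberized asymmetric squared loss, $\ell_{\gamma}(u) = |\tau - \mathbf{1}(u<0)|\{u^2\mathbf{1}(|u|\le\gamma) + \gamma(2|u|-\gamma)\mathbf{1}(|u|>\gamma)\}$, with iteratively reweighted $\ell_1$-penalization, updating the penalty weights from the SCAD derivative as in the local linear approximation algorithm. This is a minimal sketch rather than the authors' implementation: each weighted $\ell_1$ subproblem is solved by plain proximal gradient descent with soft-thresholding as a stand-in for the paper's semismooth Newton coordinate descent solver, and all function names, the robustification parameter $\gamma$, and the SCAD default $a=3.7$ are illustrative assumptions.

```python
import numpy as np

def retire_grad(X, y, beta, tau=0.5, gamma=1.0):
    """Gradient of the Huberized asymmetric squared ("retire") loss.

    For residual u = y - X @ beta, the loss per observation is
        |tau - 1{u < 0}| * u**2                      if |u| <= gamma,
        |tau - 1{u < 0}| * gamma * (2*|u| - gamma)   otherwise.
    """
    u = y - X @ beta
    w = np.where(u < 0, 1.0 - tau, tau)                        # expectile asymmetry
    psi = np.where(np.abs(u) <= gamma, u, gamma * np.sign(u))  # Huber score
    return -2.0 * X.T @ (w * psi) / len(y)

def scad_weights(beta, lam, a=3.7):
    """LLA weights p'_lam(|b_j|) / lam derived from the SCAD penalty."""
    b = np.abs(beta)
    return np.where(b <= lam, 1.0,
                    np.clip(a * lam - b, 0.0, None) / ((a - 1.0) * lam))

def weighted_l1_retire(X, y, lam, w, tau, gamma, step, n_steps=500):
    """One weighted-l1-penalized convex subproblem, solved here by proximal
    gradient descent with soft-thresholding (a simple stand-in for the
    paper's semismooth Newton coordinate descent algorithm)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        z = beta - step * retire_grad(X, y, beta, tau, gamma)
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * w, 0.0)
    return beta

def retire_irw(X, y, lam, tau=0.5, gamma=1.0, n_irw=3):
    """Iteratively reweighted l1-penalized retire estimator (LLA scheme)."""
    n, d = X.shape
    step = n / (2.0 * np.linalg.norm(X, 2) ** 2)  # 1/L for the smooth part
    w = np.ones(d)                                # first pass: plain lasso
    for _ in range(n_irw):                        # a few reweighting passes
        beta = weighted_l1_retire(X, y, lam, w, tau, gamma, step)
        w = scad_weights(beta, lam)               # reweight via SCAD derivative
    return beta
```

For instance, `retire_irw(X, y, lam=0.1, tau=0.8)` would fit a sparse 0.8-expectile regression; in practice $\lambda$ and $\gamma$ require tuning, with the robustification parameter $\gamma$ typically chosen to grow with the sample size in this line of work.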
Related papers
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z)
- Stochastic Optimization Algorithms for Instrumental Variable Regression with Streaming Data [17.657917523817243]
We develop and analyze algorithms for instrumental variable regression by viewing the problem as a conditional optimization problem.
In the context of least-squares instrumental variable regression, our algorithms neither require matrix inversions nor mini-batches.
We derive rates of convergence in expectation that are of order $\mathcal{O}(\log T/T)$ and $\mathcal{O}(1/T^{1-\iota})$ for any $\iota>0$.
arXiv Detail & Related papers (2024-05-29T19:21:55Z)
- Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression.
We transform the original quantile regression into a least-squares optimization problem.
An efficient algorithm is developed, which enjoys high computational and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z)
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- The Adaptive $\tau$-Lasso: Robustness and Oracle Properties [12.06248959194646]
This paper introduces a new regularized version of the robust $\tau$-regression estimator for analyzing high-dimensional datasets.
The resulting estimator, termed adaptive $\tau$-Lasso, is robust to outliers and high-leverage points.
In the face of outliers and high-leverage points, the adaptive $\tau$-Lasso and $\tau$-Lasso estimators achieve the best or close-to-best performance.
arXiv Detail & Related papers (2023-04-18T21:34:14Z)
- Near Optimal Private and Robust Linear Regression [47.2888113094367]
We propose a variant of the popular differentially private stochastic gradient descent (DP-SGD) algorithm with two innovations.
Under label-corruption, this is the first efficient linear regression algorithm to guarantee both $(\varepsilon,\delta)$-DP and robustness.
arXiv Detail & Related papers (2023-01-30T20:33:26Z)
- Streaming Sparse Linear Regression [1.8707139489039097]
We propose a novel online sparse linear regression framework for analyzing streaming data when data points arrive sequentially.
Our proposed method is memory efficient and requires less stringent restricted strong convexity assumptions.
arXiv Detail & Related papers (2022-11-11T07:31:55Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p\geq 2$, $\beta>\frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta>\frac{d}{2}$ or $p=\infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z)
- Conditional Uncorrelation and Efficient Non-approximate Subset Selection in Sparse Regression [72.84177488527398]
We consider sparse regression from the view of correlation, and propose the formula of conditional uncorrelation.
By the proposed method, the computational complexity is reduced from $O(\frac{1}{6}k^3+mk^2+mkd)$ to $O(\frac{1}{6}k^3+\frac{1}{2}mk^2)$ for each candidate subset in sparse regression.
arXiv Detail & Related papers (2020-09-08T20:32:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.