Non-Asymptotic Guarantees for Robust Statistical Learning under
$(1+\varepsilon)$-th Moment Assumption
- URL: http://arxiv.org/abs/2201.03182v1
- Date: Mon, 10 Jan 2022 06:22:30 GMT
- Title: Non-Asymptotic Guarantees for Robust Statistical Learning under
$(1+\varepsilon)$-th Moment Assumption
- Authors: Lihu Xu, Fang Yao, Qiuran Yao, Huiming Zhang
- Abstract summary: This paper proposes a log-truncated M-estimator for a large family of statistical regressions.
We show the superiority of log-truncated estimations over standard estimations.
- Score: 0.716879432974126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been a surge of interest in developing robust estimators for models
with heavy-tailed data in statistics and machine learning. This paper proposes
a log-truncated M-estimator for a large family of statistical regressions and
establishes its excess risk bound under the condition that the data have
$(1+\varepsilon)$-th moment with $\varepsilon \in (0,1]$. With an additional
assumption on the associated risk function, we obtain an $\ell_2$-error bound
for the estimation. Our theorems are applied to establish robust M-estimators
for concrete regressions. Besides convex regressions such as quantile
regression and generalized linear models, many non-convex regressions can also
be fit into our theorems; we focus on robust deep neural network regressions,
which can be solved by stochastic gradient descent algorithms. Simulations
and real data analysis demonstrate the superiority of log-truncated estimations
over standard estimations.
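As a concrete illustration (a minimal sketch, not the estimator defined in the paper), the snippet below applies a Catoni-style log-truncation phi(x) = log(1 + x + x^2/2) to a scaled squared loss and fits a linear model by gradient descent; the scale parameter `lam`, the learning rate, and the toy data are assumptions made for illustration only.

```python
# Minimal sketch (not the paper's exact estimator): a log-truncated
# least-squares M-estimator with a Catoni-style influence function
#   phi(x) = log(1 + x + x^2 / 2),
# applied to a scaled per-sample loss. The scale parameter `lam` is a
# hypothetical tuning constant; the paper derives its choice from the
# (1 + eps)-th moment assumption.
import numpy as np

def phi(x):
    # log-truncation of a nonnegative loss value
    return np.log1p(x + 0.5 * x ** 2)

def phi_prime(x):
    # derivative of phi; acts as a data-dependent down-weighting factor
    return (1.0 + x) / (1.0 + x + 0.5 * x ** 2)

def fit_log_truncated_ls(X, y, lam=0.2, lr=1e-2, epochs=500):
    """Gradient descent on the truncated empirical risk
       R(theta) = (1 / (n * lam)) * sum_i phi(lam * loss_i(theta)),
       with loss_i(theta) = 0.5 * (x_i^T theta - y_i)^2.
    The paper solves deep network regressions with stochastic gradient
    descent; full-batch gradient descent is used here for brevity."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        r = X @ theta - y                # residuals
        loss = 0.5 * r ** 2              # per-sample squared loss
        w = phi_prime(lam * loss)        # truncation weights (small for outliers)
        theta -= lr * (X.T @ (w * r)) / n
    return theta

# Toy usage with heavy-tailed (Student-t, df = 1.5) noise:
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
theta_star = np.array([1.0, -2.0, 0.5])
y = X @ theta_star + rng.standard_t(df=1.5, size=500)
theta_hat = fit_log_truncated_ls(X, y)
```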
Related papers
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We analyze the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory across a variety of high dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression [29.57766164934947]
We investigate popular resampling methods for estimating the uncertainty of statistical models.
We provide a tight description of the biases and variances estimated by these methods in the context of generalized linear models.
arXiv Detail & Related papers (2024-02-21T08:50:33Z) - Efficient Truncated Linear Regression with Unknown Noise Variance [26.870279729431328]
We provide the first computationally and statistically efficient estimators for truncated linear regression when the noise variance is unknown.
Our estimator is based on an efficient implementation of Projected Gradient Descent on the negative log-likelihood of the truncated sample.
arXiv Detail & Related papers (2022-08-25T12:17:37Z) - $p$-Generalized Probit Regression and Scalable Maximum Likelihood
Estimation via Sketching and Coresets [74.37849422071206]
We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses.
We show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data.
arXiv Detail & Related papers (2022-03-25T10:54:41Z) - Stability and Risk Bounds of Iterative Hard Thresholding [41.082982732100696]
We introduce a novel sparse generalization theory for IHT under the notion of algorithmic stability.
We show that IHT with sparsity level $k$ enjoys an $\tilde{\mathcal{O}}(n^{-1/2}\sqrt{\log(n)\log(p)})$ rate of convergence in sparse excess risk.
Preliminary numerical evidence is provided to confirm our theoretical predictions.
arXiv Detail & Related papers (2022-03-17T16:12:56Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Understanding the Under-Coverage Bias in Uncertainty Estimation [58.03725169462616]
Quantile regression tends to under-cover relative to the desired coverage level in practice.
We prove that quantile regression suffers from an inherent under-coverage bias.
Our theory reveals that this under-coverage bias stems from a certain high-dimensional parameter estimation error.
arXiv Detail & Related papers (2021-06-10T06:11:55Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Robust Geodesic Regression [6.827783641211451]
We use M-type estimators, including the $L_1$, Huber and Tukey biweight estimators, to perform robust geodesic regression.
Results from numerical examples, including analysis of real neuroimaging data, demonstrate the promising empirical properties of the proposed approach (a minimal Euclidean sketch of such M-type fitting follows this list).
arXiv Detail & Related papers (2020-07-09T02:41:32Z)
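As a concrete illustration of the M-type estimators mentioned in the entry above (a minimal sketch under simplifying assumptions), the snippet below fits a Huber M-estimator in the ordinary Euclidean linear-regression case via iteratively reweighted least squares; the cited paper instead performs regression along geodesics on manifolds, and the constants used here are standard defaults, not values taken from it.

```python
# Minimal sketch of an M-type (Huber) estimator in the Euclidean,
# linear-regression case, fitted by iteratively reweighted least squares.
# The tuning constant c = 1.345 and the MAD-based scale are standard
# defaults, not taken from the cited geodesic-regression paper.
import numpy as np

def huber_weights(u, c=1.345):
    # Huber psi(u) / u: 1 inside [-c, c], c / |u| outside
    a = np.abs(u)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def huber_irls(X, y, c=1.345, iters=50, tol=1e-8):
    theta = np.linalg.lstsq(X, y, rcond=None)[0]          # OLS initialization
    for _ in range(iters):
        r = y - X @ theta
        scale = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12  # MAD scale
        w = huber_weights(r / scale, c)                    # robust weights
        theta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        if np.linalg.norm(theta_new - theta) < tol:
            return theta_new
        theta = theta_new
    return theta
```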