Outlier Robust Mean Estimation with Subgaussian Rates via Stability
- URL: http://arxiv.org/abs/2007.15618v2
- Date: Tue, 16 Mar 2021 15:58:25 GMT
- Title: Outlier Robust Mean Estimation with Subgaussian Rates via Stability
- Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia
- Abstract summary: We study the problem of outlier-robust high-dimensional mean estimation.
We obtain the first computationally efficient algorithm with a subgaussian rate for outlier-robust mean estimation.
- Score: 46.03021473600576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of outlier robust high-dimensional mean estimation under
a finite covariance assumption, and more broadly under finite low-degree moment
assumptions. We consider a standard stability condition from the recent robust
statistics literature and prove that, except with exponentially small failure
probability, there exists a large fraction of the inliers satisfying this
condition. As a corollary, it follows that a number of recently developed
algorithms for robust mean estimation, including iterative filtering and
non-convex gradient descent, give optimal error estimators with
(near-)subgaussian rates. Previous analyses of these algorithms gave
significantly suboptimal rates. As a corollary of our approach, we obtain the
first computationally efficient algorithm with subgaussian rate for
outlier-robust mean estimation in the strong contamination model under a finite
covariance assumption.
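The iterative filtering approach mentioned in the abstract can be illustrated with a minimal sketch: repeatedly find the direction of largest empirical variance and discard the points that deviate most along it. The thresholds and stopping rule below are hypothetical illustrations, not the paper's certified algorithm.

```python
import numpy as np

def filter_mean(X, eps, max_iter=50):
    """Minimal sketch of iterative filtering for robust mean estimation.

    Assumes inliers have covariance roughly the identity. Repeatedly finds
    the direction of largest empirical variance and removes the most extreme
    eps-fraction of points along it, until the top eigenvalue of the
    empirical covariance looks consistent with the inliers.
    Heuristic thresholds for illustration only.
    """
    X = np.asarray(X, dtype=float)
    for _ in range(max_iter):
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)
        lam, v = eigvals[-1], eigvecs[:, -1]
        if lam <= 1 + 3 * eps:  # heuristic stopping rule (assumed unit scale)
            break
        # Score each point by its squared deviation along the worst direction.
        scores = ((X - mu) @ v) ** 2
        k = max(1, int(eps * len(X)))
        keep = np.argsort(scores)[: len(X) - k]
        X = X[keep]
    return X.mean(axis=0)
```

On a sample with a small fraction of far-away outliers, the filter removes the outliers before averaging, whereas the plain empirical mean would be shifted by roughly the contamination fraction times the outlier distance.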
Related papers
- Private Robust Estimation by Stabilizing Convex Relaxations [22.513117502159922]
$(\epsilon, \delta)$-differentially private (DP)
arXiv Detail & Related papers (2021-12-07T07:47:37Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
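The clipped gradient descent idea above can be sketched for the simplest case, mean estimation from a stream. This is a generic illustration of gradient clipping under heavy-tailed noise, not the paper's analyzed method; the stepsize and clipping level are assumptions.

```python
import numpy as np

def clipped_sgd_mean(stream, clip=3.0):
    """Sketch of clipped stochastic gradient descent for mean estimation.

    The gradient of the loss 0.5*||x - theta||^2 at a sample x is
    theta - x; clipping its norm limits the influence of heavy-tailed
    samples. With a 1/t stepsize and no clipping this reduces to the
    running mean. Illustrative only, not the paper's algorithm.
    """
    theta = None
    for t, x in enumerate(stream):
        x = np.asarray(x, dtype=float)
        if theta is None:
            theta = np.zeros_like(x)
        g = theta - x                     # per-sample gradient
        norm = np.linalg.norm(g)
        if norm > clip:
            g = g * (clip / norm)         # clip the gradient norm
        theta = theta - g / (t + 1)       # 1/t stepsize
    return theta
```

Because clipping is symmetric around the current estimate, it introduces no bias for symmetric noise while taming the variance contributed by the tails.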
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Non-asymptotic analysis and inference for an outlyingness induced
winsorized mean [0.0]
This article investigates the robustness of leading sub-Gaussian estimators of the mean.
It reveals that none of them can resist more than $25\%$ contamination in the data.
It also introduces an outlyingness-induced winsorized mean which has the best possible robustness.
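A generic winsorized mean, the basic construction this estimator builds on, can be sketched as follows. This clamps at fixed sample quantiles for illustration; the paper's estimator instead chooses its cutoffs via an outlyingness measure.

```python
import numpy as np

def winsorized_mean(x, alpha=0.25):
    """Sketch of a winsorized mean: clamp the lowest and highest
    alpha-fraction of the sample to the corresponding quantiles,
    then average. Generic illustration, not the paper's
    outlyingness-induced estimator.
    """
    x = np.asarray(x, dtype=float)
    lo = np.quantile(x, alpha)
    hi = np.quantile(x, 1 - alpha)
    return np.clip(x, lo, hi).mean()
```

Unlike trimming, winsorizing keeps the sample size fixed: extreme values are pulled in to the quantile cutoffs rather than discarded, so a small fraction of gross outliers shifts the estimate only by a bounded amount.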
arXiv Detail & Related papers (2021-05-05T21:35:24Z) - Robust regression with covariate filtering: Heavy tails and adversarial
contamination [6.939768185086755]
We show how to modify the Huber regression, least trimmed squares, and least absolute deviation estimators to obtain estimators that are simultaneously computationally and statistically efficient in this stronger contamination model.
We show that the Huber regression estimator achieves near-optimal error rates in this setting, whereas the least trimmed squares and least absolute deviation estimators can be made to achieve near-optimal error after applying a postprocessing step.
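For context, plain Huber regression can be sketched with gradient descent. This is a generic solver for the standard Huber loss, shown only to make the estimator concrete; the paper's contribution is the covariate-filtering modification and its analysis, and the stepsize, threshold, and iteration count here are assumptions.

```python
import numpy as np

def huber_regression(X, y, delta=1.0, lr=0.1, iters=500):
    """Sketch of Huber regression via gradient descent.

    The Huber loss is quadratic for residuals with |r| <= delta and
    linear beyond, so its gradient in r is clip(r, -delta, delta);
    large (possibly corrupted) residuals have bounded influence.
    Generic illustration, not the paper's modified estimator.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        r = X @ w - y
        psi = np.clip(r, -delta, delta)   # gradient of the Huber loss in r
        w -= lr * (X.T @ psi) / n
    return w
```

With a few percent of grossly corrupted responses, the clipped residuals cap each outlier's pull on the fit, so the recovered coefficients stay close to the true ones where ordinary least squares would be badly biased.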
arXiv Detail & Related papers (2020-09-27T22:48:48Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy guarantees based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical objective based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z) - High-Dimensional Robust Mean Estimation via Gradient Descent [73.61354272612752]
We show that the problem of robust mean estimation in the presence of a constant adversarial fraction can be solved by gradient descent.
Our work establishes an intriguing connection between non-convex optimization and robust statistics.
arXiv Detail & Related papers (2020-05-04T10:48:04Z) - Robust subgaussian estimation with VC-dimension [0.0]
This work proposes a new general way to bound the excess risk for MOM estimators.
The core technique is the use of VC-dimension (instead of Rademacher complexity) to measure the statistical complexity.
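The MOM (median-of-means) estimator that this analysis applies to can be sketched in a few lines; the block count `k` is a tuning parameter, and this scalar version is an illustration only.

```python
import numpy as np

def median_of_means(x, k=11):
    """Sketch of the median-of-means (MOM) estimator: split the sample
    into k blocks, average each block, and return the median of the
    block means. For i.i.d. data a fixed split suffices; adversarially
    ordered data would call for a random partition.
    """
    x = np.asarray(x, dtype=float)
    blocks = np.array_split(x, k)
    return np.median([b.mean() for b in blocks])
```

The median step means a few blocks ruined by heavy-tailed draws cannot move the estimate, which is what yields sub-Gaussian deviation bounds under only a finite-variance assumption.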
arXiv Detail & Related papers (2020-04-24T13:21:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.