Robust and Heavy-Tailed Mean Estimation Made Simple, via Regret
Minimization
- URL: http://arxiv.org/abs/2007.15839v2
- Date: Tue, 19 Jan 2021 00:27:26 GMT
- Title: Robust and Heavy-Tailed Mean Estimation Made Simple, via Regret
Minimization
- Authors: Samuel B. Hopkins, Jerry Li, Fred Zhang
- Abstract summary: We study the problem of estimating the mean of a distribution in high dimensions when either the samples are adversarially corrupted or the distribution is heavy-tailed.
We provide a meta-problem and a duality theorem that lead to a new unified view on robust and heavy-tailed mean estimation in high dimensions.
- Score: 21.64869023699999
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of estimating the mean of a distribution in high
dimensions when either the samples are adversarially corrupted or the
distribution is heavy-tailed. Recent developments in robust statistics have
established efficient and (near) optimal procedures for both settings. However,
the algorithms developed on each side tend to be sophisticated and do not
directly transfer to the other, with many of them having ad-hoc or complicated
analyses.
In this paper, we provide a meta-problem and a duality theorem that lead to a
new unified view on robust and heavy-tailed mean estimation in high dimensions.
We show that the meta-problem can be solved either by a variant of the Filter
algorithm from the recent literature on robust estimation or by the quantum
entropy scoring scheme (QUE), due to Dong, Hopkins and Li (NeurIPS '19). By
leveraging our duality theorem, these results translate into simple and
efficient algorithms for both robust and heavy-tailed settings. Furthermore,
the QUE-based procedure has run-time that matches the fastest known algorithms
on both fronts.
Our analysis of Filter is through the classic regret bound of the
multiplicative weights update method. This connection allows us to avoid the
technical complications in previous works and improve upon the run-time
analysis of a gradient-descent-based algorithm for robust mean estimation by
Cheng, Diakonikolas, Ge and Soltanolkotabi (ICML '20).
Related papers
- Overcoming Brittleness in Pareto-Optimal Learning-Augmented Algorithms [6.131022957085439]
We propose a new framework in which the smoothness in the performance of the algorithm is enforced by means of a user-specified profile.
We apply this new approach to a well-studied online problem, namely the one-way trading problem.
arXiv Detail & Related papers (2024-08-07T23:15:21Z) - Robust recovery for stochastic block models [16.74630355427558]
We develop an efficient algorithm for weak recovery in a robust version of the block model.
Our results show that there is no price of robustness in the block model.
arXiv Detail & Related papers (2021-11-16T15:43:00Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-robust optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to station yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP.
DR-Off-PAC adopts a single timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z) - Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of evaluations gradient independent of training set size and number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, time, approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - Stochastic Optimization for Regularized Wasserstein Estimators [10.194798773447879]
We introduce an algorithm to solve a regularized version of the problem of Wasserstein estimators gradient, with a time per step which is sublinear in the natural dimensions.
We show that this algorithm can be extended to other tasks, including estimation of Wasserstein barycenters.
arXiv Detail & Related papers (2020-02-20T12:04:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.