Robust Linear Regression: Optimal Rates in Polynomial Time
- URL: http://arxiv.org/abs/2007.01394v4
- Date: Fri, 4 Dec 2020 15:18:28 GMT
- Title: Robust Linear Regression: Optimal Rates in Polynomial Time
- Authors: Ainesh Bakshi and Adarsh Prasad
- Abstract summary: We obtain robust and computationally efficient estimators for learning several linear models.
We identify an analytic condition that serves as a relaxation of independence of random variables.
Our central technical contribution is to algorithmically exploit independence of random variables in the "sum-of-squares" framework.
- Score: 11.646151402884215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We obtain robust and computationally efficient estimators for learning
several linear models that achieve statistically optimal convergence rate under
minimal distributional assumptions. Concretely, we assume our data is drawn
from a $k$-hypercontractive distribution and an $\epsilon$-fraction is
adversarially corrupted. We then describe an estimator that converges to the
optimal least-squares minimizer for the true distribution at a rate
proportional to $\epsilon^{2-2/k}$, when the noise is independent of the
covariates. We note that no such estimator was known prior to our work, even
with access to unbounded computation. The rate we achieve is
information-theoretically optimal and thus we resolve the main open question in
Klivans, Kothari and Meka [COLT'18].
Our key insight is to identify an analytic condition that serves as a
polynomial relaxation of independence of random variables. In particular, we
show that when the moments of the noise and covariates are
negatively-correlated, we obtain the same rate as independent noise. Further,
when the condition is not satisfied, we obtain a rate proportional to
$\epsilon^{2-4/k}$, and again match the information-theoretic lower bound. Our
central technical contribution is to algorithmically exploit independence of
random variables in the "sum-of-squares" framework by formulating it as the
aforementioned polynomial inequality.
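For concreteness, the hypercontractivity condition at the heart of this setup is typically stated as below; the display is our paraphrase of the standard definition and of the abstract's rate, not the paper's verbatim theorem statement.

```latex
% (C,k)-hypercontractivity: for every direction v in R^d,
\mathbb{E}_{x \sim D}\!\left[\langle x, v\rangle^{k}\right]
  \;\le\; \left(C\,\mathbb{E}_{x \sim D}\!\left[\langle x, v\rangle^{2}\right]\right)^{k/2}.
% With an \epsilon-fraction of adversarial corruptions and noise independent
% of the covariates, the abstract's guarantee on the excess least-squares
% loss reads, up to problem-dependent constants,
\mathrm{err}_D(\hat{w}) \;-\; \mathrm{err}_D(w^{*}) \;=\; O\!\left(\epsilon^{\,2-2/k}\right).
```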
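The estimator itself runs the sum-of-squares hierarchy and is not reproduced here. As a minimal illustration of the corruption model only, the sketch below corrupts an $\epsilon$-fraction of labels and contrasts ordinary least squares with a simple robust baseline (Huber regression, which is not the paper's algorithm and does not attain its rates):

```python
# Toy illustration of the eps-corruption model from the abstract.
# NOT the paper's sum-of-squares estimator; Huber regression is just a
# simple robust baseline for contrast with ordinary least squares.
import numpy as np
from sklearn.linear_model import LinearRegression, HuberRegressor

rng = np.random.default_rng(0)
n, d, eps = 2000, 10, 0.05          # samples, dimension, corruption fraction

w_star = rng.normal(size=d)          # ground-truth regressor
X = rng.normal(size=(n, d))          # covariates (Gaussian is k-hypercontractive)
y = X @ w_star + rng.normal(size=n)  # noise independent of the covariates

# Adversary replaces an eps-fraction of labels with large outliers.
idx = rng.choice(n, size=int(eps * n), replace=False)
y[idx] = 100.0 * rng.normal(size=idx.size)

ols = LinearRegression().fit(X, y)
huber = HuberRegressor(max_iter=500).fit(X, y)

print("OLS   parameter error:", np.linalg.norm(ols.coef_ - w_star))
print("Huber parameter error:", np.linalg.norm(huber.coef_ - w_star))
```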
Related papers
- Optimal minimax rate of learning interaction kernels [7.329333373512536]
We introduce a tamed least squares estimator (tLSE) that attains the optimal convergence rate for a broad class of exchangeable distributions.
Our findings reveal that, provided the inverse problem in the large-sample limit satisfies a coercivity condition, the left-tail probability does not alter the bias-variance tradeoff.
arXiv Detail & Related papers (2023-11-28T15:01:58Z) - High-probability Convergence Bounds for Nonlinear Stochastic Gradient Descent Under Heavy-tailed Noise [59.25598762373543]
We establish high-probability convergence guarantees for learning on streaming data in the presence of heavy-tailed noise.
We demonstrate, analytically and empirically, how to determine the preferred choice of setting for a given problem.
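For intuition, the nonlinearities studied in this line of work typically include clipped, normalized, and sign-based gradient updates; the following is a minimal clipped-SGD sketch on a toy quadratic with Cauchy (heavy-tailed) gradient noise, our illustration rather than the paper's exact setup:

```python
# Minimal clipped-SGD sketch (one common "nonlinear SGD"); illustrative only.
import numpy as np

rng = np.random.default_rng(1)
d, steps, lr, clip = 5, 5000, 0.05, 1.0
w_star = np.ones(d)
w = np.zeros(d)

for t in range(steps):
    # Stochastic gradient of 0.5*||w - w_star||^2 with heavy-tailed noise
    # (standard Cauchy noise has no finite mean, a classic heavy-tailed case).
    g = (w - w_star) + rng.standard_cauchy(size=d)
    g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # clip to norm <= clip
    w -= lr * g

print("distance to optimum:", np.linalg.norm(w - w_star))
```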
arXiv Detail & Related papers (2023-10-28T18:53:41Z) - Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models [4.010566541114989]
We propose a notion of simultaneous confidence intervals called the sparsified simultaneous confidence intervals.
Our intervals are sparse in the sense that some of the intervals' upper and lower bounds are shrunken to zero.
The proposed method can be coupled with various selection procedures, making it ideal for comparing their uncertainty.
arXiv Detail & Related papers (2023-07-14T18:37:57Z) - Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Improved Analysis of Score-based Generative Modeling: User-Friendly Bounds under Minimal Smoothness Assumptions [9.953088581242845]
We provide convergence guarantees with polynomial complexity for any data distribution with a finite second-order moment.
Our result does not rely on any log-concavity or functional inequality assumption.
Our theoretical analysis provides a comparison between different discrete approximations and may guide the choice of discretization points in practice.
arXiv Detail & Related papers (2022-11-03T15:51:00Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results in both the infinite-sample and the finite-sample regimes.
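As standard background (our addition, not the paper's notation), score matching minimizes the Fisher divergence between the model score and the data score:

```latex
% Standard score-matching objective (background; notation is ours):
J(\theta) \;=\; \tfrac{1}{2}\,\mathbb{E}_{x \sim p}\!\left[
  \left\| \nabla_x \log p_{\theta}(x) \;-\; \nabla_x \log p(x) \right\|_2^{2}
\right].
```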
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Distribution-Free Robust Linear Regression [5.532477732693]
We study random design linear regression with no assumptions on the distribution of the covariates.
We construct a non-linear estimator achieving excess risk of order $d/n$ with the optimal sub-exponential tail.
We prove an optimal version of the classical bound for the truncated least squares estimator due to Győrfi, Kohler, Krzyżak, and Walk.
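The truncated estimator of Győrfi, Kohler, Krzyżak, and Walk clips the fitted regression function at a level that grows with the sample size; below is a minimal sketch under that reading, where the threshold schedule is illustrative rather than the paper's choice:

```python
# Minimal sketch of a truncated least squares predictor: fit by least
# squares, then clip predictions at a level B_n. The clipping idea follows
# the textbook approach of Gyorfi, Kohler, Krzyzak, and Walk; the schedule
# B_n = c * log(n) below is an illustrative assumption.
import numpy as np

def truncated_least_squares(X, y, c=10.0):
    w, *_ = np.linalg.lstsq(X, y, rcond=None)    # ordinary least squares fit
    B = c * np.log(len(y))                       # truncation level B_n
    return lambda X_new: np.clip(X_new @ w, -B, B)

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_t(df=2, size=500)  # heavy tails
predict = truncated_least_squares(X, y)
print(predict(X[:5]))
```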
arXiv Detail & Related papers (2021-02-25T15:10:41Z) - Minimax Off-Policy Evaluation for Multi-Armed Bandits [58.7013651350436]
We study the problem of off-policy evaluation in the multi-armed bandit model with bounded rewards.
We develop minimax rate-optimal procedures under three settings.
arXiv Detail & Related papers (2021-01-19T18:55:29Z) - Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors [3.5788754401889014]
We study the problem of predicting as well as the best linear predictor in a bounded Euclidean ball with respect to the squared loss.
We discuss additional distributional assumptions sufficient to guarantee an $O(d/n)$ excess risk rate for the least squares estimator.
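In symbols (our paraphrase of the setup; the unit-ball radius is illustrative), the excess risk of an estimator $\hat{w}$ relative to the best linear predictor in a bounded Euclidean ball is

```latex
\mathcal{E}(\hat{w}) \;=\; \mathbb{E}\!\left[(\langle \hat{w}, x\rangle - y)^{2}\right]
  \;-\; \min_{\|w\|_2 \le 1} \mathbb{E}\!\left[(\langle w, x\rangle - y)^{2}\right],
\qquad \mathcal{E}(\hat{w}) = O(d/n).
```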
arXiv Detail & Related papers (2020-09-19T21:39:46Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification [54.22421582955454]
We provide the first optimal minimax guarantees on the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z)