On the well-spread property and its relation to linear regression
- URL: http://arxiv.org/abs/2206.08092v1
- Date: Thu, 16 Jun 2022 11:17:44 GMT
- Title: On the well-spread property and its relation to linear regression
- Authors: Hongjie Chen, Tommaso d'Orsi
- Abstract summary: We show that consistent recovery of the parameter vector in a robust linear regression model is information-theoretically impossible.
We show that it is possible to efficiently certify whether a given $n$-by-$d$ matrix is well-spread if the number of observations is quadratic in the ambient dimension.
- Score: 4.619541348328937
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the robust linear regression model $\boldsymbol{y} = X\beta^* +
\boldsymbol{\eta}$, where an adversary oblivious to the design $X \in
\mathbb{R}^{n \times d}$ may choose $\boldsymbol{\eta}$ to corrupt all but a
(possibly vanishing) fraction of the observations $\boldsymbol{y}$ in an
arbitrary way. Recent work [dLN+21, dNS21] has introduced efficient algorithms
for consistent recovery of the parameter vector. These algorithms crucially
rely on the design matrix being well-spread (a matrix is well-spread if its
column span is far from any sparse vector).
In this paper, we show that there exists a family of design matrices lacking
well-spreadness such that consistent recovery of the parameter vector in the
above robust linear regression model is information-theoretically impossible.
We further investigate the average-case time complexity of certifying
well-spreadness of random matrices. We show that it is possible to efficiently
certify whether a given $n$-by-$d$ Gaussian matrix is well-spread if the number
of observations is quadratic in the ambient dimension. We complement this
result by showing rigorous evidence -- in the form of a lower bound against
low-degree polynomials -- of the computational hardness of this same
certification problem when the number of observations is $o(d^2)$.
Related papers
- Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $Omega(k2/pn1-2/p)$ lower bound for the sparse recovery problem, which is tight up to a $mathrmpoly(log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Sparse Linear Regression and Lattice Problems [9.50127504736299]
We give evidence of average-case hardness of SLR w.r.t. all efficient algorithms assuming the worst-case hardness of lattice problems.
Specifically, we give an instance-by-instance reduction from a variant of the bounded distance decoding (BDD) problem on lattices to SLR.
arXiv Detail & Related papers (2024-02-22T15:45:27Z) - One-sided Matrix Completion from Two Observations Per Row [95.87811229292056]
We propose a natural algorithm that involves imputing the missing values of the matrix $XTX$.
We evaluate our algorithm on one-sided recovery of synthetic data and low-coverage genome sequencing.
arXiv Detail & Related papers (2023-06-06T22:35:16Z) - Perturbation Analysis of Randomized SVD and its Applications to
High-dimensional Statistics [8.90202564665576]
We study the statistical properties of RSVD under a general "signal-plus-noise" framework.
We derive nearly-optimal performance guarantees for RSVD when applied to three statistical inference problems.
arXiv Detail & Related papers (2022-03-19T07:26:45Z) - Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $texttr(AR(z))$, for $R(z) = (frac1nXXT- zI_p)-1$ and $Ain mathcal M_p$ deterministic, have a standard deviation of order $O(|A|_* / sqrt n)$.
Here, we show that $|mathbb E[R(z)] - tilde R(z)|_F
arXiv Detail & Related papers (2021-09-06T14:21:43Z) - Non-PSD Matrix Sketching with Applications to Regression and
Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD and square-roots" matrices.
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z) - A Precise Performance Analysis of Support Vector Regression [105.94855998235232]
We study the hard and soft support vector regression techniques applied to a set of $n$ linear measurements.
Our results are then used to optimally tune the parameters intervening in the design of hard and soft support vector regression algorithms.
arXiv Detail & Related papers (2021-05-21T14:26:28Z) - Compressed sensing of low-rank plus sparse matrices [3.8073142980733]
This manuscript develops similar guarantees showing that $mtimes n$ that can be expressed as the sum of a rank-rparse matrix and a $s-sparse matrix can be recovered by computationally tractable methods.
Results are shown for synthetic problems, dynamic-foreground/static separation, multispectral imaging, and Robust PCA.
arXiv Detail & Related papers (2020-07-18T15:36:11Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Asymptotic errors for convex penalized linear regression beyond Gaussian
matrices [23.15629681360836]
We consider the problem of learning a coefficient vector $x_0$ in $RN$ from noisy linear observations.
We provide a rigorous derivation of an explicit formula for the mean squared error.
We show that our predictions agree remarkably well with numerics even for very moderate sizes.
arXiv Detail & Related papers (2020-02-11T13:43:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.