List-Decodable Subspace Recovery: Dimension Independent Error in
Polynomial Time
- URL: http://arxiv.org/abs/2002.05139v3
- Date: Thu, 7 Jan 2021 17:54:57 GMT
- Title: List-Decodable Subspace Recovery: Dimension Independent Error in
Polynomial Time
- Authors: Ainesh Bakshi and Pravesh K. Kothari
- Abstract summary: In list-decodable subspace recovery, the input is a collection of $n$ points, $\alpha n$ (for some $\alpha \ll 1/2$) of which are drawn i.i.d. from a distribution $\mathcal{D}$.
In this work, we improve on prior results on all three fronts: \emph{dimension-independent} error via a faster fixed-polynomial running time under less restrictive distributional assumptions.
- Score: 5.812499828391904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In list-decodable subspace recovery, the input is a collection of $n$ points,
$\alpha n$ (for some $\alpha \ll 1/2$) of which are drawn i.i.d. from a
distribution $\mathcal{D}$ with an isotropic rank-$r$ covariance $\Pi_*$ (the
\emph{inliers}), and the rest are arbitrary, potentially adversarial outliers. The
goal is to recover an $O(1/\alpha)$-size list of candidate covariances that
contains a $\hat{\Pi}$ close to $\Pi_*$. Two recent independent works
(Raghavendra-Yau, Bakshi-Kothari 2020) gave the first efficient algorithm for
this problem. These results, however, obtain an error that grows with the
dimension (linearly in [RY] and logarithmically in [BK], at the cost of
quasi-polynomial running time) and rely on \emph{certifiable
anti-concentration}, a relatively strict condition satisfied essentially only
by the Gaussian distribution.
In this work, we improve on these results on all three fronts:
\emph{dimension-independent} error via a faster fixed-polynomial running time
under less restrictive distributional assumptions. Specifically, we give a
$poly(1/\alpha) d^{O(1)}$ time algorithm that outputs a list containing a
$\hat{\Pi}$ satisfying $\|\hat{\Pi} -\Pi_*\|_F \leq O(1/\alpha)$. Our result
only needs $\mathcal{D}$ to have \emph{certifiably hypercontractive} degree 2
polynomials. As a result, in addition to Gaussians, our algorithm applies to
the uniform distribution on the hypercube and $q$-ary cubes, and to arbitrary
product distributions with subgaussian marginals. Prior work (Raghavendra and
Yau, 2020) had identified such distributions as potentially hard examples, since
they do not exhibit strong enough anti-concentration. When $\mathcal{D}$
satisfies certifiable anti-concentration, we obtain a stronger error guarantee
of $\|\hat{\Pi}-\Pi_*\|_F \leq \eta$ for any $\eta > 0$ in
$d^{O(poly(1/\alpha) + \log (1/\eta))}$ time.
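To make the setup concrete, the following is a minimal numpy sketch of the problem described above, not an implementation of the paper's sum-of-squares algorithm: as an illustrative assumption, inliers are drawn from a Gaussian supported on a random $r$-dimensional subspace, so $\Pi_*$ is an isotropic rank-$r$ projector, and the candidate list is a naive placeholder used only to show how the Frobenius-norm list error is evaluated.

import numpy as np

def make_instance(n=2000, d=20, r=5, alpha=0.2, seed=0):
    """alpha*n inliers from a Gaussian on a random r-dim subspace, rest arbitrary outliers."""
    rng = np.random.default_rng(seed)
    # Pi_* = U U^T is an isotropic rank-r covariance (an orthogonal projector).
    U, _ = np.linalg.qr(rng.standard_normal((d, r)))
    Pi_star = U @ U.T
    n_in = int(alpha * n)
    inliers = rng.standard_normal((n_in, r)) @ U.T        # i.i.d. samples from D
    outliers = 10.0 * rng.standard_normal((n - n_in, d))  # arbitrary corruptions
    X = np.vstack([inliers, outliers])
    rng.shuffle(X)                                        # hide which rows are inliers
    return X, Pi_star

def list_error(candidates, Pi_star):
    # A list succeeds if SOME candidate Pi_hat has small ||Pi_hat - Pi_*||_F;
    # the abstract's guarantee is that this minimum is at most O(1/alpha).
    return min(np.linalg.norm(P - Pi_star, ord="fro") for P in candidates)

if __name__ == "__main__":
    X, Pi_star = make_instance()
    r = 5
    # Naive placeholder list: rank-r projectors from top eigenvectors of the
    # second-moment matrices of random subsamples (NOT the paper's algorithm).
    rng = np.random.default_rng(1)
    candidates = []
    for _ in range(4):
        S = X[rng.choice(len(X), size=200, replace=False)]
        _, vecs = np.linalg.eigh(S.T @ S / len(S))        # eigenvalues ascending
        V = vecs[:, -r:]
        candidates.append(V @ V.T)
    print("best Frobenius error over the list:", list_error(candidates, Pi_star))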
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Robust Distribution Learning with Local and Global Adversarial Corruptions [17.22168727622332]
We develop an efficient finite-sample algorithm with error bounded by $\sqrt{\varepsilon k} + \rho + \tilde{O}(d\sqrt{k}\,n^{-1/(k \lor 2)})$ when $P$ has bounded covariance.
Our efficient procedure relies on a novel trace norm approximation of an ideal yet intractable 2-Wasserstein projection estimator.
arXiv Detail & Related papers (2024-06-10T17:48:36Z) - Estimation and Inference in Distributional Reinforcement Learning [28.253677740976197]
We show that a dataset of size $\widetilde{O}\left(\frac{|\mathcal{S}||\mathcal{A}|}{\epsilon^2 (1-\gamma)^4}\right)$ suffices to ensure that the Kolmogorov metric and total variation metric between $\hat\eta^\pi$ and $\eta^\pi$ are below $\epsilon$ with high probability.
Our findings give rise to a unified approach to statistical inference for a wide class of statistical functionals of $\eta^\pi$.
arXiv Detail & Related papers (2023-09-29T14:14:53Z) - Near Sample-Optimal Reduction-based Policy Learning for Average Reward
MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $\varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP).
We prove an upper bound of $\widetilde{O}(H \varepsilon^{-3} \ln \frac{1}{\delta})$ samples per state-action pair, where $H := \mathrm{sp}(h^*)$ is the span of the bias of any optimal policy, $\varepsilon$ is the accuracy, and $\delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z) - A spectral least-squares-type method for heavy-tailed corrupted
regression with unknown covariance \& heterogeneous noise [2.019622939313173]
We revisit heavy-tailed corrupted least-squares linear regression, assuming access to a corrupted label-feature sample of size $n$ with at most $\epsilon n$ arbitrary outliers.
We propose a near-optimal, computationally tractable estimator based on the power method, assuming no knowledge of $(\Sigma, \Xi)$ nor the operator norm of $\Xi$.
arXiv Detail & Related papers (2022-09-06T23:37:31Z) - List-Decodable Covariance Estimation [1.9290392443571387]
We give the first polynomial-time algorithm for \emph{list-decodable covariance estimation}.
Our result implies the first polynomial-time \emph{exact} algorithm for list-decodable linear regression and subspace recovery.
arXiv Detail & Related papers (2022-06-22T09:38:06Z) - Faster Sampling from Log-Concave Distributions over Polytopes via a
Soft-Threshold Dikin Walk [28.431572772564518]
We consider the problem of sampling from a $d$-dimensional log-concave distribution $\pi(\theta) \propto e^{-f(\theta)}$ constrained to a polytope $K$ defined by $m$ inequalities.
Our main result is a "soft-warm" variant of the Dikin walk Markov chain that requires at most $O((md + d L^2 R^2) \times md^{\omega-1}) \log(\frac{w}{\delta})$ arithmetic operations to sample from $\pi$.
arXiv Detail & Related papers (2022-06-19T11:33:07Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ achieving $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n$, $d$, and $\delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Near-Optimal SQ Lower Bounds for Agnostically Learning Halfspaces and
ReLUs under Gaussian Marginals [49.60752558064027]
We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals.
Our lower bounds provide strong evidence that current upper bounds for these tasks are essentially best possible.
arXiv Detail & Related papers (2020-06-29T17:10:10Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.