Sum-of-squares lower bounds for Non-Gaussian Component Analysis
- URL: http://arxiv.org/abs/2410.21426v1
- Date: Mon, 28 Oct 2024 18:19:13 GMT
- Title: Sum-of-squares lower bounds for Non-Gaussian Component Analysis
- Authors: Ilias Diakonikolas, Sushrut Karmalkar, Shuo Pang, Aaron Potechin
- Abstract summary: Non-Gaussian Component Analysis (NGCA) is the statistical task of finding a non-Gaussian direction in a high-dimensional dataset.
Here we study the complexity of NGCA in the Sum-of-Squares framework.
- Score: 33.80749804695003
- Abstract: Non-Gaussian Component Analysis (NGCA) is the statistical task of finding a non-Gaussian direction in a high-dimensional dataset. Specifically, given i.i.d.\ samples from a distribution $P^A_{v}$ on $\mathbb{R}^n$ that behaves like a known distribution $A$ in a hidden direction $v$ and like a standard Gaussian in the orthogonal complement, the goal is to approximate the hidden direction. The standard formulation posits that the first $k-1$ moments of $A$ match those of the standard Gaussian and the $k$-th moment differs. Under mild assumptions, this problem has sample complexity $O(n)$. On the other hand, all known efficient algorithms require $\Omega(n^{k/2})$ samples. Prior work developed sharp Statistical Query and low-degree testing lower bounds suggesting an information-computation tradeoff for this problem. Here we study the complexity of NGCA in the Sum-of-Squares (SoS) framework. Our main contribution is the first super-constant degree SoS lower bound for NGCA. Specifically, we show that if the non-Gaussian distribution $A$ matches the first $(k-1)$ moments of $\mathcal{N}(0, 1)$ and satisfies other mild conditions, then with fewer than $n^{(1 - \varepsilon)k/2}$ many samples from the normal distribution, with high probability, degree $(\log n)^{{1\over 2}-o_n(1)}$ SoS fails to refute the existence of such a direction $v$. Our result significantly strengthens prior work by establishing a super-polynomial information-computation tradeoff against a broader family of algorithms. As corollaries, we obtain SoS lower bounds for several problems in robust statistics and the learning of mixture models. Our SoS lower bound proof introduces a novel technique, that we believe may be of broader interest, and a number of refinements over existing methods.
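To make the planted model $P^A_{v}$ concrete, here is a minimal sampling sketch (illustrative only, not code from the paper). It assumes, purely for demonstration, that the non-Gaussian distribution $A$ is Rademacher ($\pm 1$), which matches the first three moments of $\mathcal{N}(0, 1)$ but has fourth moment $1$ instead of $3$, i.e. $k = 4$; the helper name `sample_ngca` is hypothetical.

```python
# Minimal sketch of sampling from P^A_v: the data behave like a
# one-dimensional non-Gaussian distribution A along a hidden unit vector v
# and like a standard Gaussian on the orthogonal complement.
# Assumption (not from the paper): A is Rademacher (+/-1), so the first
# three moments match N(0, 1) and the fourth moment is 1 instead of 3 (k = 4).
import numpy as np

def sample_ngca(m, n, v, rng):
    """Draw m samples from P^A_v on R^n for a direction v (normalized inside)."""
    v = v / np.linalg.norm(v)
    g = rng.standard_normal((m, n))          # standard Gaussian in every direction
    a = rng.choice([-1.0, 1.0], size=m)      # non-Gaussian coordinate drawn from A
    # Remove the Gaussian component along v and plant the non-Gaussian one.
    return g - np.outer(g @ v, v) + np.outer(a, v)

rng = np.random.default_rng(0)
n = 100
v = rng.standard_normal(n)
X = sample_ngca(5000, n, v, rng)

# Sanity check: the empirical fourth moment along v is near 1 (Rademacher),
# far from the Gaussian value 3; along other directions it stays near 3.
proj = X @ (v / np.linalg.norm(v))
print(np.mean(proj**4))
```

The final check only confirms that the non-Gaussianity is statistically detectable once the direction is known; recovering $v$ efficiently from roughly $O(n)$ samples is exactly where the conjectured $\Omega(n^{k/2})$ information-computation gap discussed in the abstract appears.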
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z)
- Identification of Mixtures of Discrete Product Distributions in Near-Optimal Sample and Time Complexity [6.812247730094931]
We show, for any $n \geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/\zeta)^{O(k)}$.
We also extend the known lower bound of $e^{\Omega(k)}$ to match our upper bound across a broad range of $\zeta$.
arXiv Detail & Related papers (2023-09-25T09:50:15Z)
- A Spectral Algorithm for List-Decodable Covariance Estimation in Relative Frobenius Norm [41.03423042792559]
We produce a list of hypotheses that are close to $\Sigma$ in relative Frobenius norm.
As a corollary, we obtain an efficient spectral algorithm for robust partial clustering of Gaussian mixture models.
Our new method yields the first Sum-of-Squares-free algorithm for robustly learning GMMs.
arXiv Detail & Related papers (2023-05-01T17:54:35Z)
- Stochastic Approximation Approaches to Group Distributionally Robust Optimization [96.26317627118912]
We study group distributionally robust optimization (GDRO).
We apply online learning techniques to reduce the number of samples required in each round from $m$ to $1$, keeping the same sample complexity.
We also give a novel formulation of weighted GDRO, which allows us to derive distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-02-18T09:24:15Z)
- Gaussian Mean Testing Made Simple [46.03021473600576]
Given i.i.d. samples from a distribution $p$ on $\mathbb{R}^d$, the task is to distinguish, with high probability, between the following cases.
We give an extremely simple algorithm for Gaussian mean testing with a one-page analysis.
arXiv Detail & Related papers (2022-10-25T01:59:13Z)
- Robust Sparse Mean Estimation via Sum of Squares [42.526664955704746]
We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers.
Our algorithms follow the Sum-of-Squares-based proofs-to-algorithms approach.
arXiv Detail & Related papers (2022-06-07T16:49:54Z)
- Non-Gaussian Component Analysis via Lattice Basis Reduction [56.98280399449707]
Non-Gaussian Component Analysis (NGCA) is a distribution learning problem.
We provide an efficient algorithm for NGCA in the regime that $A$ is discrete or nearly discrete.
arXiv Detail & Related papers (2021-12-16T18:38:02Z)
- The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean Gaussian $\mathcal{N}(0, \Sigma)$ with unknown covariance.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma$ ...
arXiv Detail & Related papers (2020-12-31T18:24:41Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is ...
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.