Robust Scatter Matrix Estimation for Elliptical Distributions in Polynomial Time
- URL: http://arxiv.org/abs/2502.06564v1
- Date: Mon, 10 Feb 2025 15:31:57 GMT
- Title: Robust Scatter Matrix Estimation for Elliptical Distributions in Polynomial Time
- Authors: Gleb Novikov
- Abstract summary: We design polynomial-time algorithms that achieve dimension-independent error in Frobenius norm. Under a mild assumption on the eigenvalues of the scatter matrix $\Sigma$, for every $t \in \mathbb{N}$, we design an estimator that, given $n = d^{O(t)}$ samples, in time $n^{O(t)}$ finds $\hat{\Sigma}$.
- Score: 2.311583680973075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of computationally efficient robust estimation of scatter matrices of elliptical distributions under the strong contamination model. We design polynomial time algorithms that achieve dimension-independent error in Frobenius norm. Our first result is a sequence of efficient algorithms that approaches nearly optimal error. Specifically, under a mild assumption on the eigenvalues of the scatter matrix $\Sigma$, for every $t \in \mathbb{N}$, we design an estimator that, given $n = d^{O(t)}$ samples, in time $n^{O(t)}$ finds $\hat{\Sigma}$ such that $ \Vert{\Sigma^{-1/2}\, ({\hat{\Sigma} - \Sigma})\, \Sigma^{-1/2}}\Vert_{\text{F}} \le O(t \cdot \varepsilon^{1-\frac{1}{t}})$, where $\varepsilon$ is the fraction of corruption. We do not require any assumptions on the moments of the distribution, while all previously known computationally efficient algorithms for robust covariance/scatter estimation with dimension-independent error rely on strong assumptions on the moments, such as sub-Gaussianity or (certifiable) hypercontractivity. Furthermore, under a stronger assumption on the eigenvalues of $\Sigma$ (that, in particular, is satisfied by all matrices with constant condition number), we provide a fast (sub-quadratic in the input size) algorithm that, given nearly optimal number of samples $n = \tilde{O}(d^2/\varepsilon)$, in time $\tilde{O}({nd^2 poly(1/\varepsilon)})$ finds $\hat{\Sigma}$ such that $\Vert\hat{\Sigma} - \Sigma\Vert_{\text{F}} \le O(\Vert{\Sigma}\Vert \cdot \sqrt{\varepsilon})$. Our approach is based on robust covariance estimation of the spatial sign (the projection onto the sphere of radius $\sqrt{d}$) of elliptical distributions.
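The last sentence of the abstract describes the key preprocessing idea: project each (centered) sample onto the sphere of radius $\sqrt{d}$ and work with the covariance of these spatial signs. The snippet below is a minimal sketch of that idea, assuming NumPy; the function names are hypothetical and the plain sample-covariance step is only a stand-in for the paper's robust (SoS-based or filtering-style) estimators, which also include a further step to recover $\Sigma$ from the spatial-sign covariance.

```python
# Minimal sketch, not the paper's algorithm: illustrates the spatial-sign
# preprocessing described in the abstract (projecting centered samples onto
# the sphere of radius sqrt(d)) followed by a placeholder covariance step.
import numpy as np


def spatial_sign(X, center=None):
    """Map each row x to sqrt(d) * (x - center) / ||x - center||_2."""
    n, d = X.shape
    if center is None:
        # A robust location estimate is assumed here; the coordinatewise
        # median is a simple illustrative choice, not prescribed by the paper.
        center = np.median(X, axis=0)
    Z = X - center
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # guard against zero rows
    return np.sqrt(d) * Z / norms


def estimate_scatter_sketch(X):
    """Hedged end-to-end sketch: covariance of the spatial signs.

    The paper applies dedicated robust covariance estimators to the spatial
    signs and then recovers the scatter matrix Sigma from the result; the
    plain sample covariance below is only a stand-in for that robust step.
    """
    S = spatial_sign(X)
    return (S.T @ S) / S.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 5, 20_000
    A = rng.normal(size=(d, d))
    Sigma = A @ A.T  # ground-truth scatter matrix for a toy elliptical sample
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    print(np.round(estimate_scatter_sketch(X), 2))
```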
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Which exceptional low-dimensional projections of a Gaussian point cloud can be found in polynomial time? [8.74634652691576]
We study the subset $\mathscr{F}_{m,\alpha}$ of distributions that can be realized by a class of iterative algorithms.
Non-rigorous methods from statistical physics yield an indirect characterization of $\mathscr{F}_{m,\alpha}$ in terms of a generalized Parisi formula.
arXiv Detail & Related papers (2024-06-05T05:54:56Z) - A Sub-Quadratic Time Algorithm for Robust Sparse Mean Estimation [6.853165736531941]
We study the algorithmic problem of sparse mean estimation in the presence of adversarial outliers.
Our main contribution is an algorithm for robust sparse mean estimation which runs in subquadratic time using $\mathrm{poly}(k, \log d, 1/\epsilon)$ samples.
arXiv Detail & Related papers (2024-03-07T18:23:51Z) - Efficient Sampling of Stochastic Differential Equations with Positive
Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a stochastic differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^2 d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z) - Robust Mean Estimation Without Moments for Symmetric Distributions [7.105512316884493]
We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently.
We propose a sequence of efficient algorithms that approaches this optimal error.
Our algorithms are based on a generalization of the well-known filtering technique.
arXiv Detail & Related papers (2023-02-21T17:52:23Z) - Fast, Sample-Efficient, Affine-Invariant Private Mean and Covariance
Estimation for Subgaussian Distributions [8.40077201352607]
We present a fast, differentially private algorithm for high-dimensional covariance-aware mean estimation.
Our algorithm produces $\tilde{\mu}$ such that $\|\mu - \tilde{\mu}\|_{\Sigma} \leq \alpha$ as long as $n \gtrsim \tfrac{d}{\alpha^2} + \tfrac{d\sqrt{\log 1/\delta}}{\alpha\varepsilon} + \tfrac{d \log 1/\delta}{\varepsilon}$.
arXiv Detail & Related papers (2023-01-28T16:57:46Z) - Statistically Optimal Robust Mean and Covariance Estimation for
Anisotropic Gaussians [3.5788754401889014]
In the strong $\varepsilon$-contamination model we assume that the adversary replaced an $\varepsilon$ fraction of vectors in the original Gaussian sample by any other vectors.
We construct an estimator $\widehat{\Sigma}$ of the covariance matrix $\Sigma$ that satisfies, with probability at least $1 - \delta$.
arXiv Detail & Related papers (2023-01-21T23:28:55Z) - A Fast Algorithm for Adaptive Private Mean Estimation [5.090363690988394]
We design an $(\varepsilon, \delta)$-differentially private algorithm that is adaptive to $\Sigma$.
The estimator achieves optimal rates of convergence with respect to the induced Mahalanobis norm $\|\cdot\|_\Sigma$.
arXiv Detail & Related papers (2023-01-17T18:44:41Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Robust Sparse Mean Estimation via Sum of Squares [42.526664955704746]
We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers.
Our algorithms follow the Sum-of-Squares based, proofs-to-algorithms approach.
arXiv Detail & Related papers (2022-06-07T16:49:54Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random features technique, called Ternary Random Features (TRF).
The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean and unknown covariance Gaussian $\mathcal{N}(0, \Sigma)$.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma$
arXiv Detail & Related papers (2020-12-31T18:24:41Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication
Time [14.990725929840892]
We show an algorithm that runs in time $\widetilde{O}(T(N, d) \log \kappa / \mathrm{poly}(\varepsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose.
Our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors.
arXiv Detail & Related papers (2020-06-23T20:21:27Z) - Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing a $n \times n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y) = -\log \langle \varphi(x), \varphi(y) \rangle$ where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$, with $r \ll n$.
arXiv Detail & Related papers (2020-06-12T10:21:40Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.