Related papers: Robust Mean Estimation Without Moments for Symmetric Distributions

Robust Mean Estimation Without Moments for Symmetric Distributions

URL: http://arxiv.org/abs/2302.10844v2
Date: Wed, 8 Nov 2023 18:49:42 GMT
Title: Robust Mean Estimation Without Moments for Symmetric Distributions
Authors: Gleb Novikov, David Steurer, Stefan Tiegel
Abstract summary: We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently. We propose a sequence of efficient algorithms that approaches this optimal error. Our algorithms are based on a generalization of the well-known filtering technique.
Score: 7.105512316884493
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the problem of robustly estimating the mean or location parameter without moment assumptions. We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently. The distributions we study include products of arbitrary symmetric one-dimensional distributions, such as product Cauchy distributions, as well as elliptical distributions. For product distributions and elliptical distributions with known scatter (covariance) matrix, we show that given an $\varepsilon$-corrupted sample, we can with probability at least $1-\delta$ estimate its location up to error $O(\varepsilon \sqrt{\log(1/\varepsilon)})$ using $\tfrac{d\log(d) + \log(1/\delta)}{\varepsilon^2 \log(1/\varepsilon)}$ samples. This result matches the best-known guarantees for the Gaussian distribution and known SQ lower bounds (up to the $\log(d)$ factor). For elliptical distributions with unknown scatter (covariance) matrix, we propose a sequence of efficient algorithms that approaches this optimal error. Specifically, for every $k \in \mathbb{N}$, we design an estimator using time and samples $\tilde{O}({d^k})$ achieving error $O(\varepsilon^{1-\frac{1}{2k}})$. This matches the error and running time guarantees when assuming certifiably bounded moments of order up to $k$. For unknown covariance, such error bounds of $o(\sqrt{\varepsilon})$ are not even known for (general) sub-Gaussian distributions. Our algorithms are based on a generalization of the well-known filtering technique. We show how this machinery can be combined with Huber-loss-based techniques to work with projections of the noise that behave more nicely than the initial noise. Moreover, we show how SoS proofs can be used to obtain algorithmic guarantees even for distributions without a first moment. We believe that this approach may find other applications in future works.

Related papers

Nearly Optimal Robust Covariance and Scatter Matrix Estimation Beyond Gaussians [2.311583680973075]
We study the problem of computationally efficient robust estimation of the covariance/scatter matrix of elliptical distributions. We obtain the first efficiently computable, nearly optimal robust covariance estimators that extend beyond the Gaussian case.
arXiv Detail & Related papers (2025-02-10T15:31:57Z)
On the query complexity of sampling from non-log-concave distributions [2.4253233571593547]
We study the problem of sampling from a $d$-dimensional distribution with density $p(x)propto e-f(x)$, which does not necessarily satisfy good isoperimetric conditions.<n>We show that for a wide range of parameters, sampling is strictly easier than optimization by a super-exponential factor in the dimension $d$.
arXiv Detail & Related papers (2025-02-10T06:54:16Z)
Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $mathRd suffer from a curse of dimensionality. We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z)
Robust Sparse Regression with Non-Isotropic Designs [4.964650614497048]
We develop a technique to design efficiently computable estimators for sparse linear regression in the simultaneous presence of two adversaries. We provide a novel analysis of weighted penalized Huber loss that is suitable for heavy-tailed designs in the presence of two adversaries.
arXiv Detail & Related papers (2024-10-31T13:51:59Z)
Sum-of-squares lower bounds for Non-Gaussian Component Analysis [33.80749804695003]
Non-Gaussian Component Analysis (NGCA) is the statistical task of finding a non-Gaussian direction in a high-dimensional dataset. Here we study the complexity of NGCA in the Sum-of-Squares framework.
arXiv Detail & Related papers (2024-10-28T18:19:13Z)
$L^1$ Estimation: On the Optimality of Linear Estimators [64.76492306585168]
This work shows that the only prior distribution on $X$ that induces linearity in the conditional median is Gaussian. In particular, it is demonstrated that if the conditional distribution $P_X|Y=y$ is symmetric for all $y$, then $X$ must follow a Gaussian distribution.
arXiv Detail & Related papers (2023-09-17T01:45:13Z)
Fast, Sample-Efficient, Affine-Invariant Private Mean and Covariance Estimation for Subgaussian Distributions [8.40077201352607]
We present a fast, differentially private algorithm for high-dimensional covariance-aware mean estimation. Our algorithm produces $tildemu$ such that $|mu|_Sigma leq alpha$ as long as $n gtrsim tfrac d alpha2 + tfracd sqrtlog 1/deltaalpha varepsilon+fracdlog 1/deltavarepsilon$.
arXiv Detail & Related papers (2023-01-28T16:57:46Z)
Robust Sparse Mean Estimation via Sum of Squares [42.526664955704746]
We study the problem of high-dimensional sparse mean estimation in the presence of an $epsilon$-fraction of adversarial outliers. Our algorithms follow the Sum-of-Squares based, to algorithms approach.
arXiv Detail & Related papers (2022-06-07T16:49:54Z)
Optimal Sublinear Sampling of Spanning Trees and Determinantal Point Processes via Average-Case Entropic Independence [3.9586758145580014]
We design fast algorithms for repeatedly sampling from strongly Rayleigh distributions. For a graph $G=(V, E)$, we show how to approximately sample uniformly random spanning trees from $G$ in $widetildeO(lvert Vrvert)$ time per sample. For a determinantal point process on subsets of size $k$ of a ground set of $n$ elements, we show how to approximately sample in $widetildeO(komega)$ time after an initial $widetildeO(nk
arXiv Detail & Related papers (2022-04-06T04:11:26Z)
Random quantum circuits transform local noise into global white noise [118.18170052022323]
We study the distribution over measurement outcomes of noisy random quantum circuits in the low-fidelity regime. For local noise that is sufficiently weak and unital, correlations (measured by the linear cross-entropy benchmark) between the output distribution $p_textnoisy$ of a generic noisy circuit instance shrink exponentially. If the noise is incoherent, the output distribution approaches the uniform distribution $p_textunif$ at precisely the same rate.
arXiv Detail & Related papers (2021-11-29T19:26:28Z)
Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
We show that the eigenspectrum of $bf K$ is independent of the distribution of the i.i.d. entries of $bf w$. We propose a novel random technique, called Ternary Random Feature (TRF) The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z)
Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist. We design an estimator which attains the smallest possible confidence interval as a function of $n,d,delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$ We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing a $ntimes n$ kernel matrix that is repeatedly applied to a vector. We propose to use instead ground costs of the form $c(x,y)=-logdotpvarphi(x)varphi(y)$ where $varphi$ is a map from the ground space onto the positive orthant $RRr_+$, with $rll n$.
arXiv Detail & Related papers (2020-06-12T10:21:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.