Private Mean Estimation of Heavy-Tailed Distributions
- URL: http://arxiv.org/abs/2002.09464v3
- Date: Tue, 16 Feb 2021 17:06:17 GMT
- Title: Private Mean Estimation of Heavy-Tailed Distributions
- Authors: Gautam Kamath, Vikrant Singhal, Jonathan Ullman
- Abstract summary: We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded $k$-th moments.
We show that $n = \Theta\left(\frac{1}{\alpha^2} + \frac{1}{\alpha^{\frac{k}{k-1}}\varepsilon}\right)$ samples are necessary and sufficient to estimate the mean to $\alpha$-accuracy under $\varepsilon$-differential privacy, or any of its common relaxations.
- Score: 10.176795938619417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We give new upper and lower bounds on the minimax sample complexity of
differentially private mean estimation of distributions with bounded $k$-th
moments. Roughly speaking, in the univariate case, we show that $n =
\Theta\left(\frac{1}{\alpha^2} +
\frac{1}{\alpha^{\frac{k}{k-1}}\varepsilon}\right)$ samples are necessary and
sufficient to estimate the mean to $\alpha$-accuracy under
$\varepsilon$-differential privacy, or any of its common relaxations. This
result demonstrates a qualitatively different behavior compared to estimation
absent privacy constraints, for which the sample complexity is identical for
all $k \geq 2$. We also give algorithms for the multivariate setting whose
sample complexity is a factor of $O(d)$ larger than the univariate case.
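As a rough numerical illustration of the univariate bound stated in the abstract, the sketch below evaluates its two terms separately with all constants dropped. The function name `sample_complexity_terms` and the chosen values of $\alpha$, $\varepsilon$, and $k$ are assumptions made for this example only; the $\Theta(\cdot)$ hides constants, so the outputs show scaling rather than actual sample counts.

```python
from typing import Tuple


def sample_complexity_terms(alpha: float, epsilon: float, k: float) -> Tuple[float, float]:
    """Return (statistical term, privacy term) of the univariate bound.

    statistical term: 1 / alpha^2
    privacy term:     1 / (alpha^(k / (k - 1)) * epsilon)
    Constants hidden by Theta(.) are dropped (illustrative assumption).
    """
    if alpha <= 0 or epsilon <= 0:
        raise ValueError("alpha and epsilon must be positive")
    if k < 2:
        raise ValueError("this sketch assumes k >= 2")
    statistical = 1.0 / alpha**2
    privacy = 1.0 / (alpha ** (k / (k - 1)) * epsilon)
    return statistical, privacy


if __name__ == "__main__":
    alpha, epsilon = 0.1, 1.0  # hypothetical accuracy and privacy parameters
    for k in (2, 3, 4, 10):
        stat, priv = sample_complexity_terms(alpha, epsilon, k)
        print(f"k={k:>2}: statistical term {stat:.1f}, privacy term {priv:.1f}")
```

The statistical term does not depend on $k$, while the privacy term shrinks as $k$ grows because the exponent $k/(k-1)$ decreases from 2 toward 1. This matches the abstract's remark that the private setting behaves qualitatively differently from estimation without privacy constraints.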
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Statistical-Computational Trade-offs for Density Estimation [60.81548752871115]
We show that for a broad class of data structures their bounds cannot be significantly improved.
This is a novel statistical-computational trade-off for density estimation.
arXiv Detail & Related papers (2024-10-30T15:03:33Z) - Private Mean Estimation with Person-Level Differential Privacy [6.621676316292624]
We study person-level differentially private mean estimation in the case where each person holds multiple samples.
We give computationally efficient algorithms under approximate-DP and computationally inefficient algorithms under pure DP, and our nearly matching lower bounds hold for the most permissive case of approximate DP.
arXiv Detail & Related papers (2024-05-30T18:20:35Z) - The Sample Complexity of Simple Binary Hypothesis Testing [7.127829790714167]
The sample complexity of simple binary hypothesis testing is the smallest number of i.i.d. samples required to distinguish between two distributions $p$ and $q$ in either setting.
This problem has only been studied when $\alpha = \beta$ (prior-free) or $\alpha = 1/2$ (Bayesian).
arXiv Detail & Related papers (2024-03-25T17:42:32Z) - Better and Simpler Lower Bounds for Differentially Private Statistical
Estimation [7.693388437377614]
We prove that for any $\alpha \le O(1)$, estimating the covariance of a Gaussian up to spectral error $\alpha$ requires $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}\right)$ samples.
Next, we prove that estimating the mean of a heavy-tailed distribution with bounded $k$th moments requires $\tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} +
arXiv Detail & Related papers (2023-10-10T04:02:43Z) - Stochastic Approximation Approaches to Group Distributionally Robust
Optimization [96.26317627118912]
Group distributionally robust optimization (GDRO)
Online learning techniques to reduce the number of samples required in each round from $m$ to $1$, keeping the same sample.
A novel formulation of weighted GDRO, which allows us to derive distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Covariance-Aware Private Mean Estimation Without Private Covariance Estimation [10.036088581191592]
We present two sample-efficient differentially private mean estimators for $d$-dimensional (sub)Gaussian distributions.
Our estimators output $\tilde{\mu}$ such that $\| \tilde{\mu} - \mu \|_{\Sigma} \leq \alpha$, where $\| \cdot \|_{\Sigma}$ is the Mahalanobis distance.
arXiv Detail & Related papers (2021-06-24T21:40:07Z) - The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean and unknown covariance Gaussian $\mathcal{N}(0, \Sigma)$.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma
arXiv Detail & Related papers (2020-12-31T18:24:41Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP)
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)