General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation
- URL: http://arxiv.org/abs/2301.13850v2
- Date: Wed, 20 Dec 2023 19:34:30 GMT
- Title: General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation
- Authors: Aleksandar Nikolov and Haohua Tang
- Abstract summary: A classical approach to private mean estimation is to compute the true mean and add unbiased, but possibly correlated, Gaussian noise to it.
We show that for every input dataset, an unbiased mean estimator satisfying concentrated differential privacy introduces approximately at least as much error as the best Gaussian noise mechanism.
- Score: 58.03500081540042
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate unbiased high-dimensional mean estimators in differential
privacy. We consider differentially private mechanisms whose expected output
equals the mean of the input dataset, for every dataset drawn from a fixed
bounded $d$-dimensional domain $K$. A classical approach to private mean
estimation is to compute the true mean and add unbiased, but possibly
correlated, Gaussian noise to it. In the first part of this paper, we study the
optimal error achievable by a Gaussian noise mechanism for a given domain $K$
when the error is measured in the $\ell_p$ norm for some $p \ge 2$. We give
algorithms that compute the optimal covariance for the Gaussian noise for a
given $K$ under suitable assumptions, and prove a number of nice geometric
properties of the optimal error. These results generalize the theory of
factorization mechanisms from domains $K$ that are symmetric and finite (or,
equivalently, symmetric polytopes) to arbitrary bounded domains.
In the second part of the paper we show that Gaussian noise mechanisms
achieve nearly optimal error among all private unbiased mean estimation
mechanisms in a very strong sense. In particular, for every input dataset, an
unbiased mean estimator satisfying concentrated differential privacy introduces
approximately at least as much error as the best Gaussian noise mechanism. We
extend this result to local differential privacy, and to approximate
differential privacy, but for the latter the error lower bound holds either for
a dataset or for a neighboring dataset, and this relaxation is necessary.
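For concreteness, here is a minimal sketch of the classical approach described above: compute the true mean and add unbiased Gaussian noise calibrated to the privacy budget. It assumes the domain $K$ is the unit $\ell_2$ ball, uses isotropic noise under $\rho$-zCDP (concentrated differential privacy), and omits the covariance optimization that is the subject of the paper; the function and parameter names are illustrative.

```python
import numpy as np

def gaussian_mean_mechanism(data: np.ndarray, rho: float,
                            rng: np.random.Generator) -> np.ndarray:
    """Unbiased private mean of the rows of `data` under rho-zCDP.

    Assumes every row lies in the unit l2 ball, so replacing one of the
    n rows moves the mean by at most 2/n in l2 (the l2-sensitivity).
    Isotropic Gaussian noise with sigma = sensitivity / sqrt(2 * rho)
    satisfies rho-zCDP, and because the noise has mean zero the
    estimator is unbiased for every dataset.
    """
    n, d = data.shape
    sensitivity = 2.0 / n
    sigma = sensitivity / np.sqrt(2.0 * rho)
    return data.mean(axis=0) + rng.normal(scale=sigma, size=d)

# Example: 1000 points projected into the unit ball of R^10, rho = 0.5.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 10))
x /= np.maximum(1.0, np.linalg.norm(x, axis=1, keepdims=True))
print(gaussian_mean_mechanism(x, rho=0.5, rng=rng))
```

The paper's first part studies what replaces the isotropic covariance $\sigma^2 I$ here: the optimal, possibly correlated, Gaussian covariance for a general domain $K$, with error measured in an $\ell_p$ norm.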
Related papers
- Some Constructions of Private, Efficient, and Optimal $K$-Norm and Elliptic Gaussian Noise [54.34628844260993]
Differentially private computation often begins with a bound on some $d$-dimensional statistic's sensitivity.
For pure differential privacy, the $K$-norm mechanism can improve on this approach using a norm tailored to the statistic's sensitivity space.
This paper solves both problems for the simple statistics of sum, count, and vote.
arXiv Detail & Related papers (2023-09-27T17:09:36Z)
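As a hedged illustration of the $K$-norm mechanism, the sketch below handles the one case where sampling is straightforward: the sensitivity space $K$ is the unit $\ell_2$ ball, so the noise density is proportional to $\exp(-\varepsilon \|z\|_2)$ and its radial marginal in $\mathbb{R}^d$ is Gamma$(d, 1/\varepsilon)$. Sampling for general bodies $K$, which the paper carries out for sum, count, and vote, is the hard part and is not shown.

```python
import numpy as np

def k_norm_mechanism_l2(stat: np.ndarray, eps: float,
                        rng: np.random.Generator) -> np.ndarray:
    """Pure eps-DP K-norm mechanism, special case K = unit l2 ball.

    Assumes neighboring datasets change `stat` by a vector of l2 norm
    at most 1. The K-norm noise density exp(-eps * ||z||_2) factors
    into a uniform direction on the sphere and a radius whose density
    is proportional to r^(d-1) * exp(-eps * r), i.e. Gamma(d, 1/eps).
    """
    d = stat.shape[0]
    radius = rng.gamma(shape=d, scale=1.0 / eps)
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    return stat + radius * direction

rng = np.random.default_rng(1)
print(k_norm_mechanism_l2(np.zeros(5), eps=1.0, rng=rng))
```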
- Private Covariance Approximation and Eigenvalue-Gap Bounds for Complex Gaussian Perturbations [28.431572772564518]
We show that the Frobenius norm of the difference between the matrix output by this mechanism and the best rank-$k$ approximation to $M$ is bounded by roughly $\tilde{O}(\sqrt{kd})$.
This improves on previous work, which required the gap between every pair of top-$k$ eigenvalues of $M$ to be at least $\sqrt{d}$ for a similar bound.
arXiv Detail & Related papers (2023-06-29T03:18:53Z)
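The sketch below is a schematic of the perturbation analyzed in that paper, not its exact mechanism: add a Hermitian complex Gaussian matrix to a symmetric input $M$ and return the best rank-$k$ approximation of the result. The noise scale `sigma` is assumed to be calibrated to the privacy budget elsewhere.

```python
import numpy as np

def perturbed_rank_k_approx(M: np.ndarray, k: int, sigma: float,
                            rng: np.random.Generator) -> np.ndarray:
    """Rank-k approximation of M after a Hermitian complex Gaussian
    (GUE-style) perturbation; sigma is assumed privacy-calibrated."""
    d = M.shape[0]
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    H = (G + G.conj().T) / 2                       # Hermitian noise matrix
    vals, vecs = np.linalg.eigh(M + sigma * H)     # eigenvalues ascending
    top_vals, top_vecs = vals[-k:], vecs[:, -k:]   # top-k eigenpairs
    return (top_vecs * top_vals) @ top_vecs.conj().T
```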
- Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should equal the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation, which casts this self-supervised task as the problem of estimating an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z)
- Differentially private multivariate medians [4.588028371034407]
We develop novel finite-sample performance guarantees for differentially private depth-based medians.
We show that under Cauchy marginals, the cost of heavy-tailed location estimation outweighs the cost of privacy.
arXiv Detail & Related papers (2022-10-12T17:56:04Z)
- Optimal Algorithms for Mean Estimation under Local Differential Privacy [55.32262879188817]
We show that PrivUnit achieves the optimal variance among a large family of locally private randomizers.
We also develop a new variant of PrivUnit based on the Gaussian distribution which is more amenable to mathematical analysis and enjoys the same optimality guarantees.
arXiv Detail & Related papers (2022-05-05T06:43:46Z)
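PrivUnit itself samples from spherical caps; as a simpler stand-in, here is a minimal locally private Gaussian randomizer in the spirit of the Gaussian variant mentioned above (an illustrative assumption, not the paper's algorithm). Each user holding a vector in the unit $\ell_2$ ball releases one noisy report, and the server averages the unbiased reports to estimate the mean.

```python
import numpy as np

def local_gaussian_report(x: np.ndarray, rho: float,
                          rng: np.random.Generator) -> np.ndarray:
    """One user's report under rho-zCDP in the local model (a sketch).

    For inputs in the unit l2 ball the local l2-sensitivity is 2
    (any two inputs differ by at most 2 in l2), so Gaussian noise with
    sigma = 2 / sqrt(2 * rho) protects each user's entire vector.
    """
    sigma = 2.0 / np.sqrt(2.0 * rho)
    return x + rng.normal(scale=sigma, size=x.shape)

# Server side: the average of the n reports is an unbiased mean estimate.
```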
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the widely held assumption that the optimal noise distribution equals the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's, and may even come from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD [73.55632827932101]
We optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD.
We prove that, under a constraint that guarantees low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance.
arXiv Detail & Related papers (2021-10-26T15:02:27Z)
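A small sketch of the stated characterization: the bound-optimal SGLD noise covariance is the matrix square root of the gradient covariance, estimated here from a sample of per-example gradients. The helper name and the empirical estimate are illustrative assumptions.

```python
import numpy as np

def sqrt_gradient_covariance(grads: np.ndarray) -> np.ndarray:
    """Symmetric PSD square root of the empirical gradient covariance.

    `grads` holds one per-example gradient per row; the claim above is
    that SGLD noise with this covariance optimizes the
    information-theoretic generalization bound, subject to the
    low-empirical-risk constraint.
    """
    cov = np.cov(grads, rowvar=False)        # d x d gradient covariance
    vals, vecs = np.linalg.eigh(cov)
    vals = np.clip(vals, 0.0, None)          # guard tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T
```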
- Covariance-Aware Private Mean Estimation Without Private Covariance Estimation [10.036088581191592]
We present two sample-efficient differentially private mean estimators for $d$-dimensional (sub)Gaussian distributions.
Our estimators output $\tilde{\mu}$ such that $\|\tilde{\mu} - \mu\|_{\Sigma} \leq \alpha$, where $\|\cdot\|_{\Sigma}$ is the Mahalanobis distance.
arXiv Detail & Related papers (2021-06-24T21:40:07Z)
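To make that guarantee concrete, the Mahalanobis error norm it uses can be computed as follows; this is a definition sketch, not the estimator itself.

```python
import numpy as np

def mahalanobis_error(mu_hat: np.ndarray, mu: np.ndarray,
                      Sigma: np.ndarray) -> float:
    """||mu_hat - mu||_Sigma = sqrt((mu_hat - mu)^T Sigma^{-1} (mu_hat - mu)),
    i.e. estimation error measured in units of the distribution's own
    covariance rather than in a fixed norm."""
    diff = mu_hat - mu
    return float(np.sqrt(diff @ np.linalg.solve(Sigma, diff)))
```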
- Strongly universally consistent nonparametric regression and classification with privatised data [2.879036956042183]
We revisit the classical problem of nonparametric regression, but impose local differential privacy constraints.
We design a novel estimator of the regression function, which can be viewed as a privatised version of the well-studied partitioning regression estimator.
arXiv Detail & Related papers (2020-10-31T09:00:43Z)
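In the spirit of that estimator, here is a minimal locally private partitioning-regression sketch for univariate $X \in [0,1]$ and responses in $[-1,1]$; the binning, noise scales, and names are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np

def privatised_partitioning_regression(x, y, eps, n_bins, rng):
    """eps-locally-private partitioning regression on [0, 1] (a sketch).

    Each user maps x_i to a bin and releases two noisy vectors: a one-hot
    bin indicator scaled by y_i (numerator) and a plain one-hot indicator
    (denominator). With |y_i| <= 1 each vector has l1 sensitivity at most
    2, so Laplace noise of scale 4/eps per vector spends eps/2 on each
    release and eps per user in total. The server sums the reports and
    returns the per-bin ratio, a noisy estimate of E[Y | X in bin].
    """
    bins = np.minimum((np.asarray(x) * n_bins).astype(int), n_bins - 1)
    num = np.zeros(n_bins)
    den = np.zeros(n_bins)
    for b, yi in zip(bins, y):
        one_hot = np.zeros(n_bins)
        one_hot[b] = 1.0
        num += one_hot * yi + rng.laplace(scale=4.0 / eps, size=n_bins)
        den += one_hot + rng.laplace(scale=4.0 / eps, size=n_bins)
    return num / np.maximum(den, 1.0)
```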
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.