Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
- URL: http://arxiv.org/abs/2404.10201v2
- Date: Thu, 25 Apr 2024 05:09:49 GMT
- Title: Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
- Authors: Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy L. Nguyen, Kunal Talwar, Samson Zhou,
- Abstract summary: We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v(i) inmathbbRd$.
We propose a new multi-message protocol that achieves the optimal error using $tildemathcalOleft(min(nvarepsilon2,d)right)$ messages per user.
- Score: 63.366380571397
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each user to send $\Omega(\min(n\varepsilon^2,d)/\log(n))$ messages, demonstrating the optimality of our message complexity up to logarithmic factors. Additionally, we study the single-message setting and design a protocol that achieves mean squared error $\mathcal{O}(dn^{d/(d+2)}\varepsilon^{-4/(d+2)})$. Moreover, we show that any single-message protocol must incur mean squared error $\Omega(dn^{d/(d+2)})$, showing that our protocol is optimal in the standard setting where $\varepsilon = \Theta(1)$. Finally, we study robustness to malicious users and show that malicious users can incur large additive error with a single shuffler.
Related papers
- Instance-optimal Clipping for Summation Problems in the Shuffle Model of Differential Privacy [36.04370583654189]
We show how a clipping mechanism can achieve an instance-optimal error of $O(max_i x_i cdot loglog U /varepsilon)$.
We also extend our technique to the high-dimensional sum estimation problem and sparse vector aggregation.
arXiv Detail & Related papers (2024-03-15T09:04:00Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $tildeTheta(sd2 + sd/epsilon)$ bits.
For $p in (1,2)$,we obtain an $tildeO(sd2/epsilon + sd/mathrmpoly(epsilon)$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - Privacy Amplification via Compression: Achieving the Optimal
Privacy-Accuracy-Communication Trade-off in Distributed Mean Estimation [20.909302074826666]
Privacy and communication constraints are two major bottlenecks in federated learning (FL) and analytics (FA)
We show that in order to achieve the optimal error under $(varepsilon, delta)$-DP, it is sufficient for each client to send $Thetaleft( n minleft(varepsilon, varepsilon2right)$ bits for FL and $Thetaleft(logleft(minleft(varepsilon, varepsilon2right)$)$ bits for FA
arXiv Detail & Related papers (2023-04-04T05:37:17Z) - Concurrent Shuffle Differential Privacy Under Continual Observation [60.127179363119936]
We study the private continual summation problem (a.k.a. the counter problem)
We show that the concurrent shuffle model allows for significantly improved error compared to a standard (single) shuffle model.
arXiv Detail & Related papers (2023-01-29T20:42:54Z) - Fast, Sample-Efficient, Affine-Invariant Private Mean and Covariance
Estimation for Subgaussian Distributions [8.40077201352607]
We present a fast, differentially private algorithm for high-dimensional covariance-aware mean estimation.
Our algorithm produces $tildemu$ such that $|mu|_Sigma leq alpha$ as long as $n gtrsim tfrac d alpha2 + tfracd sqrtlog 1/deltaalpha varepsilon+fracdlog 1/deltavarepsilon$.
arXiv Detail & Related papers (2023-01-28T16:57:46Z) - Near Sample-Optimal Reduction-based Policy Learning for Average Reward
MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP)
We prove an upper bound of $widetilde O(H varepsilon-3 ln frac1delta)$ samples per state-action pair, where $H := sp(h*)$ is the span of bias of any optimal policy, $varepsilon$ is the accuracy and $delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z) - Tight Bounds for the Randomized and Quantum Communication Complexities
of Equality with Small Error [1.6522364074260811]
We investigate the randomized and quantum communication complexities of the Equality function with small error probability $epsilon$.
We show that any $log(n/epsilon)-loglog(sqrtn/epsilon)+3$ protocol communicates at least $log(n/epsilon)-log(sqrtn/epsilon)-O(1)$ qubits.
arXiv Detail & Related papers (2021-07-25T13:52:42Z) - On Distributed Differential Privacy and Counting Distinct Elements [52.701425652208734]
We study the setup where each of $n$ users holds an element from a discrete set.
The goal is to count the number of distinct elements across all users.
arXiv Detail & Related papers (2020-09-21T04:13:34Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.