Almost Instance-optimal Clipping for Summation Problems in the Shuffle Model of Differential Privacy
- URL: http://arxiv.org/abs/2403.10116v2
- Date: Fri, 30 Aug 2024 15:16:26 GMT
- Title: Almost Instance-optimal Clipping for Summation Problems in the Shuffle Model of Differential Privacy
- Authors: Wei Dong, Qiyao Luo, Giulia Fanti, Elaine Shi, Ke Yi,
- Abstract summary: We show how the clipping mechanism can achieve an instance-optimal error of $O(max_i x_i cdot loglog U /varepsilon)$.
We also extend our technique to the high-dimensional sum estimation problem and sparse vector aggregation.
- Score: 36.04370583654189
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentially private mechanisms achieving worst-case optimal error bounds (e.g., the classical Laplace mechanism) are well-studied in the literature. However, when typical data are far from the worst case, \emph{instance-specific} error bounds -- which depend on the largest value in the dataset -- are more meaningful. For example, consider the sum estimation problem, where each user has an integer $x_i$ from the domain $\{0,1,\dots,U\}$ and we wish to estimate $\sum_i x_i$. This has a worst-case optimal error of $O(U/\varepsilon)$, while recent work has shown that the clipping mechanism can achieve an instance-optimal error of $O(\max_i x_i \cdot \log\log U /\varepsilon)$. Under the shuffle model, known instance-optimal protocols are less communication-efficient. The clipping mechanism also works in the shuffle model, but requires two rounds: Round one finds the clipping threshold, and round two does the clipping and computes the noisy sum of the clipped data. In this paper, we show how these two seemingly sequential steps can be done simultaneously in one round using just $1+o(1)$ messages per user, while maintaining the instance-optimal error bound. We also extend our technique to the high-dimensional sum estimation problem and sparse vector aggregation (a.k.a. frequency estimation under user-level differential privacy).
Related papers
- Statistical-Computational Trade-offs for Density Estimation [60.81548752871115]
We show that for a broad class of data structures their bounds cannot be significantly improved.
This is a novel emphstatistical-computational trade-off for density estimation.
arXiv Detail & Related papers (2024-10-30T15:03:33Z) - Better Locally Private Sparse Estimation Given Multiple Samples Per User [2.9562742331218725]
We investigate user-level locally differentially private sparse linear regression.
We show that with $n$ users each contributing $m$ samples, the linear dependency of dimension $d$ can be eliminated.
We propose a framework that first selects candidate variables and then conducts estimation in the narrowed low-dimensional space.
arXiv Detail & Related papers (2024-08-08T08:47:20Z) - Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages [63.366380571397]
We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v(i) inmathbbRd$.
We propose a new multi-message protocol that achieves the optimal error using $tildemathcalOleft(min(nvarepsilon2,d)right)$ messages per user.
arXiv Detail & Related papers (2024-04-16T00:56:36Z) - Near-Optimal Non-Convex Stochastic Optimization under Generalized
Smoothness [21.865728815935665]
Two recent works established the $O(epsilon-3)$ sample complexity to obtain an $O(epsilon)$-stationary point.
However, both require a large batch size on the order of $mathrmploy(epsilon-1)$, which is not only computationally burdensome but also unsuitable for streaming applications.
In this work, we solve the prior two problems simultaneously by revisiting a simple variant of the STORM algorithm.
arXiv Detail & Related papers (2023-02-13T00:22:28Z) - Concurrent Shuffle Differential Privacy Under Continual Observation [60.127179363119936]
We study the private continual summation problem (a.k.a. the counter problem)
We show that the concurrent shuffle model allows for significantly improved error compared to a standard (single) shuffle model.
arXiv Detail & Related papers (2023-01-29T20:42:54Z) - Localization in 1D non-parametric latent space models from pairwise
affinities [6.982738885923206]
We consider the problem of estimating latent positions in a one-dimensional torus from pairwise affinities.
We introduce an estimation procedure that provably localizes all the latent positions with a maximum error of the order of $sqrtlog(n)/n$, with high-probability.
arXiv Detail & Related papers (2021-08-06T13:05:30Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors the optimal excess population loss of any $(varepsilon,delta)$-differently private is $sqrtlog(d)/n + sqrtd/varepsilon n.$
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $sqrtlog(d)/n + (log(d)/varepsilon n)2/3.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Differentially private $k$-means clustering via exponential mechanism
and max cover [6.736814259597673]
We introduce a new $(epsilon_p, delta_p)$-differentially private algorithm for the $k$-means clustering problem.
arXiv Detail & Related papers (2020-09-02T17:52:54Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.