Privacy Amplification via Shuffling: Unified, Simplified, and Tightened
- URL: http://arxiv.org/abs/2304.05007v5
- Date: Mon, 29 Jul 2024 01:12:04 GMT
- Title: Privacy Amplification via Shuffling: Unified, Simplified, and Tightened
- Authors: Shaowei Wang, Yun Peng, Jin Li, Zikai Wen, Zhipeng Li, Shiyu Yu, Di Wang, Wei Yang
- Abstract summary: We propose a comprehensive framework for privacy amplification in both single-message and multi-message shuffle protocols.
Our theoretical results demonstrate that our framework provides tighter bounds, especially for local randomizers with extremal probability design.
Our bounds also result in a remarkably efficient $\tilde{O}(n)$ algorithm that numerically amplifies privacy in less than $10$ seconds for $n=10^8$ users.
- Score: 20.10078781197001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The shuffle model of differential privacy provides promising privacy-utility balances in decentralized, privacy-preserving data analysis. However, the current analyses of privacy amplification via shuffling lack both tightness and generality. To address this issue, we propose the \emph{variation-ratio reduction} as a comprehensive framework for privacy amplification in both single-message and multi-message shuffle protocols. It leverages two new parameterizations: the total variation bounds of local messages and the probability ratio bounds of blanket messages, to determine indistinguishability levels. Our theoretical results demonstrate that our framework provides tighter bounds, especially for local randomizers with extremal probability design, where our bounds are exactly tight. Additionally, variation-ratio reduction complements parallel composition in the shuffle model, yielding enhanced privacy accounting for popular sampling-based randomizers employed in statistical queries (e.g., range queries, marginal queries, and frequent itemset mining). Empirical findings demonstrate that our numerical amplification bounds surpass existing ones, conserving up to $30\%$ of the budget for single-message protocols, $75\%$ for multi-message ones, and a striking $75\%$-$95\%$ for parallel composition. Our bounds also result in a remarkably efficient $\tilde{O}(n)$ algorithm that numerically amplifies privacy in less than $10$ seconds for $n=10^8$ users.
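For a sense of what such numerical amplification bounds compute, here is a minimal sketch of the closed-form baseline from the clones-style analysis ("Hiding Among the Clones", listed under related papers below), which the variation-ratio bounds are reported to improve upon. This is not the variation-ratio reduction itself, and the constants are those recalled from the clones closed form, so treat them as indicative:

```python
import math

def shuffle_amplification_clones(eps0: float, n: int, delta: float) -> float:
    """Closed-form (eps, delta)-DP bound for shuffling n eps0-LDP reports
    (constants per the clones analysis of Feldman-McMillan-Talwar, 2021)."""
    # The closed form is only stated for eps0 <= log(n / (16 log(2/delta))).
    assert eps0 <= math.log(n / (16 * math.log(2 / delta))), "eps0 too large"
    a = (math.exp(eps0) - 1) / (math.exp(eps0) + 1)
    b = (8 * math.sqrt(math.exp(eps0) * math.log(4 / delta)) / math.sqrt(n)
         + 8 * math.exp(eps0) / n)
    return math.log(1 + a * b)

# Example: 10^6 users, local budget eps0 = 1, delta = 1e-8.
print(shuffle_amplification_clones(1.0, 10**6, 1e-8))  # roughly 0.027
```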
Related papers
- Enhanced Privacy Bound for Shuffle Model with Personalized Privacy [32.08637708405314]
The shuffle model of Differential Privacy (DP) is an enhanced privacy protocol which introduces an intermediate trusted server between local users and a central data curator.
It significantly amplifies the central DP guarantee by anonymizing and shuffling the local randomized data.
This work focuses on deriving the central privacy bound for a more practical setting where personalized local privacy is required by each user.
arXiv Detail & Related papers (2024-07-25T16:11:56Z) - A Generalized Shuffle Framework for Privacy Amplification: Strengthening Privacy Guarantees and Enhancing Utility [4.7712438974100255]
We show how shuffling amplifies privacy in the $(\epsilon_i,\delta_i)$-PLDP setting with personalized privacy parameters.
We prove that the shuffled $(\epsilon_i,\delta_i)$-PLDP process approximately preserves $\mu$-Gaussian Differential Privacy with $\mu = \sqrt{\frac{2}{\sum_{i=1}^{n} \frac{1-\delta_i}{1+e^{\epsilon_i}} - \max_i \frac{1-\delta_i}{1+e^{\epsilon_i}}}}$.
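Taking the formula above at face value, $\mu$ is straightforward to evaluate numerically; a minimal sketch (the alternating-budget example is illustrative):

```python
import math

def shuffle_gdp_mu(eps, deltas):
    """mu of the mu-GDP guarantee after shuffling (eps_i, delta_i)-PLDP
    reports, per the formula quoted above."""
    terms = [(1 - d) / (1 + math.exp(e)) for e, d in zip(eps, deltas)]
    return math.sqrt(2 / (sum(terms) - max(terms)))

# Example: 10,000 users with personalized budgets alternating 0.5 and 2.0.
eps = [0.5 if i % 2 == 0 else 2.0 for i in range(10_000)]
deltas = [1e-9] * 10_000
print(shuffle_gdp_mu(eps, deltas))  # small mu => strong central guarantee
```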
arXiv Detail & Related papers (2023-12-22T02:31:46Z) - Breaking the Communication-Privacy-Accuracy Tradeoff with
$f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy ($f$-DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
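For reference, the trade-off function satisfied by any $(\epsilon, 0)$-DP mechanism, and attained exactly by binary randomized response, has a simple closed form; a minimal sketch (this is the generic bound, not the paper's tighter mechanism-specific $f$-DP guarantees):

```python
import math

def tradeoff_eps_dp(alpha: float, eps: float) -> float:
    """f(alpha): the smallest type-II error achievable at type-I error
    alpha when distinguishing the outputs of an (eps, 0)-DP mechanism."""
    return max(0.0, 1 - math.exp(eps) * alpha, math.exp(-eps) * (1 - alpha))

for alpha in (0.01, 0.05, 0.25, 0.5):
    print(alpha, tradeoff_eps_dp(alpha, eps=1.0))
```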
arXiv Detail & Related papers (2023-02-19T16:58:53Z) - Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, its utility can suffer when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z) - Nonparametric extensions of randomized response for private confidence sets [51.75485869914048]
This work derives methods for performing nonparametric, nonasymptotic statistical inference for population means under the constraint of local differential privacy (LDP).
We present confidence intervals (CIs) and time-uniform confidence sequences (CSs) for the population mean $\mu^\star$ when only given access to the privatized data.
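As a simple instance of the setting, here is the classical Warner randomized-response estimator with a Hoeffding interval, not the paper's nonparametric constructions; names and parameters are illustrative:

```python
import math
import random

def rr_mean_ci(bits, eps, alpha=0.05):
    """Estimate the mean of private {0,1} data from eps-LDP randomized
    responses, with a (1 - alpha) Hoeffding confidence interval."""
    p = math.exp(eps) / (1 + math.exp(eps))  # prob. of reporting truthfully
    reports = [b if random.random() < p else 1 - b for b in bits]
    # Debias: E[report] = mu * (2p - 1) + (1 - p)  =>  solve for mu.
    est = (sum(reports) / len(reports) - (1 - p)) / (2 * p - 1)
    # Hoeffding deviation of the report mean, scaled through the debiasing.
    width = math.sqrt(math.log(2 / alpha) / (2 * len(reports))) / (2 * p - 1)
    return est - width, est + width

data = [1] * 300 + [0] * 700  # true mean is 0.3
print(rr_mean_ci(data, eps=1.0))
```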
arXiv Detail & Related papers (2022-02-17T16:04:49Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z) - Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy
Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data.
Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees.
arXiv Detail & Related papers (2020-12-23T17:07:26Z) - Learning discrete distributions: user vs item-level privacy [47.05234904407048]
Recently, many practical applications, such as federated learning, require preserving privacy for all items of a single user.
We study the fundamental problem of learning discrete distributions over $k$ symbols with user-level differential privacy.
We propose a mechanism such that the number of users scales as $\tilde{\mathcal{O}}\left(k/(m\alpha^2) + k/(\sqrt{m}\,\epsilon\alpha)\right)$ and hence the privacy penalty is $\tilde{\Theta}(\sqrt{m})$ times smaller compared to the standard mechanisms.
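Ignoring the constants and logarithmic factors hidden by $\tilde{\mathcal{O}}(\cdot)$, the stated scaling can be evaluated directly; a minimal illustrative sketch:

```python
def users_needed_scaling(k, m, alpha, eps):
    """Order-of-magnitude user count from the bound above; the constants
    and log factors hidden by the O-tilde are deliberately omitted."""
    return k / (m * alpha**2) + k / (m**0.5 * eps * alpha)

# Example: k = 1000 symbols, m = 100 samples/user, accuracy 0.1, eps = 1.
print(users_needed_scaling(1000, 100, 0.1, 1.0))  # ~2000 users (up to logs)
```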
arXiv Detail & Related papers (2020-07-27T16:15:14Z) - Privacy Amplification via Random Check-Ins [38.72327434015975]
Differentially Private Gradient Descent (DP-SGD) forms a fundamental building block in many applications for learning over sensitive data.
In this paper, we focus on conducting iterative methods like DP-SGD in the setting of federated learning (FL), wherein the data is distributed among many devices (clients).
Our main contribution is the \emph{random check-in} distributed protocol, which crucially relies only on randomized participation decisions made locally and independently by each client.
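A minimal sketch of the idea as summarized above, in which each client flips a local coin and, if participating, picks its round uniformly at random; the participation probability and round assignment here are illustrative assumptions, not the paper's exact protocol:

```python
import random

def random_check_in(num_clients, num_rounds, p):
    """Each client locally flips a coin with bias p and, if it checks in,
    picks one round uniformly at random; no coordination is needed."""
    schedule = {r: [] for r in range(num_rounds)}
    for client in range(num_clients):
        if random.random() < p:  # local, independent participation decision
            schedule[random.randrange(num_rounds)].append(client)
    return schedule

sched = random_check_in(num_clients=1000, num_rounds=10, p=0.5)
print([len(clients) for clients in sched.values()])
```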
arXiv Detail & Related papers (2020-07-13T18:14:09Z) - Connecting Robust Shuffle Privacy and Pan-Privacy [11.367579037903734]
In the \emph{shuffle model} of differential privacy, data-holding users send randomized messages to a secure shuffler, and the shuffler permutes the messages.
In the \emph{pan-private} model, an algorithm processes a stream of data while maintaining an internal state that is differentially private with regard to the stream data.
Our results focus on \emph{robustly shuffle private} protocols, whose privacy guarantees are not greatly affected by malicious users.
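A minimal sketch of the shuffle-model pipeline described above, using binary randomized response as the local randomizer (an illustrative choice; any local randomizer works):

```python
import math
import random

def local_randomizer(bit, eps0):
    """eps0-LDP binary randomized response."""
    keep = math.exp(eps0) / (1 + math.exp(eps0))
    return bit if random.random() < keep else 1 - bit

def shuffle_protocol(bits, eps0):
    """Users randomize locally; a trusted shuffler uniformly permutes the
    messages, severing the link between each user and their report."""
    messages = [local_randomizer(b, eps0) for b in bits]
    random.shuffle(messages)  # the shuffler's only operation
    return messages

print(sum(shuffle_protocol([0, 1] * 500, eps0=1.0)))  # noisy count of ones
```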
arXiv Detail & Related papers (2020-04-20T17:58:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.