Scalable DP-SGD: Shuffling vs. Poisson Subsampling
- URL: http://arxiv.org/abs/2411.04205v1
- Date: Wed, 06 Nov 2024 19:06:16 GMT
- Title: Scalable DP-SGD: Shuffling vs. Poisson Subsampling
- Authors: Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang
- Abstract summary: We provide new lower bounds on the privacy guarantee of the multi-epoch Adaptive Batch Linear Queries (ABLQ) mechanism with shuffled batch sampling.
We show substantial gaps when compared to Poisson subsampling; prior analysis was limited to a single epoch.
We introduce a practical approach to implement Poisson subsampling at scale using massively parallel computation.
- Score: 61.19794019914523
- License:
- Abstract: We provide new lower bounds on the privacy guarantee of the multi-epoch Adaptive Batch Linear Queries (ABLQ) mechanism with shuffled batch sampling, demonstrating substantial gaps when compared to Poisson subsampling; prior analysis was limited to a single epoch. Since the privacy analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) is obtained by analyzing the ABLQ mechanism, this brings into serious question the common practice of implementing shuffling-based DP-SGD, but reporting privacy parameters as if Poisson subsampling was used. To understand the impact of this gap on the utility of trained machine learning models, we introduce a practical approach to implement Poisson subsampling at scale using massively parallel computation, and efficiently train models with the same. We compare the utility of models trained with Poisson-subsampling-based DP-SGD, and the optimistic estimates of utility when using shuffling, via our new lower bounds on the privacy guarantee of ABLQ with shuffling.
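To make the comparison concrete, below is a minimal sketch of the two batch samplers (NumPy-based; the function names are illustrative and not taken from the paper):

```python
import numpy as np

def poisson_batches(n, num_steps, expected_batch_size, rng):
    # Poisson subsampling: at each step, every example is included
    # independently with probability q, so batch sizes vary.
    q = expected_batch_size / n
    return [np.flatnonzero(rng.random(n) < q) for _ in range(num_steps)]

def shuffled_batches(n, batch_size, rng):
    # Shuffling: permute the data once per epoch, then take contiguous
    # fixed-size batches; every example appears exactly once.
    perm = rng.permutation(n)
    return [perm[i:i + batch_size] for i in range(0, n, batch_size)]

rng = np.random.default_rng(0)
print([len(b) for b in poisson_batches(1000, 5, 100, rng)])      # sizes vary around 100
print([len(b) for b in shuffled_batches(1000, 100, rng)][:5])    # all exactly 100
```

Poisson subsampling makes each example's participation at each step independent, which is the assumption the standard DP-SGD privacy accounting relies on; shuffling produces fixed-size batches in which each example appears exactly once per epoch, and its privacy behavior is what the paper lower-bounds.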
Related papers
- Balls-and-Bins Sampling for DP-SGD [56.959167524617456]
We introduce the Balls-and-Bins sampling for differentially private (DP) optimization methods such as DP-SGD.
We show that Balls-and-Bins sampling achieves the "best of both" samplers: its implementation is similar to that of shuffling, while its privacy guarantees are comparable to those of Poisson subsampling.
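As a rough illustration (a sketch based on the abstract, not the paper's code), balls-and-bins sampling throws each example into one of the per-epoch batches uniformly at random:

```python
import numpy as np

def balls_and_bins_batches(n, num_steps, rng):
    # Each of the n examples (balls) is assigned to one of the num_steps
    # batches (bins) independently and uniformly at random, so every
    # example is used exactly once per epoch (like shuffling) while
    # batch sizes vary randomly (like Poisson subsampling).
    bins = rng.integers(0, num_steps, size=n)
    return [np.flatnonzero(bins == t) for t in range(num_steps)]

rng = np.random.default_rng(0)
print([len(b) for b in balls_and_bins_batches(1000, 10, rng)])  # sizes vary around 100
```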
arXiv Detail & Related papers (2024-12-21T23:09:14Z)
- To Shuffle or not to Shuffle: Auditing DP-SGD with Shuffling [25.669347036509134]
We analyze Differentially Private Stochastic Gradient Descent (DP-SGD) with shuffling.
We show that the privacy guarantees reported for state-of-the-art DP models trained with shuffling are appreciably overestimated (by up to 4x).
Our work empirically attests to the risk of using shuffling instead of Poisson subsampling vis-à-vis the actual privacy leakage of DP-SGD.
arXiv Detail & Related papers (2024-11-15T22:34:28Z)
- Towards Efficient and Scalable Training of Differentially Private Deep Learning
Differentially private stochastic gradient descent (DP-SGD) is the standard algorithm for training machine learning models under differential privacy (DP).
Implementing computationally efficient DP-SGD with Poisson subsampling is not trivial, which leads many implementations to ignore this requirement.
We conduct a comprehensive empirical study to quantify the computational cost of training deep learning models under DP.
We find that a naive implementation of DP-SGD with Opacus in PyTorch has between 2.6 and 8 times lower throughput of processed training examples per second than plain SGD.
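For reference, the kind of naive Opacus-based setup this comparison refers to looks roughly like the sketch below (a minimal example assuming Opacus's PrivacyEngine.make_private interface; the model, data, and hyperparameters are placeholders):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine  # pip install opacus

# Toy model and data; sizes are placeholders.
model = nn.Linear(28 * 28, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = TensorDataset(torch.randn(1024, 28 * 28), torch.randint(0, 10, (1024,)))
data_loader = DataLoader(data, batch_size=64)

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,   # noise scale sigma
    max_grad_norm=1.0,      # per-example clipping bound
    poisson_sampling=True,  # the sampling the DP analysis assumes
)
# Training then proceeds as usual; the per-example gradient computation
# and clipping inside Opacus account for the throughput gap vs. plain SGD.
```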
arXiv Detail & Related papers (2024-06-25T06:04:58Z)
- Bayesian Frequency Estimation Under Local Differential Privacy With an Adaptive Randomized Response Mechanism [0.4604003661048266]
We propose AdOBEst-LDP, a new algorithm for adaptive, online Bayesian estimation of categorical distributions under local differential privacy.
By adapting the subset selection process to the past privatized data via Bayesian estimation, the algorithm improves the utility of future privatized data.
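For background, here is a sketch of the standard k-ary randomized response that such adaptive LDP mechanisms build on (not the paper's AdOBEst-LDP algorithm itself):

```python
import numpy as np

def k_randomized_response(x, k, epsilon, rng):
    # Report the true category with probability e^eps / (e^eps + k - 1);
    # otherwise report one of the other k - 1 categories uniformly.
    # The probability ratio between any two outputs is at most e^eps,
    # so this satisfies epsilon-local differential privacy.
    p_true = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    if rng.random() < p_true:
        return x
    return (x + 1 + rng.integers(0, k - 1)) % k

rng = np.random.default_rng(0)
print(k_randomized_response(2, k=10, epsilon=1.0, rng=rng))
```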
arXiv Detail & Related papers (2024-05-11T13:59:52Z)
- How Private are DP-SGD Implementations? [61.19794019914523]
We show that there can be a substantial gap between the privacy analysis of DP-SGD under the two types of batch sampling: shuffling and Poisson subsampling.
arXiv Detail & Related papers (2024-03-26T13:02:43Z)
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
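For context, a minimal sketch of the standard clipped-and-noised DPSGD-GC step that the error-feedback approach aims to replace (NumPy-based; the names are illustrative):

```python
import numpy as np

def dpsgd_gc_step(per_example_grads, clip_norm, noise_multiplier, rng):
    # Clip each per-example gradient to L2 norm at most clip_norm;
    # this clipping is the source of the bias the paper targets.
    clipped = [
        g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
        for g in per_example_grads
    ]
    # Add Gaussian noise calibrated to the clipping bound, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=per_example_grads[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(per_example_grads)

rng = np.random.default_rng(0)
grads = [rng.normal(size=5) for _ in range(32)]
print(dpsgd_gc_step(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
```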
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
- On the Practicality of Differential Privacy in Federated Learning by Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for protecting privacy when machine learning models are trained collaboratively among distributed clients.
Recent studies have pointed out that naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
arXiv Detail & Related papers (2021-01-11T19:43:12Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides statistical protection against such attacks, at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.