Generalizing DP-SGD with Shuffling and Batch Clipping
- URL: http://arxiv.org/abs/2212.05796v3
- Date: Tue, 25 Jul 2023 16:57:52 GMT
- Title: Generalizing DP-SGD with Shuffling and Batch Clipping
- Authors: Marten van Dijk, Phuong Ha Nguyen, Toan N. Nguyen and Lam M. Nguyen
- Abstract summary: DP-SGD implements individual clipping with random subsampling, which forces a mini-batch SGD approach.
We provide a general differentially private algorithmic framework that goes beyond DP-SGD and allows any possible first-order optimizers in combination with batch clipping.
We show a $\sqrt{g E}$ DP dependency for batch clipping with shuffling.
- Score: 21.55827140532476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classical differential private DP-SGD implements individual clipping with
random subsampling, which forces a mini-batch SGD approach. We provide a
general differential private algorithmic framework that goes beyond DP-SGD and
allows any possible first order optimizers (e.g., classical SGD and momentum
based SGD approaches) in combination with batch clipping, which clips an
aggregate of computed gradients rather than summing clipped gradients (as is
done in individual clipping). The framework also admits sampling techniques
beyond random subsampling such as shuffling. Our DP analysis follows the $f$-DP
approach and introduces a new proof technique which allows us to derive simple
closed form expressions and to also analyse group privacy. In particular, for
$E$ epochs work and groups of size $g$, we show a $\sqrt{g E}$ DP dependency
for batch clipping with shuffling.
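The difference between the two clipping styles named in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the paper's algorithm: the function names, the choice of an average as the "aggregate", and the noise scale `sigma * c` (the usual DP-SGD convention) are assumptions made here for illustration.

```python
import math
import random


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def clip(v, c):
    """Scale v down so that its L2 norm is at most c."""
    norm = l2_norm(v)
    scale = min(1.0, c / norm) if norm > 0 else 1.0
    return [x * scale for x in v]


def vec_sum(vectors):
    """Coordinate-wise sum of a list of equal-length vectors."""
    return [sum(col) for col in zip(*vectors)]


def individual_clipping(grads, c):
    """DP-SGD style: clip every per-example gradient, then sum."""
    return vec_sum([clip(g, c) for g in grads])


def batch_clipping(grads, c):
    """Batch clipping: aggregate first (assumed here: average), clip once."""
    n = len(grads)
    avg = [x / n for x in vec_sum(grads)]
    return clip(avg, c)


def add_gaussian_noise(v, c, sigma, rng):
    """Add N(0, (sigma * c)^2) noise per coordinate (assumed noise scale)."""
    return [x + rng.gauss(0.0, sigma * c) for x in v]


def shuffled_batches(n, batch_size, rng):
    """One epoch of shuffling: every index is used exactly once."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i:i + batch_size] for i in range(0, n, batch_size)]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.0, 0.5], [-6.0, 8.0]]  # toy per-example gradients
    print("individual:", individual_clipping(grads, c=1.0))
    print("batch:     ", batch_clipping(grads, c=1.0))
    print("noisy:     ", add_gaussian_noise(batch_clipping(grads, 1.0), 1.0, 1.0, rng))
    print("batches:   ", shuffled_batches(10, 4, rng))
```

With individual clipping, the summed update can have norm up to the batch size times `c`, whereas the batch-clipped update always has norm at most `c`. With shuffling, each example appears in exactly one batch per epoch, unlike random subsampling.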
Related papers
- Smoothed Normalization for Efficient Distributed Private Optimization [54.197255548244705]
Federated learning enables training machine learning models while preserving the privacy of participants.
There is no differentially private distributed method for smooth, non-convex optimization problems.
We introduce a new distributed algorithm $\alpha$-$\sf NormEC$ with provable convergence guarantees.
arXiv Detail & Related papers (2025-02-19T07:10:32Z)
- Scalable DP-SGD: Shuffling vs. Poisson Subsampling [61.19794019914523]
We provide new lower bounds on the privacy guarantee of the multi-epoch Adaptive Batch Linear Queries (ABLQ) mechanism with shuffled batch sampling.
We show substantial gaps when compared to Poisson subsampling; prior analysis was limited to a single epoch.
We introduce a practical approach to implement Poisson subsampling at scale using massively parallel computation.
arXiv Detail & Related papers (2024-11-06T19:06:16Z)
- How Private are DP-SGD Implementations? [61.19794019914523]
We show that there can be a substantial gap between the privacy analyses obtained under the two types of batch sampling, shuffling and Poisson subsampling.
arXiv Detail & Related papers (2024-03-26T13:02:43Z)
- Private Fine-tuning of Large Language Models with Zeroth-order Optimization [51.19403058739522]
Differentially private stochastic gradient descent (DP-SGD) allows models to be trained in a privacy-preserving manner.
We introduce DP-ZO, a private fine-tuning framework for large language models by privatizing zeroth order optimization methods.
arXiv Detail & Related papers (2024-01-09T03:53:59Z)
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
- Batch Clipping and Adaptive Layerwise Clipping for Differential Private Stochastic Gradient Descent [21.55827140532476]
Differentially Private Stochastic Gradient Descent (DPSGD) transmits a sum of clipped gradients obfuscated with Gaussian noise to a central server.
We introduce Batch Clipping (BC), where, instead of clipping single gradients, we average and clip batches of gradients.
We also introduce Adaptive Layerwise Clipping (ALC) methods, where each layer has its own adaptively fine-tuned clipping constant.
arXiv Detail & Related papers (2023-07-21T23:37:37Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
- Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger [39.93710312222771]
Per-example clipping is a key algorithmic step that enables practical differentially private (DP) training for deep learning models.
We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune the clipping threshold R for any DP optimizer.
arXiv Detail & Related papers (2022-06-14T19:49:44Z)
- Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate empirical advantages over standard gradient methods on two popular deep learning tasks.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.