Auto DP-SGD: Dual Improvements of Privacy and Accuracy via Automatic
Clipping Threshold and Noise Multiplier Estimation
- URL: http://arxiv.org/abs/2312.02400v1
- Date: Tue, 5 Dec 2023 00:09:57 GMT
- Title: Auto DP-SGD: Dual Improvements of Privacy and Accuracy via Automatic
Clipping Threshold and Noise Multiplier Estimation
- Authors: Sai Venkatesh Chilukoti, Md Imran Hossen, Liqun Shan, Vijay Srinivas
Tida, and Xiali Hei
- Abstract summary: DP-SGD has emerged as a popular method to protect personally identifiable information in deep learning applications.
We propose an Auto DP-SGD that scales the gradients of each training sample without losing gradient information.
We demonstrate that Auto DP-SGD outperforms existing SOTA DP-SGD methods in privacy and accuracy on various benchmark datasets.
- Score: 1.7942265700058988
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: DP-SGD has emerged as a popular method to protect personally identifiable
information in deep learning applications. Unfortunately, DP-SGD's per-sample
gradient clipping and uniform noise addition during training can significantly
degrade model utility. To enhance the model's utility, researchers proposed
various adaptive DP-SGD methods. However, we find that these techniques either
incur greater privacy leakage or lower accuracy than traditional DP-SGD, or
lack evaluation on a complex dataset such as CIFAR100. To address these
limitations, we propose Auto DP-SGD. Our
method automates clipping threshold estimation based on the DL model's gradient
norm and scales the gradients of each training sample without losing gradient
information. This improves the algorithm's utility while consuming a smaller
privacy budget. To further improve accuracy, we introduce automatic noise
multiplier decay mechanisms to decrease the noise multiplier after every epoch.
Finally, we develop closed-form mathematical expressions using tCDP accountant
for automatic noise multiplier and automatic clipping threshold estimation.
Through extensive experimentation, we demonstrate that Auto DP-SGD outperforms
existing SOTA DP-SGD methods in privacy and accuracy on various benchmark
datasets. We also show that privacy can be improved by lowering the scale
factor and using learning rate schedulers without significantly reducing
accuracy. Specifically, Auto DP-SGD, when used with a step noise multiplier,
improves accuracy by 3.20, 1.57, 6.73, and 1.42 for the MNIST, CIFAR10,
CIFAR100, and AG News Corpus datasets, respectively. Furthermore, it obtains a
substantial reduction in the privacy budget of 94.9, 79.16, 67.36, and 53.37
for the corresponding datasets.
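
To make the workflow concrete, the following PyTorch sketch illustrates the three mechanisms described above: estimating the clipping threshold from the observed per-sample gradient norms, scaling (rather than clipping) each per-sample gradient so its direction is preserved, and decaying the noise multiplier after every epoch with a step schedule. The threshold formula (scale_factor times the mean per-sample norm), the decay hyperparameters, and the helper names auto_dp_sgd_step and step_noise_decay are illustrative assumptions rather than the paper's exact formulation, and the closed-form tCDP expressions are not reproduced here.

    # Minimal sketch of the Auto DP-SGD ideas; hyperparameters and the
    # threshold formula are assumptions, not the paper's exact expressions.
    import torch
    from torch import nn

    def step_noise_decay(sigma0, epoch, decay_rate=0.7, period=10):
        # Step noise-multiplier decay: shrink sigma every `period` epochs
        # (decay_rate and period are illustrative hyperparameters).
        return sigma0 * (decay_rate ** (epoch // period))

    def auto_dp_sgd_step(model, loss_fn, xs, ys, lr, noise_multiplier,
                         scale_factor=1.0):
        # Compute flattened per-sample gradients one example at a time.
        per_sample = []
        for x, y in zip(xs, ys):
            model.zero_grad()
            loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
            per_sample.append(torch.cat([p.grad.flatten()
                                         for p in model.parameters()]))
        grads = torch.stack(per_sample)            # (batch, n_params)
        norms = grads.norm(dim=1, keepdim=True)    # per-sample L2 norms

        # Automatic clipping threshold estimated from the observed gradient
        # norms, modulated by a scale factor (an assumed form of the estimate).
        clip_threshold = scale_factor * norms.mean()

        # Scale each per-sample gradient to the threshold instead of clipping,
        # so the gradient direction is preserved for every sample.
        scaled = grads * (clip_threshold / (norms + 1e-12))

        # Gaussian noise calibrated to the automatically estimated sensitivity.
        noisy = scaled.sum(dim=0) + \
            noise_multiplier * clip_threshold * torch.randn(grads.shape[1])
        update = noisy / grads.shape[0]

        # Apply the averaged noisy gradient to the model parameters.
        with torch.no_grad():
            offset = 0
            for p in model.parameters():
                n = p.numel()
                p -= lr * update[offset:offset + n].view_as(p)
                offset += n

    if __name__ == "__main__":
        # Toy usage: one private step per epoch with a decaying noise multiplier.
        model = nn.Linear(4, 2)
        xs, ys = torch.randn(8, 4), torch.randint(0, 2, (8,))
        for epoch in range(3):
            sigma = step_noise_decay(sigma0=2.0, epoch=epoch)
            auto_dp_sgd_step(model, nn.CrossEntropyLoss(), xs, ys,
                             lr=0.1, noise_multiplier=sigma)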
Related papers
- Rethinking Improved Privacy-Utility Trade-off with Pre-existing Knowledge for DP Training [31.559864332056648]
We propose a generic differential privacy framework with heterogeneous noise (DP-Hero).
Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, where the noise injected into gradient updates is heterogeneous and guided by prior-established model parameters.
We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works.
arXiv Detail & Related papers (2024-09-05T08:40:54Z) - Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - Sparsity-Preserving Differentially Private Training of Large Embedding
Models [67.29926605156788]
DP-SGD is a training algorithm that combines differential privacy with stochastic gradient descent.
Applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency.
We present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models.
arXiv Detail & Related papers (2023-11-14T17:59:51Z) - Bias-Aware Minimisation: Understanding and Mitigating Estimator Bias in
Private SGD [56.01810892677744]
We show a connection between per-sample gradient norms and the estimation bias of the private gradient oracle used in DP-SGD.
We propose Bias-Aware Minimisation (BAM) that allows for the provable reduction of private gradient estimator bias.
arXiv Detail & Related papers (2023-08-23T09:20:41Z) - DPIS: An Enhanced Mechanism for Differentially Private SGD with Importance Sampling [23.8561225168394]
Differential privacy (DP) has become a well-accepted standard for privacy protection, and deep neural networks (DNN) have been immensely successful in machine learning.
A classic mechanism for this purpose is DP-SGD, a differentially private version of stochastic gradient descent (SGD) commonly used for training.
We propose DPIS, a novel mechanism for differentially private SGD training that can be used as a drop-in replacement of the core of DP-SGD.
arXiv Detail & Related papers (2022-10-18T07:03:14Z) - TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 points gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Automatic Clipping: Differentially Private Deep Learning Made Easier and
Stronger [39.93710312222771]
Per-example clipping is a key algorithmic step that enables practical differentially private (DP) training for deep learning models.
We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune the clipping threshold R for any DP optimizer.
arXiv Detail & Related papers (2022-06-14T19:49:44Z) - Dynamic Differential-Privacy Preserving SGD [19.273542515320372]
Differentially Private Stochastic Gradient Descent (DP-SGD) prevents training-data privacy breaches by adding noise to the clipped gradient during SGD training.
Using the same clipping operation and additive noise across training steps results in unstable updates and even a ramp-up period.
We propose dynamic DP-SGD, which incurs a lower privacy cost than DP-SGD at each update until both reach the same target privacy budget.
arXiv Detail & Related papers (2021-10-30T04:45:11Z) - Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for
Private Learning [74.73901662374921]
A differentially private model degrades the utility drastically when the model comprises a large number of trainable parameters.
We propose an algorithm, Gradient Embedding Perturbation (GEP), towards training differentially private deep models with decent accuracy.
arXiv Detail & Related papers (2021-02-25T04:29:58Z) - Improving Deep Learning with Differential Privacy using Gradient
Encoding and Denoising [36.935465903971014]
In this paper, we aim at training deep learning models with differential privacy guarantees.
Our key technique is to encode gradients to map them to a smaller vector space.
We show that our mechanism outperforms the state-of-the-art DPSGD.
arXiv Detail & Related papers (2020-07-22T16:33:14Z)