Auto DP-SGD: Dual Improvements of Privacy and Accuracy via Automatic
Clipping Threshold and Noise Multiplier Estimation
- URL: http://arxiv.org/abs/2312.02400v1
- Date: Tue, 5 Dec 2023 00:09:57 GMT
- Title: Auto DP-SGD: Dual Improvements of Privacy and Accuracy via Automatic
Clipping Threshold and Noise Multiplier Estimation
- Authors: Sai Venkatesh Chilukoti, Md Imran Hossen, Liqun Shan, Vijay Srinivas
Tida, and Xiali Hei
- Abstract summary: DP-SGD has emerged as a popular method to protect personally identifiable information in deep learning applications.
We propose an Auto DP-SGD that scales the gradients of each training sample without losing gradient information.
We demonstrate that Auto DP-SGD outperforms existing SOTA DP-SGD methods in privacy and accuracy on various benchmark datasets.
- Score: 1.7942265700058988
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: DP-SGD has emerged as a popular method to protect personally identifiable
information in deep learning applications. Unfortunately, DP-SGD's per-sample
gradient clipping and uniform noise addition during training can significantly
degrade model utility. To enhance the model's utility, researchers proposed
various adaptive DP-SGD methods. However, we find that these techniques either
incur greater privacy leakage or lower accuracy than traditional DP-SGD, or
lack evaluation on a complex dataset such as CIFAR100. To address these
limitations, we propose Auto DP-SGD. Our
method automates clipping threshold estimation based on the DL model's gradient
norm and scales the gradients of each training sample without losing gradient
information. This improves the algorithm's utility while consuming a smaller
privacy budget. To further improve accuracy, we introduce automatic noise
multiplier decay mechanisms to decrease the noise multiplier after every epoch.
Finally, we develop closed-form mathematical expressions using tCDP accountant
for automatic noise multiplier and automatic clipping threshold estimation.
Through extensive experimentation, we demonstrate that Auto DP-SGD outperforms
existing SOTA DP-SGD methods in privacy and accuracy on various benchmark
datasets. We also show that privacy can be improved by lowering the scale
factor and using learning rate schedulers without significantly reducing
accuracy. Specifically, Auto DP-SGD, when used with a step noise multiplier,
improves accuracy by 3.20, 1.57, 6.73, and 1.42 for the MNIST, CIFAR10,
CIFAR100, and AG News Corpus datasets, respectively. Furthermore, it obtains a
substantial reduction in the privacy budget of 94.9, 79.16, 67.36, and 53.37
for the corresponding datasets.
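
To make the workflow concrete, the following PyTorch sketch illustrates the three mechanisms described above: estimating the clipping threshold from the observed per-sample gradient norms, scaling (rather than clipping) each per-sample gradient so its direction is preserved, and decaying the noise multiplier after every epoch with a step schedule. The threshold formula (scale_factor times the mean per-sample norm), the decay hyperparameters, and the helper names auto_dp_sgd_step and step_noise_decay are illustrative assumptions rather than the paper's exact formulation, and the closed-form tCDP expressions are not reproduced here.

    # Minimal sketch of the Auto DP-SGD ideas; hyperparameters and the
    # threshold formula are assumptions, not the paper's exact expressions.
    import torch
    from torch import nn

    def step_noise_decay(sigma0, epoch, decay_rate=0.7, period=10):
        # Step noise-multiplier decay: shrink sigma every `period` epochs
        # (decay_rate and period are illustrative hyperparameters).
        return sigma0 * (decay_rate ** (epoch // period))

    def auto_dp_sgd_step(model, loss_fn, xs, ys, lr, noise_multiplier,
                         scale_factor=1.0):
        # Compute flattened per-sample gradients one example at a time.
        per_sample = []
        for x, y in zip(xs, ys):
            model.zero_grad()
            loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
            per_sample.append(torch.cat([p.grad.flatten()
                                         for p in model.parameters()]))
        grads = torch.stack(per_sample)            # (batch, n_params)
        norms = grads.norm(dim=1, keepdim=True)    # per-sample L2 norms

        # Automatic clipping threshold estimated from the observed gradient
        # norms, modulated by a scale factor (an assumed form of the estimate).
        clip_threshold = scale_factor * norms.mean()

        # Scale each per-sample gradient to the threshold instead of clipping,
        # so the gradient direction is preserved for every sample.
        scaled = grads * (clip_threshold / (norms + 1e-12))

        # Gaussian noise calibrated to the automatically estimated sensitivity.
        noisy = scaled.sum(dim=0) + \
            noise_multiplier * clip_threshold * torch.randn(grads.shape[1])
        update = noisy / grads.shape[0]

        # Apply the averaged noisy gradient to the model parameters.
        with torch.no_grad():
            offset = 0
            for p in model.parameters():
                n = p.numel()
                p -= lr * update[offset:offset + n].view_as(p)
                offset += n

    if __name__ == "__main__":
        # Toy usage: one private step per epoch with a decaying noise multiplier.
        model = nn.Linear(4, 2)
        xs, ys = torch.randn(8, 4), torch.randint(0, 2, (8,))
        for epoch in range(3):
            sigma = step_noise_decay(sigma0=2.0, epoch=epoch)
            auto_dp_sgd_step(model, nn.CrossEntropyLoss(), xs, ys,
                             lr=0.1, noise_multiplier=sigma)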
Related papers
- Rethinking Improved Privacy-Utility Trade-off with Pre-existing Knowledge for DP Training [31.559864332056648]
We propose a generic differential privacy framework with heterogeneous noise (DP-Hero).
Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, where the noise injected into gradient updates is heterogeneous and guided by prior-established model parameters.
We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works.
arXiv Detail & Related papers (2024-09-05T08:40:54Z) - Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - Sparsity-Preserving Differentially Private Training of Large Embedding
Models [67.29926605156788]
DP-SGD is a training algorithm that combines differential privacy with stochastic gradient descent.
Applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency.
We present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models.
arXiv Detail & Related papers (2023-11-14T17:59:51Z) - Bias-Aware Minimisation: Understanding and Mitigating Estimator Bias in
Private SGD [56.01810892677744]
We show a connection between per-sample gradient norms and the estimation bias of the private gradient oracle used in DP-SGD.
We propose Bias-Aware Minimisation (BAM) that allows for the provable reduction of private gradient estimator bias.
arXiv Detail & Related papers (2023-08-23T09:20:41Z) - DPIS: An Enhanced Mechanism for Differentially Private SGD with Importance Sampling [23.8561225168394]
Differential privacy (DP) has become a well-accepted standard for privacy protection, and deep neural networks (DNN) have been immensely successful in machine learning.
A classic mechanism for this purpose is DP-SGD, a differentially private version of stochastic gradient descent (SGD) commonly used for training.
We propose DPIS, a novel mechanism for differentially private SGD training that can be used as a drop-in replacement of the core of DP-SGD.
arXiv Detail & Related papers (2022-10-18T07:03:14Z) - TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 points gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Automatic Clipping: Differentially Private Deep Learning Made Easier and
Stronger [39.93710312222771]
Per-example clipping is a key algorithmic step that enables practical differentially private (DP) training for deep learning models.
We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune the clipping threshold R for any DP optimizer.
arXiv Detail & Related papers (2022-06-14T19:49:44Z) - Dynamic Differential-Privacy Preserving SGD [19.273542515320372]
Differentially Private Stochastic Gradient Descent (DP-SGD) prevents training-data privacy breaches by adding noise to the clipped gradient during SGD training.
Using the same clipping operation and additive noise across training steps results in unstable updates and even a ramp-up period.
We propose dynamic DP-SGD, which incurs a lower privacy cost than DP-SGD at each update until both reach the same target privacy budget.
arXiv Detail & Related papers (2021-10-30T04:45:11Z) - Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for
Private Learning [74.73901662374921]
A differentially private model degrades the utility drastically when the model comprises a large number of trainable parameters.
We propose an algorithm, Gradient Embedding Perturbation (GEP), towards training differentially private deep models with decent accuracy.
arXiv Detail & Related papers (2021-02-25T04:29:58Z) - Improving Deep Learning with Differential Privacy using Gradient
Encoding and Denoising [36.935465903971014]
In this paper, we aim at training deep learning models with differential privacy guarantees.
Our key technique is to encode gradients to map them to a smaller vector space.
We show that our mechanism outperforms the state-of-the-art DPSGD.
arXiv Detail & Related papers (2020-07-22T16:33:14Z)