DP-SGD-Global-Adapt-V2-S: Triad Improvements of Privacy, Accuracy and Fairness via Step Decay Noise Multiplier and Step Decay Upper Clipping Threshold
- URL: http://arxiv.org/abs/2312.02400v2
- Date: Wed, 05 Feb 2025 18:23:11 GMT
- Title: DP-SGD-Global-Adapt-V2-S: Triad Improvements of Privacy, Accuracy and Fairness via Step Decay Noise Multiplier and Step Decay Upper Clipping Threshold
- Authors: Sai Venkatesh Chilukoti, Md Imran Hossen, Liqun Shan, Vijay Srinivas Tida, Mahathir Mohammad Bappy, Wenmeng Tian, Xiai Hei,
- Abstract summary: DPSGD's per-sample gradient clipping and uniform noise addition during training can significantly degrade model utility and fairness.
We develop the DP-SGD-Global-Adapt-V2-S, which has a step-decay noise multiplier and an upper clipping threshold that is also decayed step-wise.
It improves accuracy by 0.9795%, 0.6786%, and 4.0130% in MNIST, CIFAR10, and CIFAR100, respectively.
- Score: 1.8741812670237725
- License:
- Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) has become a widely used technique for safeguarding sensitive information in deep learning applications. Unfortunately, DPSGD's per-sample gradient clipping and uniform noise addition during training can significantly degrade model utility and fairness. We observe that the latest DP-SGD-Global-Adapt's average gradient norm is the same throughout the training. Even when it is integrated with the existing linear decay noise multiplier, it has little or no advantage. Moreover, we notice that its upper clipping threshold increases exponentially towards the end of training, potentially impacting the models convergence. Other algorithms, DP-PSAC, Auto-S, DP-SGD-Global, and DP-F, have utility and fairness that are similar to or worse than DP-SGD, as demonstrated in experiments. To overcome these problems and improve utility and fairness, we developed the DP-SGD-Global-Adapt-V2-S. It has a step-decay noise multiplier and an upper clipping threshold that is also decayed step-wise. DP-SGD-Global-Adapt-V2-S with a privacy budget ($\epsilon$) of 1 improves accuracy by 0.9795\%, 0.6786\%, and 4.0130\% in MNIST, CIFAR10, and CIFAR100, respectively. It also reduces the privacy cost gap ($\pi$) by 89.8332% and 60.5541% in unbalanced MNIST and Thinwall datasets, respectively. Finally, we develop mathematical expressions to compute the privacy budget using truncated concentrated differential privacy (tCDP) for DP-SGD-Global-Adapt-V2-T and DP-SGD-Global-Adapt-V2-S.
Related papers
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on R'enyi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - Sparsity-Preserving Differentially Private Training of Large Embedding
Models [67.29926605156788]
DP-SGD is a training algorithm that combines differential privacy with gradient descent.
Applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency.
We present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models.
arXiv Detail & Related papers (2023-11-14T17:59:51Z) - Bias-Aware Minimisation: Understanding and Mitigating Estimator Bias in
Private SGD [56.01810892677744]
We show a connection between per-sample gradient norms and the estimation bias of the private gradient oracle used in DP-SGD.
We propose Bias-Aware Minimisation (BAM) that allows for the provable reduction of private gradient estimator bias.
arXiv Detail & Related papers (2023-08-23T09:20:41Z) - Towards the Flatter Landscape and Better Generalization in Federated
Learning under Client-level Differential Privacy [67.33715954653098]
We propose a novel DPFL algorithm named DP-FedSAM, which leverages gradient perturbation to mitigate the negative impact of DP.
Specifically, DP-FedSAM integrates Sharpness Aware of Minimization (SAM) to generate local flatness models with stability and weight robustness.
To further reduce the magnitude random noise while achieving better performance, we propose DP-FedSAM-$top_k$ by adopting the local update sparsification technique.
arXiv Detail & Related papers (2023-05-01T15:19:09Z) - DPIS: An Enhanced Mechanism for Differentially Private SGD with Importance Sampling [23.8561225168394]
differential privacy (DP) has become a well-accepted standard for privacy protection, and deep neural networks (DNN) have been immensely successful in machine learning.
A classic mechanism for this purpose is DP-SGD, which is a differentially private version of the gradient descent (SGD) commonly used for training.
We propose DPIS, a novel mechanism for differentially private SGD training that can be used as a drop-in replacement of the core of DP-SGD.
arXiv Detail & Related papers (2022-10-18T07:03:14Z) - Recycling Scraps: Improving Private Learning by Leveraging Intermediate Checkpoints [20.533039211835902]
We design a general framework that uses aggregates of intermediate checkpoints emphduring training to increase the accuracy of DP ML techniques.
We demonstrate that training over aggregates can provide significant gains in prediction accuracy over the existing state-of-the-art for StackOverflow, CIFAR10 and CIFAR100 datasets.
Our methods achieve relative improvements of 0.54% and 62.6% in terms of utility and variance, on a proprietary, production-grade pCVR task.
arXiv Detail & Related papers (2022-10-04T19:21:00Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Automatic Clipping: Differentially Private Deep Learning Made Easier and
Stronger [39.93710312222771]
Per-example clipping is a key algorithmic step that enables practical differential private (DP) training for deep learning models.
We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune R for any DPs.
arXiv Detail & Related papers (2022-06-14T19:49:44Z) - Large Scale Transfer Learning for Differentially Private Image
Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z) - DP-FP: Differentially Private Forward Propagation for Large Models [2.062295244789704]
We show how to mitigate the performance drop by replacing the Differential Private Gradient Descent with a novel DP Forward-Propagation (DP-FP)
Our DP-FP achieves an average accuracy of 91.34% with privacy budgets less than 3, representing a 3.81% performance improvement over the state-of-the-art DP-SGD.
arXiv Detail & Related papers (2021-12-29T07:32:29Z) - Dynamic Differential-Privacy Preserving SGD [19.273542515320372]
Differentially-Private Gradient Descent (DP-SGD) prevents training-data privacy breaches by adding noise to the clipped gradient during SGD training.
The same clipping operation and additive noise across training steps results in unstable updates and even a ramp-up period.
We propose the dynamic DP-SGD, which has a lower privacy cost than the DP-SGD during updates until they achieve the same target privacy budget.
arXiv Detail & Related papers (2021-10-30T04:45:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.