AdaDPIGU: Differentially Private SGD with Adaptive Clipping and Importance-Based Gradient Updates for Deep Neural Networks
- URL: http://arxiv.org/abs/2507.06525v1
- Date: Wed, 09 Jul 2025 03:53:03 GMT
- Title: AdaDPIGU: Differentially Private SGD with Adaptive Clipping and Importance-Based Gradient Updates for Deep Neural Networks
- Authors: Huiqi Zhang, Fang Xie
- Abstract summary: We propose a new differentially private SGD framework with importance-based gradient updates tailored for deep neural networks. AdaDPIGU satisfies $(\varepsilon, \delta)$-differential privacy and retains convergence guarantees. On MNIST, our method achieves a test accuracy of 99.12% under a privacy budget of $\epsilon = 8$, nearly matching the non-private model.
- Score: 1.7265013728931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differential privacy has been proven effective for stochastic gradient descent; however, existing methods often suffer from performance degradation in high-dimensional settings, as the scale of injected noise increases with dimensionality. To tackle this challenge, we propose AdaDPIGU, a new differentially private SGD framework with importance-based gradient updates tailored for deep neural networks. In the pretraining stage, we apply a differentially private Gaussian mechanism to estimate the importance of each parameter while preserving privacy. During the gradient update phase, we prune low-importance coordinates and introduce a coordinate-wise adaptive clipping mechanism, enabling sparse and noise-efficient gradient updates. Theoretically, we prove that AdaDPIGU satisfies $(\varepsilon, \delta)$-differential privacy and retains convergence guarantees. Extensive experiments on standard benchmarks validate the effectiveness of AdaDPIGU. All results are reported under a fixed retention ratio of 60%. On MNIST, our method achieves a test accuracy of 99.12% under a privacy budget of $\epsilon = 8$, nearly matching the non-private model. Remarkably, on CIFAR-10, it attains 73.21% accuracy at $\epsilon = 4$, outperforming the non-private baseline of 71.12%, demonstrating that adaptive sparsification can enhance both privacy and utility.
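The two phases described in the abstract (noisy importance estimation, then pruned and coordinate-wise clipped updates) can be illustrated with a small NumPy sketch. This is not the authors' algorithm: the importance score (mean absolute gradient), the function names, the fixed per-coordinate thresholds, and the noise scales are all assumptions made for illustration, and the sketch carries no calibrated privacy guarantee.

```python
import numpy as np

def private_importance_mask(grad_history, retention=0.6, noise_std=1.0, rng=None):
    """Keep the top `retention` fraction of coordinates by a noisy importance score.

    Assumption: importance is the mean absolute gradient per coordinate,
    perturbed with Gaussian noise. The noise scale is NOT calibrated here.
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.mean(np.abs(grad_history), axis=0)
    noisy_scores = scores + rng.normal(0.0, noise_std, size=scores.shape)
    k = max(1, int(round(retention * scores.size)))
    mask = np.zeros(scores.size, dtype=bool)
    mask[np.argsort(noisy_scores)[-k:]] = True  # indices of the k largest scores
    return mask

def sparse_noisy_update(per_sample_grads, mask, clip, noise_multiplier, rng=None):
    """Average gradient over retained coordinates with coordinate-wise clipping.

    `clip` is a hypothetical vector of per-coordinate thresholds; the paper
    adapts its thresholds, whereas this sketch takes them as given.
    """
    rng = np.random.default_rng() if rng is None else rng
    g = per_sample_grads * mask          # zero out pruned coordinates
    g = np.clip(g, -clip, clip)          # clip each coordinate to [-clip_j, clip_j]
    # L2 sensitivity of the sum when one sample is added or removed
    sensitivity = np.linalg.norm(clip[mask])
    noise = rng.normal(0.0, noise_multiplier * sensitivity, size=mask.shape) * mask
    return (g.sum(axis=0) + noise) / per_sample_grads.shape[0]
```

Because noise is added only on retained coordinates, a 60% retention ratio means 40% of the coordinates receive neither signal nor noise in this sketch, which is the intuition behind the "sparse and noise-efficient" updates the abstract mentions.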
Related papers
- DC-SGD: Differentially Private SGD with Dynamic Clipping through Gradient Norm Distribution Estimation [11.216548916537699]
We propose Dynamic Clipping DP-SGD (DC-SGD), a framework that dynamically adjusts the clipping threshold C. DC-SGD-P adjusts the clipping threshold based on a percentile of gradient norms, while DC-SGD-E minimizes the expected squared error of gradients to optimize C. Our results highlight the robust performance and efficiency of DC-SGD, offering a practical solution for differentially private deep learning.
arXiv Detail & Related papers (2025-03-29T06:27:22Z)
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
- Sparsity-Preserving Differentially Private Training of Large Embedding Models [67.29926605156788]
DP-SGD is a training algorithm that combines differential privacy with gradient descent.
Applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency.
We present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models.
arXiv Detail & Related papers (2023-11-14T17:59:51Z)
- Bias-Aware Minimisation: Understanding and Mitigating Estimator Bias in Private SGD [56.01810892677744]
We show a connection between per-sample gradient norms and the estimation bias of the private gradient oracle used in DP-SGD.
We propose Bias-Aware Minimisation (BAM) that allows for the provable reduction of private gradient estimator bias.
arXiv Detail & Related papers (2023-08-23T09:20:41Z)
- Differentially Private Learning with Per-Sample Adaptive Clipping [8.401653565794353]
We propose a Differentially Private Per-Sample Adaptive Clipping (DP-PSAC) algorithm based on a non-monotonic adaptive weight function.
We show that DP-PSAC outperforms or matches the state-of-the-art methods on multiple main-stream vision and language tasks.
arXiv Detail & Related papers (2022-12-01T07:26:49Z)
- SA-DPSGD: Differentially Private Stochastic Gradient Descent based on Simulated Annealing [25.25065807901922]
Differentially private gradient descent is the most popular training method with differential privacy in image recognition.
Existing DPSGD schemes lead to significant performance degradation, which prevents the application of differential privacy.
We propose a simulated annealing-based differentially private gradient descent scheme (SA-DPSGD) which accepts a candidate update with a probability that depends on the update quality and on the number of iterations.
arXiv Detail & Related papers (2022-11-14T09:20:48Z)
- Dynamic Differential-Privacy Preserving SGD [19.273542515320372]
Differentially-Private Gradient Descent (DP-SGD) prevents training-data privacy breaches by adding noise to the clipped gradient during SGD training.
The same clipping operation and additive noise across training steps results in unstable updates and even a ramp-up period.
We propose the dynamic DP-SGD, which has a lower privacy cost than the DP-SGD during updates until they achieve the same target privacy budget.
arXiv Detail & Related papers (2021-10-30T04:45:11Z)
- Differentially private training of neural networks with Langevin dynamics for calibrated predictive uncertainty [58.730520380312676]
We show that differentially private gradient descent (DP-SGD) can yield poorly calibrated, overconfident deep learning models.
This represents a serious issue for safety-critical applications, e.g. in medical diagnosis.
arXiv Detail & Related papers (2021-07-09T08:14:45Z)
- Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning [74.73901662374921]
A differentially private model degrades the utility drastically when the model comprises a large number of trainable parameters.
We propose an algorithm, Gradient Embedding Perturbation (GEP), towards training differentially private deep models with decent accuracy.
arXiv Detail & Related papers (2021-02-25T04:29:58Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
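Several of the entries above build on the basic DP-SGD step, which clips each per-sample gradient and adds noise to the clipped gradient. A minimal NumPy sketch of that baseline step follows, assuming a fixed clipping norm and Gaussian noise; the function name and arguments are illustrative, and none of the adaptive variants listed above are implemented here.

```python
import numpy as np

def dp_sgd_step(params, per_sample_grads, lr, clip_norm, noise_multiplier, rng=None):
    """One baseline DP-SGD update with a fixed L2 clipping norm (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    # Scale each per-sample gradient so its L2 norm is at most clip_norm
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / per_sample_grads.shape[0]
    return params - lr * noisy_mean
```

The adaptive and dynamic clipping methods in the list can be read as different ways of choosing `clip_norm` (or a per-sample or per-coordinate analogue) instead of fixing it in advance.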