On the Convergence and Calibration of Deep Learning with Differential
Privacy
- URL: http://arxiv.org/abs/2106.07830v6
- Date: Mon, 19 Jun 2023 15:13:37 GMT
- Title: On the Convergence and Calibration of Deep Learning with Differential
Privacy
- Authors: Zhiqi Bu, Hua Wang, Zongyu Dai, Qi Long
- Abstract summary: Differentially private (DP) training preserves data privacy, usually at the cost of slower convergence.
We show that noise addition only affects the privacy risk but not the convergence or calibration.
In sharp contrast, DP models trained with a large clipping norm enjoy the same privacy guarantee and similar accuracy, but are significantly more calibrated.
- Score: 12.297499996547925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentially private (DP) training preserves data privacy, usually at
the cost of slower convergence (and thus lower accuracy), as well as more
severe mis-calibration than its non-private counterpart. To analyze the
convergence of DP training, we formulate a continuous time analysis through the
lens of neural tangent kernel (NTK), which characterizes the per-sample
gradient clipping and the noise addition in DP training, for arbitrary network
architectures and loss functions. Interestingly, we show that the noise
addition only affects the privacy risk but not the convergence or calibration,
whereas the per-sample gradient clipping (under both flat and layerwise
clipping styles) only affects the convergence and calibration.
Furthermore, we observe that DP models trained with a small clipping norm
usually achieve the best accuracy, but are poorly calibrated and thus
unreliable. In sharp contrast, DP models trained with a large clipping norm enjoy
the same privacy guarantee and similar accuracy, but are significantly more
\textit{calibrated}. Our code can be found at
\url{https://github.com/woodyx218/opacus_global_clipping}.
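To make the mechanics discussed in the abstract concrete, the sketch below shows one DP-SGD step with per-sample gradient clipping under the two styles the paper analyzes (flat and layerwise), followed by Gaussian noise addition. This is a minimal NumPy illustration, not the paper's implementation (see the linked opacus_global_clipping repository for that); the function names `clip_flat`, `clip_layerwise`, and `dp_sgd_step`, and all hyperparameter values, are illustrative.
```python
# Minimal NumPy sketch of one DP-SGD step: per-sample clipping (flat or
# layerwise style) followed by Gaussian noise addition. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def clip_flat(per_sample_grads, clip_norm):
    """Flat clipping: rescale each sample's full gradient so that its
    global L2 norm (across all layers) is at most clip_norm."""
    clipped = []
    for grads in per_sample_grads:                       # one entry per sample
        total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
        scale = min(1.0, clip_norm / (total_norm + 1e-12))
        clipped.append([g * scale for g in grads])
    return clipped

def clip_layerwise(per_sample_grads, clip_norms):
    """Layerwise clipping: each layer's gradient is clipped to its own norm."""
    clipped = []
    for grads in per_sample_grads:
        sample = []
        for g, c in zip(grads, clip_norms):
            norm = np.linalg.norm(g)
            sample.append(g * min(1.0, c / (norm + 1e-12)))
        clipped.append(sample)
    return clipped

def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr):
    """Sum the clipped per-sample gradients, add Gaussian noise scaled by
    noise_multiplier * clip_norm, and take one gradient step."""
    clipped = clip_flat(per_sample_grads, clip_norm)
    n = len(clipped)
    new_params = []
    for layer_idx, p in enumerate(params):
        summed = sum(sample[layer_idx] for sample in clipped)
        noise = noise_multiplier * clip_norm * rng.standard_normal(p.shape)
        new_params.append(p - lr * (summed + noise) / n)
    return new_params

# Toy usage: a 2-layer "model" with random per-sample gradients for 4 samples.
params = [rng.standard_normal((3, 3)), rng.standard_normal(3)]
per_sample_grads = [[rng.standard_normal((3, 3)), rng.standard_normal(3)]
                    for _ in range(4)]
params = dp_sgd_step(params, per_sample_grads, clip_norm=1.0,
                     noise_multiplier=1.1, lr=0.1)
```
The paper's observation maps directly onto this sketch: the `noise` term determines the privacy guarantee, while the clipping step (`clip_flat` or `clip_layerwise`, and the size of `clip_norm`) is what shapes convergence and calibration.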
Related papers
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - Bridging Precision and Confidence: A Train-Time Loss for Calibrating
Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error in both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z) - Differentially Private Learning with Per-Sample Adaptive Clipping [8.401653565794353]
We propose a Differentially Private Per-Sample Adaptive Clipping (DP-PSAC) algorithm based on a non-monotonic adaptive weight function.
We show that DP-PSAC outperforms or matches the state-of-the-art methods on multiple main-stream vision and language tasks.
arXiv Detail & Related papers (2022-12-01T07:26:49Z) - Adap DP-FL: Differentially Private Federated Learning with Adaptive
Noise [30.005017338416327]
Federated learning seeks to address the issue of isolated data islands by making clients disclose only their local training models.
Recently, differential privacy has been applied to federated learning to protect data privacy, but the added noise may substantially degrade learning performance.
We propose a differentially private scheme for federated learning with adaptive noise (Adap DP-FL).
arXiv Detail & Related papers (2022-11-29T03:20:40Z) - Fine-Tuning with Differential Privacy Necessitates an Additional
Hyperparameter Search [38.83524780461911]
We show how carefully selecting the layers being fine-tuned in the pretrained neural network allows us to establish new state-of-the-art tradeoffs between privacy and accuracy.
We achieve 77.9% accuracy for $(\varepsilon, \delta) = (2, 10^{-5})$ on CIFAR-100 for a model pretrained on ImageNet.
arXiv Detail & Related papers (2022-10-05T11:32:49Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Differentially Private Learning Needs Hidden State (Or Much Faster
Convergence) [9.429448411561541]
Our converging privacy analysis shows that differentially private learning, with a tight bound, needs hidden state privacy analysis or fast convergence.
arXiv Detail & Related papers (2022-03-10T13:31:08Z) - Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for
Private Learning [74.73901662374921]
A differentially private model degrades the utility drastically when the model comprises a large number of trainable parameters.
We propose an algorithm, Gradient Embedding Perturbation (GEP), towards training differentially private deep models with decent accuracy.
arXiv Detail & Related papers (2021-02-25T04:29:58Z) - Understanding Gradient Clipping in Private SGD: A Geometric Perspective [68.61254575987013]
Deep learning models are increasingly popular in many machine learning applications where the training data may contain sensitive information.
Many learning systems now incorporate differential privacy by training their models with (differentially) private SGD.
A key step in each private SGD update is gradient clipping that shrinks the gradient of an individual example whenever its L2 norm exceeds some threshold.
arXiv Detail & Related papers (2020-06-27T19:08:12Z) - Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z)