DPIS: An Enhanced Mechanism for Differentially Private SGD with
Importance Sampling
- URL: http://arxiv.org/abs/2210.09634v2
- Date: Wed, 19 Oct 2022 02:11:08 GMT
- Title: DPIS: An Enhanced Mechanism for Differentially Private SGD with
Importance Sampling
- Authors: Jianxin Wei, Ergute Bao, Xiaokui Xiao, Yin Yang
- Abstract summary: Differential privacy (DP) has become a well-accepted standard for privacy protection, and deep neural networks (DNN) have been immensely successful in machine learning.
A classic mechanism for combining the two is DP-SGD, a differentially private version of the stochastic gradient descent (SGD) optimizer commonly used for DNN training.
We propose DPIS, a novel mechanism for differentially private SGD training that can be used as a drop-in replacement for the core optimizer of DP-SGD.
- Score: 19.59757201902467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Nowadays, differential privacy (DP) has become a well-accepted standard for
privacy protection, and deep neural networks (DNN) have been immensely
successful in machine learning. The combination of these two techniques, i.e.,
deep learning with differential privacy, promises the privacy-preserving
release of high-utility models trained with sensitive data such as medical
records. A classic mechanism for this purpose is DP-SGD, which is a
differentially private version of the stochastic gradient descent (SGD)
optimizer commonly used for DNN training. Subsequent approaches have improved
various aspects of the model training process, including noise decay schedule,
model architecture, feature engineering, and hyperparameter tuning. However,
the core mechanism for enforcing DP in the SGD optimizer remains unchanged ever
since the original DP-SGD algorithm, which has increasingly become a
fundamental barrier limiting the performance of DP-compliant machine learning
solutions.
Motivated by this, we propose DPIS, a novel mechanism for differentially
private SGD training that can be used as a drop-in replacement of the core
optimizer of DP-SGD, with consistent and significant accuracy gains over the
latter. The main idea is to employ importance sampling (IS) in each SGD
iteration for mini-batch selection, which reduces both sampling variance and
the amount of random noise injected into the gradients that is required to
satisfy DP. Integrating IS into the complex mathematical machinery of DP-SGD is
highly non-trivial. DPIS addresses the challenge through novel mechanism
designs, fine-grained privacy analysis, efficiency enhancements, and an
adaptive gradient clipping optimization. Extensive experiments on four
benchmark datasets, namely MNIST, FMNIST, CIFAR-10 and IMDb, demonstrate the
superior effectiveness of DPIS over existing solutions for deep learning with
differential privacy.
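To make the core idea concrete, the following is a minimal NumPy sketch of one DP-SGD-style step that selects its mini-batch by importance sampling on per-example gradient norms. The sampling rule, the 1/(n p_i) reweighting, and the noise calibration are simplifying assumptions for illustration; they do not reproduce DPIS's actual mechanism or its fine-grained privacy analysis.

```python
# Illustrative DP-SGD step with importance sampling (IS) for batch selection.
import numpy as np

rng = np.random.default_rng(0)

def per_example_grads(w, X, y):
    """Per-example logistic-regression gradients, shape (n, d)."""
    p = 1.0 / (1.0 + np.exp(-X @ w))       # predicted probabilities
    return (p - y)[:, None] * X            # log-loss gradient per example

def is_dp_sgd_step(w, X, y, m=32, clip=1.0, sigma=1.0, lr=0.1):
    g = per_example_grads(w, X, y)
    norms = np.linalg.norm(g, axis=1)
    # IS: draw examples with probability proportional to gradient norm,
    # concentrating the batch on informative examples (lower variance).
    p = norms / norms.sum()
    idx = rng.choice(len(X), size=m, replace=True, p=p)
    # Clip sampled gradients to bound per-example sensitivity, then reweight
    # by 1/(n * p_i) so the mean remains an unbiased full-batch estimate.
    scale = np.minimum(1.0, clip / np.maximum(norms[idx], 1e-12))
    weighted = g[idx] * scale[:, None] / (len(X) * p[idx])[:, None]
    est = weighted.mean(axis=0)
    # Gaussian noise as in DP-SGD; calibrating sigma under IS is exactly
    # where a fine-grained privacy analysis is needed.
    return w - lr * (est + rng.normal(0.0, sigma * clip / m, size=w.shape))
```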
Related papers
- DPAdapter: Improving Differentially Private Deep Learning through Noise
Tolerance Pre-training [33.935692004427175]
We introduce DPAdapter, a pioneering technique designed to improve the performance of DPML algorithms by enhancing parameter robustness.
Our experiments show that DPAdapter vastly enhances state-of-the-art DPML algorithms, increasing average accuracy from 72.92% to 77.09%.
arXiv Detail & Related papers (2024-03-05T00:58:34Z)
- Improving the Privacy and Practicality of Objective Perturbation for
Differentially Private Linear Learners [21.162924003105484]
DP-SGD carries non-trivial privacy overhead and a computational cost that can be excessive for simple models such as linear and logistic regression.
This paper revamps the objective perturbation mechanism with tighter privacy analyses and new computational tools.
arXiv Detail & Related papers (2023-12-31T20:32:30Z)
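As a point of contrast with gradient perturbation, here is a hypothetical NumPy sketch of the objective perturbation pattern for regularized logistic regression: a random linear term is added to the objective once, instead of noising every gradient step. The noise scale b_scale and the optimizer settings are illustrative placeholders, not the paper's calibrated construction.

```python
# Objective perturbation sketch: perturb the training objective once with a
# random linear term (b . w) / n, then optimize it with ordinary, non-noisy
# gradient descent. `b_scale`, `lam`, `lr`, and `steps` are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def objective_perturbation_fit(X, y, lam=0.1, b_scale=1.0, lr=0.5, steps=500):
    n, d = X.shape
    b = rng.normal(0.0, b_scale, size=d)    # drawn once, before training
    w = np.zeros(d)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))    # sigmoid predictions
        # Gradient of: mean log-loss + (lam/2)||w||^2 + (b . w)/n
        grad = X.T @ (p - y) / n + lam * w + b / n
        w -= lr * grad                      # plain gradient descent
    return w
```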
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for the proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
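The error-feedback idea can be sketched generically: remember the part of the update that clipping removed and add it back at the next step. The fragment below (a minimal sketch, with illustrative names) applies EF to the aggregate clipped gradient; the paper's algorithm and its Rényi-DP accounting differ in the details.

```python
# Generic error-feedback (EF) pattern around clipped, noised gradient steps.
import numpy as np

rng = np.random.default_rng(0)

def clip_vec(v, c):
    """Scale v down so that its L2 norm is at most c."""
    n = np.linalg.norm(v)
    return v * min(1.0, c / max(n, 1e-12))

def ef_dpsgd_step(w, grad, err, clip=1.0, sigma=1.0, lr=0.1):
    corrected = grad + err                  # add back previously clipped mass
    update = clip_vec(corrected, clip)      # bounded-sensitivity update
    new_err = corrected - update            # remember what clipping removed
    noisy = update + rng.normal(0.0, sigma * clip, size=w.shape)
    return w - lr * noisy, new_err
```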
- Sparsity-Preserving Differentially Private Training of Large Embedding
Models [67.29926605156788]
DP-SGD is a training algorithm that combines differential privacy with stochastic gradient descent.
Applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency.
We present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models.
arXiv Detail & Related papers (2023-11-14T17:59:51Z)
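To illustrate the sparsity concern, one plausible pattern (purely a sketch, not the paper's DP-FEST/DP-AdaFEST algorithms) is to clip and noise only a selected set of embedding rows so the noisy gradient stays sparse; note that selecting rows by gradient norm, as done below for simplicity, would itself have to be privatized in a real mechanism.

```python
# Sketch: keep a noisy embedding-table gradient sparse by noising only k
# selected rows instead of densifying the whole table. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def sparse_noisy_grad(embed_grad, k=64, clip=1.0, sigma=1.0):
    """embed_grad: (vocab, dim) gradient that is nonzero on few rows."""
    row_norms = np.linalg.norm(embed_grad, axis=1)
    top = np.argsort(row_norms)[-k:]        # NOT private: placeholder rule
    scale = np.minimum(1.0, clip / np.maximum(row_norms[top], 1e-12))
    out = np.zeros_like(embed_grad)
    out[top] = embed_grad[top] * scale[:, None]
    out[top] += rng.normal(0.0, sigma * clip,
                           size=(len(top), embed_grad.shape[1]))
    return out                              # still at most k nonzero rows
```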
- Automatic Clipping: Differentially Private Deep Learning Made Easier and
Stronger [39.93710312222771]
Per-example clipping is a key algorithmic step that enables practical differentially private (DP) training for deep learning models.
We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune the clipping threshold R for any DP optimizer.
arXiv Detail & Related papers (2022-06-14T19:49:44Z)
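A minimal sketch of the automatic clipping idea, with illustrative shapes and names: each per-example gradient is normalized by its norm plus a small stability constant, so every example contributes roughly unit norm and no threshold R needs tuning (R effectively folds into the learning rate).

```python
# Automatic clipping sketch: replace min(1, R/||g_i||) clipping with
# normalization g_i / (||g_i|| + gamma) before summing and noising.
import numpy as np

rng = np.random.default_rng(0)

def auto_clip_aggregate(per_example_grads, gamma=0.01, sigma=1.0):
    """per_example_grads: (batch, dim) -> noisy averaged update."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    normalized = per_example_grads / (norms + gamma)   # ~unit norm each
    summed = normalized.sum(axis=0)                    # sensitivity ~ 1
    noisy = summed + rng.normal(0.0, sigma, size=summed.shape)
    return noisy / len(per_example_grads)
```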
- Large Scale Transfer Learning for Differentially Private Image
Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example-level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this guarantee is appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than that of non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
- Dynamic Differential-Privacy Preserving SGD [19.273542515320372]
Differentially Private Stochastic Gradient Descent (DP-SGD) prevents training-data privacy breaches by adding noise to the clipped gradient during SGD training.
Applying the same clipping operation and additive noise at every training step results in unstable updates and even a ramp-up period.
We propose dynamic DP-SGD, which incurs a lower privacy cost than standard DP-SGD during updates until both reach the same target privacy budget.
arXiv Detail & Related papers (2021-10-30T04:45:11Z)
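One illustrative way to realize such dynamics (a placeholder schedule, not the paper's construction) is to decay the noise multiplier and the clipping threshold as training progresses:

```python
# Placeholder dynamic schedule for DP-SGD: shrink the noise multiplier and
# clipping threshold over steps; real schedules must be chosen so the total
# privacy cost still meets the target budget.
def dynamic_schedule(step, sigma0=2.0, clip0=1.0, sigma_min=0.7, decay=0.999):
    sigma = max(sigma_min, sigma0 * decay ** step)   # decaying noise scale
    clip = clip0 * decay ** step                     # decaying clip bound
    return sigma, clip
```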
- Large Language Models Can Be Strong Differentially Private Learners [70.0317718115406]
Differentially Private (DP) learning has seen limited success for building large deep learning models of text.
We show that this performance drop can be mitigated with the use of large pretrained models.
We propose a memory-saving technique that allows clipping in DP-SGD to run without instantiating per-example gradients.
arXiv Detail & Related papers (2021-10-12T01:45:27Z)
- NeuralDP Differentially private neural networks by design [61.675604648670095]
We propose NeuralDP, a technique for privatising activations of some layer within a neural network.
We experimentally demonstrate on two datasets that our method offers substantially improved privacy-utility trade-offs compared to DP-SGD.
arXiv Detail & Related papers (2021-07-30T12:40:19Z)
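The general pattern of privatizing a layer's activations can be sketched as per-example clip-and-noise applied at that layer; the fragment below is illustrative and does not reproduce NeuralDP's actual mechanism or privacy accounting.

```python
# Sketch: clip each example's activation vector at some layer and add
# Gaussian noise before it flows to the rest of the network. Illustrative.
import numpy as np

rng = np.random.default_rng(0)

def private_activations(acts, clip=1.0, sigma=1.0):
    """acts: (batch, features) activations of one layer."""
    norms = np.linalg.norm(acts, axis=1, keepdims=True)
    clipped = acts * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    return clipped + rng.normal(0.0, sigma * clip, size=acts.shape)
```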
- An Efficient DP-SGD Mechanism for Large Scale NLP Models [28.180412581994485]
Data used to train Natural Language Understanding (NLU) models may contain private information such as addresses or phone numbers.
It is desirable that underlying models do not expose private information contained in the training data.
Differentially Private Stochastic Gradient Descent (DP-SGD) has been proposed as a mechanism to build privacy-preserving models.
arXiv Detail & Related papers (2021-07-14T15:23:27Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides statistical protection against such attacks, at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)