Rethinking Improved Privacy-Utility Trade-off with Pre-existing Knowledge for DP Training
- URL: http://arxiv.org/abs/2409.03344v1
- Date: Thu, 5 Sep 2024 08:40:54 GMT
- Title: Rethinking Improved Privacy-Utility Trade-off with Pre-existing Knowledge for DP Training
- Authors: Yu Zheng, Wenchao Zhang, Yonggang Zhang, Wei Song, Kai Zhou, Bo Han
- Abstract summary: We propose a generic differential privacy framework with heterogeneous noise (DP-Hero)
Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, where the noise injected into gradient updates is heterogeneous and guided by prior-established model parameters.
We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works.
- Score: 31.559864332056648
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differential privacy (DP) provides a provable framework for protecting individuals by customizing a random mechanism over a privacy-sensitive dataset. Deep learning models pose privacy risks upon exposure, since a trained model can unintentionally record membership-level information about its training data. Differentially private stochastic gradient descent (DP-SGD) has been proposed to safeguard training individuals by adding random Gaussian noise to gradient updates during backpropagation. Researchers have identified that DP-SGD typically causes utility loss because the injected homogeneous noise alters the gradient updates calculated at each iteration: all elements of the gradient are contaminated regardless of their importance for updating the model parameters. In this work, we argue that the utility loss mainly results from the homogeneity of the injected noise. Consequently, we propose a generic differential privacy framework with heterogeneous noise (DP-Hero), defining a heterogeneous random mechanism to abstract its properties. The insight of DP-Hero is to leverage the knowledge encoded in a previously trained model to guide the subsequent allocation of noise heterogeneity, thereby shaping the statistical perturbation to achieve enhanced utility. Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, where the noise injected into gradients is heterogeneous and guided by prior-established model parameters. We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works. More broadly, we shed light on improving the privacy-utility trade-off by learning noise guidance from the pre-existing knowledge encoded in a previously trained model, offering a different perspective on understanding utility-improved DP training.
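To make the heterogeneous-noise idea concrete, below is a minimal sketch of a DP-SGD step in the spirit of DP-Hero. The importance proxy (magnitudes of a previously trained model's parameters), the normalization, and the omitted privacy accounting are illustrative assumptions, not the paper's exact construction.

```python
# A minimal sketch of a heterogeneous-noise DP-SGD step in the spirit of
# DP-Hero. The importance proxy (|prior parameter| magnitudes), the
# normalization, and the omitted privacy accounting are assumptions for
# illustration, not the paper's exact construction.
import numpy as np

def heterogeneous_dp_sgd_step(params, per_sample_grads, prior_params,
                              clip_norm=1.0, noise_multiplier=1.0, lr=0.1):
    # 1) Per-example clipping, exactly as in standard DP-SGD.
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    grad_sum = np.sum(clipped, axis=0)

    # 2) Heterogeneous noise: coordinates the prior model marks as important
    #    receive less noise, the rest receive more, with the average variance
    #    kept equal to the homogeneous baseline.
    importance = np.abs(prior_params) + 1e-6
    scale = importance.mean() / importance      # small where importance is high
    scale *= scale.size / scale.sum()           # renormalize: mean(scale) == 1
    sigma = noise_multiplier * clip_norm
    noise = np.random.normal(0.0, sigma, size=grad_sum.shape) * np.sqrt(scale)

    noisy_grad = (grad_sum + noise) / len(per_sample_grads)
    return params - lr * noisy_grad
```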
Related papers
- DOPPLER: Differentially Private Optimizers with Low-pass Filter for Privacy Noise Reduction [47.65999101635902]
Differentially private (DP) training prevents the leakage of sensitive information in the collected training data from trained machine learning models.
We develop a new component, called DOPPLER, which works by effectively amplifying the gradient while suppressing the DP noise within the frequency domain.
Our experiments show that the proposed DP optimizers with a low-pass filter outperform their counterparts without the filter by 3%-10% in test accuracy.
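As a rough illustration of the low-pass idea (not DOPPLER's actual filter design), a first-order filter such as an exponential moving average attenuates per-step white DP noise while passing the slowly varying gradient signal:

```python
# Illustration only: a first-order low-pass filter (an exponential moving
# average) over the noisy gradient sequence. The true gradient varies slowly
# across iterations while the injected DP noise is roughly white, so the
# filter attenuates noise. This is NOT DOPPLER's actual filter design.
import numpy as np

class LowPassDPGradient:
    def __init__(self, beta=0.9):
        self.beta = beta      # larger beta = lower cutoff = stronger smoothing
        self.state = None     # filtered gradient estimate

    def filter(self, noisy_grad):
        if self.state is None:
            self.state = np.zeros_like(noisy_grad)
        # y_t = beta * y_{t-1} + (1 - beta) * x_t
        self.state = self.beta * self.state + (1.0 - self.beta) * noisy_grad
        return self.state
```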
arXiv Detail & Related papers (2024-08-24T04:27:07Z) - Weights Shuffling for Improving DPSGD in Transformer-based Models [7.356743536182233]
This work introduces an innovative shuffling mechanism in Differentially Private Stochastic Gradient Descent (DPSGD) to enhance the utility of large models under the same privacy guarantee as the unshuffled case.
We show that permutation indeed improves the privacy guarantee of DPSGD in theory, but tracking the exact privacy loss on the shuffled model is particularly challenging.
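For intuition only, the sketch below illustrates the kind of weight-permutation symmetry a shuffling mechanism can exploit: permuting the hidden units of a two-layer block (rows of W1, entries of b1, and the matching columns of W2) leaves the network function unchanged. How the cited paper combines this with DPSGD's noise and privacy accounting is not reproduced here.

```python
# Intuition only (hypothetical, not the paper's mechanism): permuting the
# hidden units of a two-layer block is a function-preserving symmetry, the
# kind of structure a weight-shuffling mechanism can exploit.
import numpy as np

def permute_hidden_units(W1, b1, W2, rng):
    perm = rng.permutation(W1.shape[0])          # random order of hidden units
    return W1[perm], b1[perm], W2[:, perm], perm

# sanity check: the network output is identical before and after permutation
rng = np.random.default_rng(0)
W1, b1, W2 = rng.normal(size=(8, 4)), rng.normal(size=8), rng.normal(size=(3, 8))
x = rng.normal(size=4)
y = W2 @ np.maximum(W1 @ x + b1, 0.0)            # ReLU MLP block
P1, pb1, P2, _ = permute_hidden_units(W1, b1, W2, rng)
assert np.allclose(y, P2 @ np.maximum(P1 @ x + pb1, 0.0))
```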
arXiv Detail & Related papers (2024-07-22T06:41:59Z) - Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
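A hedged toy sketch of the error-feedback idea: the portion of the gradient removed by clipping is remembered and fed back at the next step, so the clipping-induced bias does not accumulate. The paper's actual algorithm and its Rényi-DP accounting differ from this simplified version.

```python
# Toy version of the error-feedback idea (assumed simplification of the cited
# algorithm): the residual removed by clipping is stored and added back to
# the next gradient before clipping, so clipping bias does not accumulate.
import numpy as np

def clip(v, c):
    return v * min(1.0, c / (np.linalg.norm(v) + 1e-12))

def ef_dpsgd(grads_per_step, dim, clip_norm=1.0, sigma=1.0, lr=0.1):
    w = np.zeros(dim)
    e = np.zeros(dim)                         # error-feedback buffer
    for g in grads_per_step:                  # g: averaged minibatch gradient
        corrected = g + e                     # feed back last step's residual
        clipped = clip(corrected, clip_norm)
        e = corrected - clipped               # remember what clipping removed
        noisy = clipped + np.random.normal(0.0, sigma * clip_norm, size=dim)
        w -= lr * noisy
    return w

w = ef_dpsgd([np.random.normal(size=10) for _ in range(50)], dim=10)
```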
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - Amplitude-Varying Perturbation for Balancing Privacy and Utility in Federated Learning [86.08285033925597]
This paper presents a new DP perturbation mechanism with a time-varying noise amplitude to protect the privacy of federated learning.
We derive an online refinement of the series to prevent FL from premature convergence resulting from excessive perturbation noise.
The contribution of the new DP mechanism to the convergence and accuracy of privacy-preserving FL is corroborated, compared to the state-of-the-art Gaussian noise mechanism with a persistent noise amplitude.
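For illustration, a geometrically decaying per-round noise amplitude is one simple instance of a time-varying schedule; the decay ratio below is an assumption, and the paper's series-based design, online refinement, and privacy accounting over the schedule are not reproduced.

```python
# Illustration of a time-varying noise amplitude: a geometric decay over
# communication rounds. The decay ratio is an assumption; the paper's
# series design, online refinement, and accounting are not reproduced.
import numpy as np

def noise_schedule(sigma0=2.0, decay=0.95, rounds=100):
    return [sigma0 * decay ** t for t in range(rounds)]

def perturb_update(update, sigma):
    return update + np.random.normal(0.0, sigma, size=update.shape)

sigmas = noise_schedule()
noisy = perturb_update(np.ones(5), sigmas[10])   # round-10 amplitude
```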
arXiv Detail & Related papers (2023-03-07T22:52:40Z) - A Differentially Private Framework for Deep Learning with Convexified Loss Functions [4.059849656394191]
Differential privacy (DP) has been applied in deep learning for preserving privacy of the underlying training sets.
Existing DP practice falls into three categories - objective perturbation, gradient perturbation and output perturbation.
We propose a novel output perturbation framework by injecting DP noise into a randomly sampled neuron.
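A hypothetical sketch of the "noise into a randomly sampled neuron" idea as output perturbation: after training, one neuron's weights are chosen at random and perturbed before the model is released. The noise scale below is a placeholder, not the paper's derived sensitivity calibration.

```python
# Hypothetical sketch of output perturbation on a single randomly sampled
# neuron: one unit's weights (a column, assuming an x @ W layout) are
# perturbed before release. The noise scale is a placeholder, not the
# paper's calibrated sensitivity.
import numpy as np

def perturb_random_neuron(weight_matrix, sigma=0.5, seed=None):
    rng = np.random.default_rng(seed)
    W = weight_matrix.copy()
    j = rng.integers(W.shape[1])                 # sample one neuron (column)
    W[:, j] += rng.normal(0.0, sigma, size=W.shape[0])
    return W, j
```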
arXiv Detail & Related papers (2022-04-03T11:10:05Z) - Differentially Private Generative Adversarial Networks with Model Inversion [6.651002556438805]
To protect sensitive data when training a Generative Adversarial Network (GAN), the standard approach is to use a differentially private (DP) gradient descent method.
We propose Differentially Private Model Inversion (DPMI) method where the private data is first mapped to the latent space via a public generator.
Our approach outperforms the standard DP-GAN method based on Inception Score, Fréchet Inception Distance, and classification accuracy under the same privacy guarantee.
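A toy, hedged sketch of the model-inversion step only: a private sample is mapped into the latent space of a public generator by minimizing the reconstruction error over the latent code. A linear generator stands in for a real GAN generator, and the differentially private training that DPMI performs on top of these latent codes is omitted.

```python
# Toy model inversion: map a private sample x to a latent code z by
# minimizing ||G(z) - x||^2 for a PUBLIC generator. A linear G(z) = A @ z
# stands in for a real generator; the DP training stage is omitted.
import numpy as np

def invert_to_latent(x, A, steps=500, lr=0.01):
    """Gradient descent on z for the toy generator G(z) = A @ z."""
    z = np.zeros(A.shape[1])
    for _ in range(steps):
        residual = A @ z - x                 # G(z) - x
        grad = 2.0 * A.T @ residual          # gradient of ||G(z) - x||^2
        z -= lr * grad
    return z

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 4))                 # public "generator"
x_private = rng.normal(size=20)              # a private sample
z = invert_to_latent(x_private, A)
```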
arXiv Detail & Related papers (2022-01-10T02:26:26Z) - Dynamic Differential-Privacy Preserving SGD [19.273542515320372]
Differentially Private Stochastic Gradient Descent (DP-SGD) prevents training-data privacy breaches by adding noise to the clipped gradient during SGD training.
Using the same clipping operation and additive noise across all training steps results in unstable updates and even a ramp-up period.
We propose dynamic DP-SGD, which incurs a lower privacy cost than standard DP-SGD during updates until both reach the same target privacy budget.
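As a rough sketch of what "dynamic" can mean here (the assumed linear decay below is not the paper's schedule, and total-budget accounting is not reproduced), the clipping threshold and noise multiplier are adjusted over training instead of being held fixed:

```python
# A rough sketch of a "dynamic" schedule (assumed linear decay; the paper's
# own schedules and total-budget accounting are not reproduced): clipping
# threshold and noise multiplier both shrink as training progresses.
def dynamic_schedule(step, total_steps, clip0=1.0, sigma0=1.5,
                     clip_decay=0.5, sigma_decay=0.5):
    frac = step / max(1, total_steps)
    clip_t = clip0 * (1.0 - clip_decay * frac)     # shrinking clipping norm
    sigma_t = sigma0 * (1.0 - sigma_decay * frac)  # shrinking noise multiplier
    return clip_t, sigma_t
```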
arXiv Detail & Related papers (2021-10-30T04:45:11Z) - Smoothed Differential Privacy [55.415581832037084]
Differential privacy (DP) is a widely-accepted and widely-applied notion of privacy based on worst-case analysis.
In this paper, we propose a natural extension of DP following the worst average-case idea behind the celebrated smoothed analysis.
We prove that any discrete mechanism with sampling procedures is more private than what DP predicts, while many continuous mechanisms with sampling procedures are still non-private under smoothed DP.
arXiv Detail & Related papers (2021-07-04T06:55:45Z) - RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial networks (GANs) have attracted increasing attention recently owing to their impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied to sensitive or private training examples, such as medical or financial records, individuals' sensitive and private information may still be divulged.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
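For context on the technique named in this entry's title (the summary above does not spell it out), the sketch below shows the standard 1-D Laplacian smoothing operator applied to a flattened noisy gradient via FFT; how the cited paper integrates it with DP federated learning is simplified away.

```python
# Standard 1-D Laplacian smoothing of a flattened noisy (DP) gradient,
# applied via FFT since (I - sigma * Laplacian) is circulant. How the cited
# paper integrates this with DP federated learning is simplified away here.
import numpy as np

def laplacian_smooth(noisy_grad, sigma=1.0):
    v = noisy_grad.ravel()
    k = np.arange(v.size)
    # eigenvalues of (I - sigma * circulant second-difference operator)
    eig = 1.0 + 2.0 * sigma * (1.0 - np.cos(2.0 * np.pi * k / v.size))
    smoothed = np.real(np.fft.ifft(np.fft.fft(v) / eig))
    return smoothed.reshape(noisy_grad.shape)
```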
arXiv Detail & Related papers (2020-05-01T04:28:38Z)