DP-SGD vs PATE: Which Has Less Disparate Impact on GANs?
- URL: http://arxiv.org/abs/2111.13617v1
- Date: Fri, 26 Nov 2021 17:25:46 GMT
- Title: DP-SGD vs PATE: Which Has Less Disparate Impact on GANs?
- Authors: Georgi Ganev
- Abstract summary: We compare GANs trained with the two best-known DP frameworks for deep learning, DP-SGD and PATE, in different data imbalance settings.
Our experiments consistently show that for PATE, unlike DP-SGD, the privacy-utility trade-off is not monotonically decreasing.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) are among the most popular approaches
to generate synthetic data, especially images, for data sharing purposes. Given
the vital importance of preserving the privacy of the individual data points in
the original data, GANs are trained utilizing frameworks with robust privacy
guarantees such as Differential Privacy (DP). However, these approaches remain
largely understudied beyond single performance metrics when presented with
imbalanced datasets. To this end, we systematically compare GANs trained with
the two best-known DP frameworks for deep learning, DP-SGD and PATE, in
different data imbalance settings from two perspectives -- the size of the
classes in the generated synthetic data and their classification performance.
Our analyses show that applying PATE, similarly to DP-SGD, has a disparate
effect on the under- and over-represented classes, but to a much milder degree,
making it more robust. Interestingly, our experiments consistently show that
for PATE, unlike DP-SGD, the privacy-utility trade-off is not monotonically
decreasing but is much smoother and inverted U-shaped, meaning that adding a
small degree of privacy actually helps generalization. However, we have also
identified some settings (e.g., large imbalance) where PATE-GAN completely
fails to learn some subparts of the training data.
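The two frameworks compared above differ in where the privacy noise enters: DP-SGD clips each example's gradient and adds Gaussian noise to the averaged update (in a GAN, typically on the discriminator, the only component that touches real data), while PATE trains several teacher models on disjoint data partitions and aggregates their votes with Laplace noise before labeling samples for a student. The sketch below is a minimal illustration of these two primitives, not the paper's implementation; the names and hyperparameters (clip_norm, noise_multiplier, lap_scale) and the toy shapes are assumptions for illustration only.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD update direction: clip each per-example gradient to clip_norm,
    average the clipped gradients, and add Gaussian noise scaled to the clip bound."""
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    return mean_grad + rng.normal(0.0, sigma, size=mean_grad.shape)

def pate_aggregate(teacher_votes, num_classes, lap_scale=1.0, rng=None):
    """PATE noisy-max aggregation: histogram the teachers' predicted labels for
    one query, add Laplace noise to each count, and return the arg-max label."""
    rng = rng if rng is not None else np.random.default_rng(0)
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    counts += rng.laplace(0.0, lap_scale, size=num_classes)
    return int(np.argmax(counts))

# Toy usage: 8 per-example gradients in R^4 and 25 teachers voting over 3 classes.
grads = np.random.default_rng(1).normal(size=(8, 4))
print("noisy DP-SGD gradient:", dp_sgd_step(grads))
votes = np.random.default_rng(2).integers(0, 3, size=25)
print("PATE noisy-max label:", pate_aggregate(votes, num_classes=3))
```

Because PATE's noise acts on aggregated teacher votes rather than on every gradient step, under-represented classes can disappear entirely from some teachers' partitions, which is consistent with the failure mode on large imbalance noted in the abstract.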
Related papers
- Incentives in Private Collaborative Machine Learning [56.84263918489519]
Collaborative machine learning involves training models on data from multiple parties.
We introduce differential privacy (DP) as an incentive.
We empirically demonstrate the effectiveness and practicality of our approach on synthetic and real-world datasets.
arXiv Detail & Related papers (2024-04-02T06:28:22Z) - How Private are DP-SGD Implementations? [61.19794019914523]
We show that there can be a substantial gap between the privacy analyses under the two types of batch sampling.
arXiv Detail & Related papers (2024-03-26T13:02:43Z) - Pre-training Differentially Private Models with Limited Public Data [54.943023722114134]
Differential privacy (DP) is a prominent method for quantifying the degree of protection provided to a model.
However, DP cannot protect a substantial portion of the data used during the initial pre-training stage.
We develop a novel DP continual pre-training strategy using only 10% of public data.
Our strategy can achieve DP accuracy of 41.5% on ImageNet-21k, as well as non-DP accuracy of 55.7% and 60.0% on the downstream tasks Places365 and iNaturalist-2021.
arXiv Detail & Related papers (2024-02-28T23:26:27Z) - Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD [44.11069254181353]
We show that DP-SGD leaks significantly less privacy for many datapoints when models are trained on common benchmarks.
This implies privacy attacks will necessarily fail against many datapoints if the adversary does not have sufficient control over the possible training datasets.
arXiv Detail & Related papers (2023-07-01T11:51:56Z) - On the Efficacy of Differentially Private Few-shot Image Classification [40.49270725252068]
In many applications including personalization and federated learning, it is crucial to perform well in the few-shot setting.
We show how the accuracy and vulnerability to attack of few-shot DP image classification models are affected as the number of shots per class, privacy level, model architecture, downstream dataset, and subset of learnable parameters in the model vary.
arXiv Detail & Related papers (2023-02-02T16:16:25Z) - Private Ad Modeling with DP-SGD [58.670969449674395]
A well-known algorithm in privacy-preserving ML is differentially private stochastic gradient descent (DP-SGD).
In this work we apply DP-SGD to several ad modeling tasks including predicting click-through rates, conversion rates, and number of conversion events.
Our work is the first to empirically demonstrate that DP-SGD can provide both privacy and utility for ad modeling tasks.
arXiv Detail & Related papers (2022-11-21T22:51:16Z) - Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z) - Robin Hood and Matthew Effects -- Differential Privacy Has Disparate Impact on Synthetic Data [3.2345600015792564]
We analyze the impact of Differential Privacy on generative models.
We show that DP results in opposite size distributions in the generated synthetic data.
We call for caution when analyzing or training a model on synthetic data.
arXiv Detail & Related papers (2021-09-23T15:14:52Z) - DTGAN: Differential Private Training for Tabular GANs [6.174448419090292]
We propose DTGAN, a novel conditional Wasserstein GAN that comes in two variants, DTGAN_G and DTGAN_D.
We empirically evaluate the theoretical privacy guarantees offered by DP against membership and attribute inference attacks.
Our results on 3 datasets show that the DP-SGD framework is superior to PATE and that a DP discriminator is preferable for training convergence.
arXiv Detail & Related papers (2021-07-06T10:28:05Z) - DP-SGD vs PATE: Which Has Less Disparate Impact on Model Accuracy? [1.3238373064156095]
We show that the application of differential privacy, specifically the DP-SGD algorithm, has a disparate impact on different sub-groups in the population.
We compare PATE, another mechanism for training deep learning models using differential privacy, with DP-SGD in terms of fairness.
arXiv Detail & Related papers (2021-06-22T20:37:12Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)