imdpGAN: Generating Private and Specific Data with Generative
Adversarial Networks
- URL: http://arxiv.org/abs/2009.13839v1
- Date: Tue, 29 Sep 2020 08:03:32 GMT
- Title: imdpGAN: Generating Private and Specific Data with Generative
Adversarial Networks
- Authors: Saurabh Gupta, Arun Balaji Buduru, Ponnurangam Kumaraguru
- Abstract summary: imdpGAN is an end-to-end framework that simultaneously achieves privacy protection and learns latent representations.
We show that imdpGAN preserves the privacy of individual data points and learns latent codes to control the specificity of the generated samples.
- Score: 19.377726080729293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Adversarial Networks (GANs) and their variants have shown promising results in generating synthetic data. However, GANs suffer from two issues: (i) learning happens around the training samples, and the model often ends up memorizing them, thereby compromising the privacy of individual samples; this becomes a major concern when GANs are applied to training data that includes personally identifiable information; and (ii) the generated data is random, with no control over the specificity of the generated samples. To address these
issues, we propose imdpGAN - an information maximizing differentially private
Generative Adversarial Network. It is an end-to-end framework that
simultaneously achieves privacy protection and learns latent representations.
With experiments on the MNIST dataset, we show that imdpGAN preserves the privacy of individual data points and learns latent codes that control the specificity of the generated samples. We perform binary classification on digit pairs to show the utility-versus-privacy trade-off: classification accuracy decreases as privacy levels in the framework increase. We also show experimentally that the training process of imdpGAN is stable, but that it incurs a 10-fold increase in training time compared with other GAN frameworks. Finally, we extend the imdpGAN framework to the CelebA dataset to show how the privacy and the learned representations can be used to control the specificity of the output.
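The abstract names the framework's two ingredients: an information-maximizing (InfoGAN-style) latent code that makes specificity controllable, and differential privacy during training. Below is a minimal PyTorch sketch of how such pieces typically fit together in one training step; the network shapes, hyperparameters, and the exact DP mechanism (DP-SGD-style clipping and noising of discriminator gradients) are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

BATCH, LATENT, CODE, IMG = 64, 62, 10, 28 * 28   # assumed MNIST-like sizes

G = nn.Sequential(nn.Linear(LATENT + CODE, 256), nn.ReLU(),
                  nn.Linear(256, IMG), nn.Tanh())
D_body = nn.Sequential(nn.Linear(IMG, 256), nn.LeakyReLU(0.2))  # shared trunk
D_head = nn.Linear(256, 1)     # real/fake logit
Q_head = nn.Linear(256, CODE)  # predicts the categorical latent code (InfoGAN term)

d_params = list(D_body.parameters()) + list(D_head.parameters())
opt_d = torch.optim.Adam(d_params, lr=2e-4)
opt_g = torch.optim.Adam(list(G.parameters()) + list(Q_head.parameters()), lr=2e-4)

def private_step(opt, params, clip=1.0, sigma=0.5):
    # DP-SGD flavor: clip the gradient norm, then add Gaussian noise.
    # (True DP-SGD clips per example; batch-level clipping keeps the sketch short.)
    torch.nn.utils.clip_grad_norm_(params, clip)
    for p in params:
        if p.grad is not None:
            p.grad += torch.randn_like(p.grad) * sigma * clip / BATCH
    opt.step()

real = torch.rand(BATCH, IMG) * 2 - 1            # stand-in for a real MNIST batch
z = torch.randn(BATCH, LATENT)
c = F.one_hot(torch.randint(0, CODE, (BATCH,)), CODE).float()

# Discriminator step, privatized so individual training samples are not memorized.
fake = G(torch.cat([z, c], dim=1)).detach()
d_loss = F.binary_cross_entropy_with_logits(D_head(D_body(real)), torch.ones(BATCH, 1)) \
       + F.binary_cross_entropy_with_logits(D_head(D_body(fake)), torch.zeros(BATCH, 1))
opt_d.zero_grad(); d_loss.backward()
private_step(opt_d, d_params)

# Generator step with the mutual-information term: Q must recover c from G(z, c),
# which is what ties the latent code to the specificity of the output.
fake = G(torch.cat([z, c], dim=1))
feat = D_body(fake)
g_loss = F.binary_cross_entropy_with_logits(D_head(feat), torch.ones(BATCH, 1)) \
       + F.cross_entropy(Q_head(feat), c.argmax(dim=1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In a sketch like this, the noise multiplier governs the privacy level: raising it strengthens privacy but degrades samples, consistent with the utility-versus-privacy trade-off the abstract reports.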
Related papers
- LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification [67.92145284679623]
We propose a DP-based tutor that models the noised private distribution and controls sample generation with a low privacy cost.
We theoretically analyze our model's privacy protection and verify it empirically.
arXiv Detail & Related papers (2024-02-26T11:52:55Z)
- Federated Learning Empowered by Generative Content [55.576885852501775]
Federated learning (FL) enables leveraging distributed private data for model training in a privacy-preserving way.
We propose a novel FL framework termed FedGC, designed to mitigate data heterogeneity issues by diversifying private data with generative content.
We conduct a systematic empirical study on FedGC, covering diverse baselines, datasets, scenarios, and modalities.
arXiv Detail & Related papers (2023-12-10T07:38:56Z)
- Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks [72.51255282371805]
We prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets.
We find that this KL privacy bound is largely determined by the expected squared gradient norm relative to model parameters during training.
arXiv Detail & Related papers (2023-10-31T16:13:22Z)
- Local Differential Privacy in Graph Neural Networks: a Reconstruction Approach [17.000441871334683]
We propose a learning framework that can provide node privacy at the user level, while incurring low utility loss.
We focus on a decentralized notion of Differential Privacy, namely Local Differential Privacy.
We develop reconstruction methods to approximate features and labels from perturbed data (a generic randomized-response sketch follows this entry).
arXiv Detail & Related papers (2023-09-15T17:35:51Z)
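The entry above perturbs node features and labels under local DP on the user side and reconstructs approximations on the server. The sketch below shows the generic randomized-response version of that idea with a textbook de-biasing estimator; the epsilon value is a placeholder, and the paper's actual reconstruction methods for graph data are more involved.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 1.0                               # per-bit privacy budget (placeholder)
p = np.exp(eps) / (np.exp(eps) + 1)     # report the true bit with probability p

def perturb(bits):
    """Client side: each user flips each private bit independently (local DP)."""
    keep = rng.random(bits.shape) < p
    return np.where(keep, bits, 1 - bits)

def reconstruct_means(reported):
    """Server side: unbiased estimate of the true per-feature frequencies."""
    return (reported.mean(axis=0) - (1 - p)) / (2 * p - 1)

true_feats = (rng.random((10_000, 5)) < [0.1, 0.3, 0.5, 0.7, 0.9]).astype(int)
print(np.round(reconstruct_means(perturb(true_feats)), 3))  # ~ [0.1 0.3 0.5 0.7 0.9]
```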
- Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models struggle with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z)
- Improving Correlation Capture in Generating Imbalanced Data using Differentially Private Conditional GANs [2.2265840715792735]
We propose DP-CGANS, a differentially private conditional GAN framework consisting of data transformation, sampling, conditioning, and networks training to generate realistic and privacy-preserving data.
We extensively evaluate our model with state-of-the-art generative models on three public datasets and two real-world personal health datasets in terms of statistical similarity, machine learning performance, and privacy measurement.
arXiv Detail & Related papers (2022-06-28T06:47:27Z)
- On the Privacy Properties of GAN-generated Samples [12.765060550622422]
We show that GAN-generated samples inherently satisfy some (weak) privacy guarantees.
We also study the robustness of GAN-generated samples to membership inference attacks.
arXiv Detail & Related papers (2022-06-03T00:29:35Z)
- Generative Models with Information-Theoretic Protection Against Membership Inference Attacks [6.840474688871695]
Deep generative models, such as Generative Adversarial Networks (GANs), synthesize diverse high-fidelity data samples.
GANs may disclose private information from the data they are trained on, making them susceptible to adversarial attacks.
We propose an information theoretically motivated regularization term that prevents the generative model from overfitting to training data and encourages generalizability.
arXiv Detail & Related papers (2022-05-31T19:29:55Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied on sensitive or private training examples, such as medical or financial records, it is still probable to divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
- GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially Private Generators [74.16405337436213]
We propose Gradient-Sanitized Wasserstein Generative Adversarial Networks (GS-WGAN).
GS-WGAN allows releasing a sanitized form of sensitive data with rigorous privacy guarantees.
We find our approach consistently outperforms state-of-the-art approaches across multiple metrics (a minimal sketch of the gradient-sanitization step follows this entry).
arXiv Detail & Related papers (2020-06-15T10:01:01Z)
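GS-WGAN's key move is to leave the discriminator non-private and sanitize only the gradient that flows back into the generator: the gradient with respect to each generated sample is norm-clipped and noised before generator backpropagation, so the generator touches real data only through this sanitized signal. A minimal PyTorch sketch of that step via a backward hook; the clip bound and noise scale are placeholder values, not the paper's settings.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
CLIP, SIGMA = 1.0, 1.0   # sanitization parameters (placeholders)

def sanitize(grad):
    """Clip each sample's upstream gradient to norm CLIP, then add Gaussian noise."""
    norms = grad.flatten(1).norm(dim=1, keepdim=True).clamp(min=1e-12)
    clipped = grad * (CLIP / norms).clamp(max=1.0)
    return clipped + torch.randn_like(clipped) * SIGMA * CLIP

fake = G(torch.randn(64, 100))
fake.register_hook(sanitize)   # intercepts dL/d(fake) during backward
g_loss = -D(fake).mean()       # WGAN-style generator objective
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```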
- DP-CGAN: Differentially Private Synthetic Data and Label Generation [18.485995499841]
We introduce a Differentially Private Conditional GAN (DP-CGAN) training framework based on a new clipping and perturbation strategy.
We show that DP-CGAN can generate visually and empirically promising results on the MNIST dataset with a single-digit epsilon parameter in differential privacy (a sketch of the clipping-and-perturbation step follows below).
arXiv Detail & Related papers (2020-01-27T11:26:58Z)
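DP-CGAN's clipping-and-perturbation strategy is commonly described as clipping the discriminator gradients of the real-data and fake-data loss components separately before a single perturbation, with class labels conditioning both networks. The PyTorch sketch below follows that reading; the clip norm, noise multiplier, and batch-level (rather than per-example) clipping are simplifications of ours, not the paper's exact procedure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

IMG, NCLS, BATCH = 784, 10, 64
D = nn.Sequential(nn.Linear(IMG + NCLS, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_d = torch.optim.SGD(D.parameters(), lr=0.05)
C, SIGMA = 1.0, 1.1   # clip norm and noise multiplier (placeholders)

def clipped_grads(loss):
    """Gradients of one loss component, clipped to total norm C.
    (DP-CGAN clips per example via microbatches; batch-level keeps this short.)"""
    grads = torch.autograd.grad(loss, list(D.parameters()))
    total = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = (C / (total + 1e-12)).clamp(max=1.0)
    return [g * scale for g in grads]

real, fake = torch.randn(BATCH, IMG), torch.randn(BATCH, IMG)
y = F.one_hot(torch.randint(0, NCLS, (BATCH,)), NCLS).float()  # conditioning labels

g_real = clipped_grads(F.binary_cross_entropy_with_logits(
    D(torch.cat([real, y], dim=1)), torch.ones(BATCH, 1)))
g_fake = clipped_grads(F.binary_cross_entropy_with_logits(
    D(torch.cat([fake, y], dim=1)), torch.zeros(BATCH, 1)))

opt_d.zero_grad()
for p, gr, gf in zip(D.parameters(), g_real, g_fake):
    # combine the separately clipped components, then perturb once
    p.grad = gr + gf + torch.randn_like(p) * SIGMA * C / BATCH
opt_d.step()
```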
This list is automatically generated from the titles and abstracts of the papers on this site.