Training generative models from privatized data
- URL: http://arxiv.org/abs/2306.09547v2
- Date: Fri, 1 Mar 2024 01:54:15 GMT
- Title: Training generative models from privatized data
- Authors: Daria Reshetova, Wei-Ning Chen, Ayfer Özgür
- Abstract summary: Local differential privacy is a powerful method for privacy-preserving data collection.
We develop a framework for training Generative Adversarial Networks (GANs) on differentially privatized data.
- Score: 9.584000954415476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Local differential privacy is a powerful method for privacy-preserving data
collection. In this paper, we develop a framework for training Generative
Adversarial Networks (GANs) on differentially privatized data. We show that
entropic regularization of optimal transport - a popular regularization method
in the literature that has often been leveraged for its computational benefits
- enables the generator to learn the raw (unprivatized) data distribution even
though it only has access to privatized samples. We prove that at the same time
this leads to fast statistical convergence at the parametric rate. This shows
that entropic regularization of optimal transport uniquely enables the
mitigation of both the effects of privatization noise and the curse of
dimensionality in statistical convergence. We provide experimental evidence to
support the efficacy of our framework in practice.
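A minimal sketch of the framework described in the abstract, under stated assumptions: the generator sees only Gaussian-privatized samples, its own outputs are passed through the same noise mechanism, and training minimizes an entropic-OT cost computed by unrolled log-domain Sinkhorn iterations. The noise scale `sigma`, the regularizer choice `eps = 2*sigma**2`, and the network sizes are illustrative assumptions, not the paper's exact calibration.
```python
# Illustrative sketch (not the authors' released code) of training a generator
# on privatized samples with an entropic-OT (Sinkhorn) objective.
import math
import torch

def entropic_ot(x, y, eps, iters=50):
    """Entropic-regularized OT cost between uniform empirical measures."""
    n, m = x.shape[0], y.shape[0]
    cost = torch.cdist(x, y) ** 2                  # squared-Euclidean cost
    log_a, log_b = -math.log(n), -math.log(m)      # uniform weights, log scale
    f, g = torch.zeros(n), torch.zeros(m)
    for _ in range(iters):                         # log-domain Sinkhorn updates
        f = -eps * torch.logsumexp((g - cost) / eps + log_b, dim=1)
        g = -eps * torch.logsumexp((f - cost.T) / eps + log_a, dim=1)
    return f.mean() + g.mean()                     # dual value ~ OT_eps at convergence

sigma = 0.5                                        # privatization noise scale
raw = torch.randn(512, 2) + torch.tensor([4.0, 0.0])  # toy raw data
private = raw + sigma * torch.randn_like(raw)      # the only data we may access

gen = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.ReLU(),
                          torch.nn.Linear(64, 2))
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
for step in range(1000):
    fake = gen(torch.randn(512, 8))
    fake = fake + sigma * torch.randn_like(fake)   # privatize generator samples too
    loss = entropic_ot(fake, private, eps=2 * sigma ** 2)
    opt.zero_grad(); loss.backward(); opt.step()
```
Gradients flow through the unrolled Sinkhorn iterations via autograd; the point of the construction is that the entropic regularizer lets the generator match the raw distribution even though the loss is computed only on privatized samples.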
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
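As a generic illustration of the pseudo-probability idea (a stand-in, not the PPU algorithm itself), one can fine-tune a model's predictions on the forget set toward uninformative pseudo-targets while keeping ordinary training on retained data. All shapes, weights, and the uniform target are toy assumptions.
```python
# Illustrative stand-in, NOT PPU: forget by matching uniform pseudo-probabilities.
import torch
import torch.nn.functional as F

model = torch.nn.Linear(20, 5)                       # toy 5-class classifier
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
forget_x = torch.randn(32, 20)                       # examples to be forgotten
retain_x, retain_y = torch.randn(256, 20), torch.randint(0, 5, (256,))
uniform = torch.full((32, 5), 1.0 / 5)               # pseudo-probability target

for _ in range(200):
    forget_loss = F.kl_div(F.log_softmax(model(forget_x), dim=1),
                           uniform, reduction="batchmean")
    retain_loss = F.cross_entropy(model(retain_x), retain_y)
    loss = retain_loss + forget_loss                 # forget without losing utility
    opt.zero_grad(); loss.backward(); opt.step()
```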
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Conditional Density Estimations from Privacy-Protected Data [0.0]
We propose simulation-based inference methods from privacy-protected datasets.
We illustrate our methods on discrete time-series data under an infectious disease model and with ordinary linear regression models.
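A hedged sketch of the simulation-based idea: candidate parameters are kept when model simulations, passed through the same privacy mechanism as the release, match the protected data on a summary statistic. The Poisson model, Laplace mechanism, and tolerance below are illustrative assumptions, not the paper's setup.
```python
# ABC-style inference from privatized data (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)

def privatize(x, scale=1.0):
    return x + rng.laplace(scale=scale, size=np.shape(x))  # Laplace mechanism

obs = privatize(rng.poisson(4.0, size=50))        # stand-in for the released data

accepted = []
for _ in range(20000):
    lam = rng.uniform(0.0, 10.0)                  # draw from the prior
    sim = privatize(rng.poisson(lam, size=50))    # simulate, then privatize alike
    if abs(sim.mean() - obs.mean()) < 0.2:        # summary-statistic match
        accepted.append(lam)

print(f"posterior mean ~ {np.mean(accepted):.2f} from {len(accepted)} draws")
```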
arXiv Detail & Related papers (2023-10-19T14:34:17Z)
- Enforcing Privacy in Distributed Learning with Performance Guarantees [57.14673504239551]
We study the privatization of distributed learning and optimization strategies.
We show that the popular additive random perturbation scheme degrades performance because it is not well-tuned to the graph structure.
arXiv Detail & Related papers (2023-01-16T13:03:27Z)
- Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models struggle with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z)
- DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
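A hedged sketch of the invariant post-randomization building block named in the title: each categorical value is resampled through a transition matrix P chosen so that the attribute's marginal distribution is preserved in expectation (pi @ P = pi). The mixing weight `alpha` is an illustrative assumption.
```python
# Invariant PRAM sketch: randomize values while preserving the marginal.
import numpy as np

rng = np.random.default_rng(4)
values = rng.choice(3, size=10000, p=[0.5, 0.3, 0.2])   # raw attribute
pi = np.bincount(values, minlength=3) / len(values)     # empirical marginal

alpha = 0.4
P = (1 - alpha) * np.eye(3) + alpha * np.outer(np.ones(3), pi)
assert np.allclose(pi @ P, pi)                          # invariance: pi @ P == pi

published = np.array([rng.choice(3, p=P[v]) for v in values])
print(np.bincount(published, minlength=3) / len(values))  # ~ pi
```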
arXiv Detail & Related papers (2022-08-24T17:52:43Z)
- Privacy for Free: How does Dataset Condensation Help Privacy? [21.418263507735684]
We identify dataset condensation (DC) as a better alternative to traditional data generators for private data generation.
We empirically validate the visual privacy and membership privacy of DC-synthesized data by launching both loss-based and state-of-the-art likelihood-based membership inference attacks.
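A hedged sketch of the loss-based membership-inference baseline the entry mentions: a point is flagged as a training member when the model's per-example loss falls below a threshold. The exponential loss distributions and threshold below are synthetic stand-ins, not measured values.
```python
# Loss-based membership-inference attack (illustrative sketch).
import numpy as np

def loss_based_mia(per_example_loss, threshold):
    # Training members tend to incur lower loss than non-members.
    return per_example_loss < threshold

rng = np.random.default_rng(2)
member_loss = rng.exponential(0.2, size=1000)     # stand-in: low loss on members
nonmember_loss = rng.exponential(1.0, size=1000)  # higher loss on unseen points

thr = 0.5
print(f"TPR={loss_based_mia(member_loss, thr).mean():.2f}, "
      f"FPR={loss_based_mia(nonmember_loss, thr).mean():.2f}")
```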
arXiv Detail & Related papers (2022-06-01T05:39:57Z)
- Distribution-Invariant Differential Privacy [4.700764053354502]
We develop a distribution-invariant privatization (DIP) method to reconcile high statistical accuracy and strict differential privacy.
Under the same strictness of privacy protection, DIP achieves superior statistical accuracy in two simulations and on three real-world benchmarks.
arXiv Detail & Related papers (2021-11-08T22:26:50Z)
- Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates allows adversaries to infer private data.
Existing schemes rely on perturbations chosen independently at every agent, resulting in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible in the network average, as sketched below.
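A hedged sketch of one simple nullspace condition: agent perturbations that sum to zero cancel in the network average, so the aggregate estimate is unperturbed while individual exchanges are masked. This zero-sum projection is an illustrative instance, not the paper's exact construction.
```python
# Zero-sum perturbations: invisible in the average, masking individual shares.
import numpy as np

rng = np.random.default_rng(1)
K, d = 8, 3                                  # agents, model dimension
noise = rng.normal(size=(K, d))
noise -= noise.mean(axis=0)                  # project onto the zero-sum nullspace
assert np.allclose(noise.sum(axis=0), 0.0)

models = rng.normal(size=(K, d))             # local estimates
masked = models + noise                      # what each agent shares
print(np.allclose(masked.mean(axis=0), models.mean(axis=0)))  # True: average preserved
```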
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
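A hedged sketch of the mechanism this entry points at: clip client updates to bound sensitivity, add Gaussian noise for differential privacy, then denoise the aggregate with Laplacian smoothing, i.e. solve (I - sigma_s * Delta) x = g by FFT, where Delta is the circulant second-difference operator over coordinates. The clipping norm, noise scale, and `sigma_s` are illustrative assumptions, not the paper's calibration.
```python
# DP aggregation with Laplacian smoothing (illustrative sketch).
import numpy as np

def laplacian_smooth(g, sigma_s=1.0):
    d = np.zeros_like(g)                       # first column of I - sigma_s*Delta
    d[0], d[1], d[-1] = 1.0 + 2.0 * sigma_s, -sigma_s, -sigma_s
    return np.real(np.fft.ifft(np.fft.fft(g) / np.fft.fft(d)))

rng = np.random.default_rng(5)
updates = rng.normal(size=(10, 128))           # one model update per client
clip = 1.0
norms = np.linalg.norm(updates, axis=1, keepdims=True)
clipped = updates / np.maximum(1.0, norms / clip)               # bound sensitivity
noisy = clipped.mean(axis=0) + rng.normal(scale=0.1, size=128)  # Gaussian mechanism
smoothed = laplacian_smooth(noisy)             # reduce the injected variance
```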
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.