Don't Generate Me: Training Differentially Private Generative Models
with Sinkhorn Divergence
- URL: http://arxiv.org/abs/2111.01177v1
- Date: Mon, 1 Nov 2021 18:10:21 GMT
- Title: Don't Generate Me: Training Differentially Private Generative Models
with Sinkhorn Divergence
- Authors: Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis
- Abstract summary: We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
- Score: 73.14373832423156
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although machine learning models trained on massive data have led to
break-throughs in several areas, their deployment in privacy-sensitive domains
remains limited due to restricted access to data. Generative models trained
with privacy constraints on private data can sidestep this challenge, providing
indirect access to private data instead. We propose DP-Sinkhorn, a novel
optimal transport-based generative method for learning data distributions from
private data with differential privacy. DP-Sinkhorn minimizes the Sinkhorn
divergence, a computationally efficient approximation to the exact optimal
transport distance, between the model and data in a differentially private
manner and uses a novel technique for control-ling the bias-variance trade-off
of gradient estimates. Unlike existing approaches for training differentially
private generative models, which are mostly based on generative adversarial
networks, we do not rely on adversarial objectives, which are notoriously
difficult to optimize, especially in the presence of noise imposed by privacy
constraints. Hence, DP-Sinkhorn is easy to train and deploy. Experimentally, we
improve upon the state-of-the-art on multiple image modeling benchmarks and
show differentially private synthesis of informative RGB images. Project
page:https://nv-tlabs.github.io/DP-Sinkhorn.
Related papers
- Beyond the Mean: Differentially Private Prototypes for Private Transfer Learning [16.028575596905554]
We propose Differentially Private Prototype Learning (DPPL) as a new paradigm for private transfer learning.
DPPL generates prototypes that represent each private class in the embedding space and can be publicly released for inference.
We show that privacy-utility trade-offs can be further improved when leveraging the public data beyond pre-training of the encoder.
arXiv Detail & Related papers (2024-06-12T09:41:12Z) - Private Fine-tuning of Large Language Models with Zeroth-order
Optimization [54.24600476755372]
We introduce DP-ZO, a new method for fine-tuning large language models that preserves the privacy of training data by privatizing zeroth-order optimization.
We show that DP-ZO exhibits just $1.86%$ performance degradation due to privacy at $ (1,10-5)$-DP when fine-tuning OPT-66B on 1000 training samples from SQuAD.
arXiv Detail & Related papers (2024-01-09T03:53:59Z) - Learning Differentially Private Probabilistic Models for
Privacy-Preserving Image Generation [67.47979276739144]
We propose learning differentially private probabilistic models to generate high-resolution images with differential privacy guarantee.
Our approach can generate images up to 256x256 with remarkable visual quality and data utility.
arXiv Detail & Related papers (2023-05-18T02:51:17Z) - Differentially Private Synthetic Data Generation via
Lipschitz-Regularised Variational Autoencoders [3.7463972693041274]
It is often overlooked that generative models are prone to memorising many details of individual training records.
In this paper we explore an alternative approach for privately generating data that makes direct use of the inherentity in generative models.
arXiv Detail & Related papers (2023-04-22T07:24:56Z) - Pre-trained Perceptual Features Improve Differentially Private Image
Generation [8.659595986100738]
Training even moderately-sized generative models with differentially-private descent gradient (DP-SGD) is difficult.
We advocate building off a good, relevant representation on an informative public dataset, then learning to model the private data with that representation.
Our work introduces simple yet powerful foundations for reducing the gap between private and non-private deep generative models.
arXiv Detail & Related papers (2022-05-25T16:46:01Z) - Large Scale Transfer Learning for Differentially Private Image
Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z) - Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z) - DPlis: Boosting Utility of Differentially Private Deep Learning via
Randomized Smoothing [0.0]
We propose DPlis--Differentially Private Learning wIth Smoothing.
We show that DPlis can effectively boost model quality and training stability under a given privacy budget.
arXiv Detail & Related papers (2021-03-02T06:33:14Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.