Adversary Instantiation: Lower Bounds for Differentially Private Machine
Learning
- URL: http://arxiv.org/abs/2101.04535v1
- Date: Mon, 11 Jan 2021 18:47:11 GMT
- Title: Adversary Instantiation: Lower Bounds for Differentially Private Machine
Learning
- Authors: Milad Nasr, Shuang Song, Abhradeep Thakurta, Nicolas Papernot and
Nicholas Carlini
- Abstract summary: Differentially private (DP) machine learning allows us to train models on private data while limiting data leakage.
In this paper, we evaluate the importance of the adversary capabilities allowed in the privacy analysis of DP training algorithms.
- Score: 43.6041475698327
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differentially private (DP) machine learning allows us to train models on
private data while limiting data leakage. DP formalizes this data leakage
through a cryptographic game, where an adversary must predict if a model was
trained on a dataset D, or a dataset D' that differs in just one example. If
observing the training algorithm does not meaningfully increase the adversary's
odds of successfully guessing which dataset the model was trained on, then the
algorithm is said to be differentially private. Hence, the purpose of privacy
analysis is to upper bound the probability that any adversary could
successfully guess which dataset the model was trained on. In our paper, we
instantiate this hypothetical adversary in order to establish lower bounds on
the probability that this distinguishing game can be won. We use this adversary
to evaluate the importance of the adversary capabilities allowed in the privacy
analysis of DP training algorithms. For DP-SGD, the most common method for
training neural networks with differential privacy, our lower bounds are tight
and match the theoretical upper bound. This implies that in order to prove
better upper bounds, it will be necessary to make use of additional
assumptions. Fortunately, we find that our attacks are significantly weaker
when additional (realistic) restrictions are placed on the adversary's
capabilities. Thus, in the practical setting common to many real-world
deployments, there is a gap between our lower bounds and the upper bounds
provided by the analysis: differential privacy is conservative and adversaries
may not be able to leak as much information as suggested by the theoretical
bound.
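The abstract frames privacy analysis as an upper bound on an adversary's success in a distinguishing game between datasets D and D'. The sketch below is a minimal illustration under stated assumptions, not the paper's attack: it shows the game loop and one standard form of the (epsilon, delta)-DP cap on the adversary's success probability with a uniform prior; `train` and `attack` are hypothetical placeholders.
```python
import math
import random


def max_win_probability(epsilon: float, delta: float) -> float:
    """One standard form of the (epsilon, delta)-DP upper bound on the
    adversary's probability of guessing correctly in the D-vs-D'
    distinguishing game, assuming a uniform prior over the two datasets.
    For delta = 0 it reduces to e^eps / (1 + e^eps)."""
    return (math.exp(epsilon) + delta) / (1.0 + math.exp(epsilon))


def distinguishing_game(train, attack, D, D_prime, trials=1000):
    """Monte-Carlo estimate of an attack's success rate in the game.
    `train` (the randomized training algorithm) and `attack` (which returns
    a guess in {0, 1}) are hypothetical placeholders, not the paper's code."""
    wins = 0
    for _ in range(trials):
        secret_bit = random.randint(0, 1)          # which dataset is used
        model = train(D if secret_bit == 0 else D_prime)
        wins += int(attack(model, D, D_prime) == secret_bit)
    return wins / trials


# Example: with (epsilon, delta) = (2, 1e-5), no adversary should exceed
# roughly 88% accuracy in the distinguishing game.
print(max_win_probability(2.0, 1e-5))
```
Lower bounds of the kind the paper constructs come from instantiating the attack concretely and measuring how close its empirical success rate gets to this theoretical ceiling.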
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Closed-Form Bounds for DP-SGD against Record-level Inference [18.85865832127335]
We focus on the popular DP-SGD algorithm, and derive simple closed-form bounds.
We obtain bounds for membership inference that match state-of-the-art techniques.
We present a novel data-dependent bound against attribute inference.
arXiv Detail & Related papers (2024-02-22T09:26:16Z)
- Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD [44.11069254181353]
We show that DP-SGD leaks significantly less privacy for many datapoints when models are trained on common benchmarks.
This implies privacy attacks will necessarily fail against many datapoints if the adversary does not have sufficient control over the possible training datasets.
arXiv Detail & Related papers (2023-07-01T11:51:56Z)
- Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
arXiv Detail & Related papers (2022-10-24T23:50:12Z)
- Fine-Tuning with Differential Privacy Necessitates an Additional Hyperparameter Search [38.83524780461911]
We show how carefully selecting the layers being fine-tuned in the pretrained neural network allows us to establish new state-of-the-art tradeoffs between privacy and accuracy.
We achieve 77.9% accuracy for $(\varepsilon, \delta) = (2, 10^{-5})$ on CIFAR-100 for a model pretrained on ImageNet.
arXiv Detail & Related papers (2022-10-05T11:32:49Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z)
- Quantifying identifiability to choose and audit $\epsilon$ in differentially private deep learning [15.294433619347082]
To use differential privacy in machine learning, data scientists must choose privacy parameters $(\epsilon,\delta)$.
We transform $(\epsilon,\delta)$ into a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset.
We formulate an implementation of this differential privacy adversary that allows data scientists to audit model training and compute empirical identifiability scores and empirical $(\epsilon,\delta)$ (a brief derivation sketch of this posterior bound follows this entry).
arXiv Detail & Related papers (2021-03-04T09:35:58Z)
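The identifiability entry above transforms $(\epsilon,\delta)$ into a bound on the adversary's posterior belief. The LaTeX fragment below is a minimal derivation sketch for the $\delta = 0$ case under a uniform-prior assumption; it is not the paper's exact formula.
```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Minimal sketch: how an (epsilon, 0)-DP guarantee caps the posterior belief
% that a record r is in the training data, given the mechanism's output o.
% D and D' are neighboring datasets differing only in r; p is the prior.
\begin{align*}
\Pr[r \in D_{\mathrm{train}} \mid o]
  &= \frac{p\,\Pr[M(D) = o]}{p\,\Pr[M(D) = o] + (1-p)\,\Pr[M(D') = o]} \\
  &\le \frac{p\,e^{\epsilon}}{p\,e^{\epsilon} + (1-p)}
     \quad \text{since } \Pr[M(D) = o] \le e^{\epsilon} \Pr[M(D') = o].
\end{align*}
For a uniform prior $p = \tfrac{1}{2}$ the posterior is at most
$e^{\epsilon} / (1 + e^{\epsilon})$; a nonzero $\delta$ adds a further
additive correction, which the paper's bound accounts for.
\end{document}
```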
- User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving private data from mobile terminals (MTs) while training the data into useful models.
From a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm that adds artificial noise to the shared models before uploading them to servers (a minimal sketch of this noise-addition step follows below).
arXiv Detail & Related papers (2020-02-29T10:13:39Z)
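The UDP entry above adds artificial noise to the shared models before they are uploaded. Below is a minimal, hypothetical sketch of that step, not the paper's algorithm: it clips a client's update to bound its L2 sensitivity and adds Gaussian noise calibrated to that bound; the function name and parameter values are illustrative.
```python
import numpy as np


def privatize_update(local_update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip a client's model update to an L2 norm of `clip_norm`, then add
    Gaussian noise with standard deviation noise_multiplier * clip_norm to
    every tensor before the update is uploaded to the server.
    All parameter values here are illustrative, not the UDP paper's settings."""
    rng = rng or np.random.default_rng()
    flat = np.concatenate([w.ravel() for w in local_update])
    scale = min(1.0, clip_norm / (np.linalg.norm(flat) + 1e-12))
    noisy = []
    for w in local_update:
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
        noisy.append(w * scale + noise)
    return noisy


# Toy example: a two-tensor update from one mobile terminal (MT).
update = [np.ones((3, 3)), np.ones(5)]
print([w.shape for w in privatize_update(update)])
```
Clipping bounds each user's contribution to the shared model, which is what lets the Gaussian noise level be tied to a user-level sensitivity.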