Bayesian Estimation of Differential Privacy
- URL: http://arxiv.org/abs/2206.05199v1
- Date: Fri, 10 Jun 2022 15:57:18 GMT
- Title: Bayesian Estimation of Differential Privacy
- Authors: Santiago Zanella-Béguelin (Microsoft Research), Lukas Wutschitz (Microsoft), Shruti Tople (Microsoft Research), Ahmed Salem (Microsoft Research), Victor Rühle (Microsoft), Andrew Paverd (Microsoft Research), Mohammad Naseri (University College London), and Boris Köpf (Microsoft Research)
- Abstract summary: Differentially Private SGD enables training machine learning models with formal privacy guarantees.
There is a discrepancy between the protection that such algorithms guarantee in theory and the protection they afford in practice.
This paper empirically estimates the protection afforded by differentially private training as a confidence interval for the privacy budget.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithms such as Differentially Private SGD enable training machine
learning models with formal privacy guarantees. However, there is a discrepancy
between the protection that such algorithms guarantee in theory and the
protection they afford in practice. An emerging strand of work empirically
estimates the protection afforded by differentially private training as a
confidence interval for the privacy budget $\varepsilon$ spent on training a
model. Existing approaches derive confidence intervals for $\varepsilon$ from
confidence intervals for the false positive and false negative rates of
membership inference attacks. Unfortunately, obtaining narrow high-confidence
intervals for $\varepsilon$ using this method requires an impractically large
sample size and training as many models as samples. We propose a novel Bayesian
method that greatly reduces sample size, and adapt and validate a heuristic to
draw more than one sample per trained model. Our Bayesian method exploits the
hypothesis testing interpretation of differential privacy to obtain a posterior
for $\varepsilon$ (not just a confidence interval) from the joint posterior of
the false positive and false negative rates of membership inference attacks.
For the same sample size and confidence, we derive confidence intervals for
$\varepsilon$ around 40% narrower than prior work. The heuristic, which we
adapt from label-only DP, can be used to further reduce the number of trained
models needed to get enough samples by up to 2 orders of magnitude.
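For illustration, here is a minimal Monte Carlo sketch of the underlying idea (a simplification, not the authors' implementation): place independent Beta posteriors on the attack's false positive and false negative rates, sample from their joint posterior, and map each draw to the smallest $\varepsilon$ consistent with the hypothesis-testing characterization of $(\varepsilon, \delta)$-DP. The attack counts, $\delta$, and uniform priors below are illustrative assumptions.

```python
import numpy as np

def epsilon_posterior(fp, tn, fn, tp, delta=1e-5, n_draws=100_000, seed=0):
    """Monte Carlo posterior over epsilon from membership-inference outcomes.

    fp/tn: errors/successes on non-members (false positive rate = fp / (fp + tn))
    fn/tp: errors/successes on members     (false negative rate = fn / (fn + tp))
    Uses independent Beta(1 + errors, 1 + successes) posteriors (uniform priors)
    and the (epsilon, delta)-DP constraints FPR + e^eps * FNR >= 1 - delta and
    FNR + e^eps * FPR >= 1 - delta, which give a lower bound on epsilon for
    each (FPR, FNR) draw.
    """
    rng = np.random.default_rng(seed)
    fpr = rng.beta(1 + fp, 1 + tn, size=n_draws)
    fnr = rng.beta(1 + fn, 1 + tp, size=n_draws)
    with np.errstate(divide="ignore"):
        eps1 = np.log(np.maximum(1 - delta - fpr, 0.0) / fnr)
        eps2 = np.log(np.maximum(1 - delta - fnr, 0.0) / fpr)
    return np.maximum(np.maximum(eps1, eps2), 0.0)

# Example with made-up attack outcomes: a 95% credible interval for epsilon.
eps = epsilon_posterior(fp=120, tn=880, fn=300, tp=700)
lo, hi = np.quantile(eps, [0.025, 0.975])
print(f"95% credible interval for epsilon: [{lo:.3f}, {hi:.3f}]")
```

Because the output is a full posterior over $\varepsilon$ rather than only a confidence interval, credible intervals at any level can be read off directly from the same set of draws.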
Related papers
- Resampling methods for private statistical inference [1.8110941972682346]
We consider the task of constructing confidence intervals with differential privacy.
We propose two private variants of the non-parametric bootstrap, which privately compute the median of the results of multiple "little" bootstraps run on partitions of the data.
For a fixed differential privacy parameter $\epsilon$, our methods enjoy the same error rates as the non-private bootstrap, up to logarithmic factors in the sample size $n$.
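As a rough sketch of the "median of little bootstraps" idea (illustrative only, not the authors' mechanism; the statistic, output range, grid, and partition count are assumptions), one can bootstrap an estimate on each data partition and release a differentially private median of the per-partition estimates via the standard exponential mechanism:

```python
import numpy as np

def dp_median(values, epsilon, lo, hi, grid_size=512, rng=None):
    """epsilon-DP median of `values` (assumed to lie in [lo, hi]) using the
    exponential mechanism over a grid of candidate outputs.
    Utility u(t) = -|#{v <= t} - n/2| has sensitivity 1 under replacement of one value."""
    rng = rng or np.random.default_rng(0)
    candidates = np.linspace(lo, hi, grid_size)
    ranks = np.searchsorted(np.sort(values), candidates, side="right")
    utility = -np.abs(ranks - len(values) / 2)
    weights = np.exp(epsilon * utility / 2)  # exponent eps * u / (2 * sensitivity)
    return rng.choice(candidates, p=weights / weights.sum())

def private_bootstrap_mean(data, epsilon, n_partitions=20, n_boot=100, lo=0.0, hi=1.0):
    """Bag-of-little-bootstraps style estimate: bootstrap the mean on each
    partition, then release a DP median of the per-partition estimates."""
    rng = np.random.default_rng(1)
    parts = np.array_split(rng.permutation(data), n_partitions)
    estimates = [np.mean([np.mean(rng.choice(p, size=len(p))) for _ in range(n_boot)])
                 for p in parts]
    return dp_median(np.array(estimates), epsilon, lo, hi, rng=rng)

data = np.clip(np.random.default_rng(2).normal(0.5, 0.1, size=5000), 0.0, 1.0)
print(private_bootstrap_mean(data, epsilon=1.0))
```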
arXiv Detail & Related papers (2024-02-11T08:59:02Z)
- Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning [59.44422468242455]
We propose a novel method, dubbed ShrinkMatch, to learn from uncertain samples.
For each uncertain sample, it adaptively seeks a shrunk class space, which merely contains the original top-1 class.
We then impose a consistency regularization between a pair of strongly and weakly augmented samples in the shrunk space to strive for discriminative representations.
arXiv Detail & Related papers (2023-08-13T14:05:24Z)
- Epsilon*: Privacy Metric for Machine Learning Models [7.461284823977013]
Epsilon* is a new metric for measuring the privacy risk of a single model instance prior to, during, or after deployment of privacy mitigation strategies.
It requires only black-box access to model predictions, does not require training data re-sampling or model re-training, and can be used to measure the privacy risk of models not trained with differential privacy.
arXiv Detail & Related papers (2023-07-21T00:49:07Z)
- Differentially Private Statistical Inference through $\beta$-Divergence One Posterior Sampling [2.8544822698499255]
We propose a posterior sampling scheme from a generalised posterior targeting the minimisation of the $\beta$-divergence between the model and the data generating process.
This provides private estimation that is generally applicable without requiring changes to the underlying model.
We show that $\beta$D-Bayes produces more precise inference for the same privacy guarantees.
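Schematically, the "generalised posterior" follows the usual Gibbs-posterior form; the specific loss $\ell_\beta$ derived from the $\beta$-divergence and its constants are given in the paper, so the display below only indicates the generic shape of the object being sampled from:

$$\pi_\beta(\theta \mid x_{1:n}) \;\propto\; \pi(\theta)\, \exp\Bigl(-\sum_{i=1}^{n} \ell_\beta(x_i, \theta)\Bigr),$$

and the mechanism releases a single draw $\theta \sim \pi_\beta(\cdot \mid x_{1:n})$ (one posterior sampling), with the privacy guarantee coming from sensitivity properties of $\ell_\beta$.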
arXiv Detail & Related papers (2023-07-11T12:00:15Z)
- Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
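For intuition, a back-of-the-envelope calculation (a simplification, not the paper's actual analysis) shows why $\epsilon \approx \log M$ is the natural scale, under the assumption that the $M$ candidate values yield pairwise adjacent datasets. Let $\hat{Z}$ be the adversary's reconstruction and $p_i = \Pr[\hat{Z}=i \mid Z=i]$; then for any fixed $j$, $\epsilon$-DP gives

$$1 \;\ge\; \sum_{i=1}^{M} \Pr[\hat{Z}=i \mid Z=j] \;\ge\; e^{-\epsilon} \sum_{i=1}^{M} p_i, \qquad\text{so}\qquad \frac{1}{M}\sum_{i=1}^{M} p_i \;\le\; \frac{e^{\epsilon}}{M},$$

i.e., the average reconstruction success stays close to the trivial $1/M$ baseline until $\epsilon$ grows to roughly $\log M$.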
arXiv Detail & Related papers (2022-10-24T23:50:12Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
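To illustrate what per-example accounting can look like (a simplified sketch, not the paper's accountant: it treats each step as a plain Gaussian mechanism and ignores subsampling amplification, so it overestimates the true cost), one can track an individualized Rényi-DP cost that scales with the example's actual clipped gradient norms rather than the worst-case clipping bound:

```python
import numpy as np

def individual_epsilon(per_step_grad_norms, clip_norm, noise_multiplier,
                       delta=1e-5, alphas=np.arange(2, 64)):
    """Individualized (epsilon, delta) estimate for one example under DP-SGD,
    treating each step as a Gaussian mechanism with per-example sensitivity
    min(||g_t||, C) and noise std sigma = noise_multiplier * C.
    Simplification: plain Renyi-DP composition over steps, no Poisson
    subsampling amplification."""
    sigma = noise_multiplier * clip_norm
    sens = np.minimum(np.asarray(per_step_grad_norms), clip_norm)
    # Gaussian mechanism RDP at order alpha: alpha * sensitivity^2 / (2 sigma^2),
    # additive over steps (composition).
    rdp = np.array([alpha * np.sum(sens ** 2) / (2 * sigma ** 2) for alpha in alphas])
    # Standard RDP -> (epsilon, delta) conversion, minimized over alpha.
    eps = rdp + np.log(1 / delta) / (alphas - 1)
    return float(eps.min())

# An example whose gradients stay well below the clip norm accrues a smaller
# epsilon than one that is clipped at every step.
print(individual_epsilon(np.full(1000, 0.2), clip_norm=1.0, noise_multiplier=1.0))
print(individual_epsilon(np.full(1000, 1.0), clip_norm=1.0, noise_multiplier=1.0))
```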
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Optimal Membership Inference Bounds for Adaptive Composition of Sampled Gaussian Mechanisms [93.44378960676897]
Given a trained model and a data sample, membership-inference (MI) attacks predict whether the sample was in the model's training set.
A common countermeasure against MI attacks is to utilize differential privacy (DP) during model training to mask the presence of individual examples.
In this paper, we derive bounds for the advantage of an adversary mounting an MI attack, and demonstrate tightness for the widely-used Gaussian mechanism.
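For intuition only, a classical and much simpler bound (in the style of Yeom et al., not the tighter bounds derived in this paper) already links DP to MI advantage. Writing the advantage as $\mathrm{Adv} = \mathrm{TPR} - \mathrm{FPR}$, pure $\epsilon$-DP forces $\mathrm{TPR} \le e^{\epsilon}\,\mathrm{FPR}$, hence

$$\mathrm{Adv} \;=\; \mathrm{TPR} - \mathrm{FPR} \;\le\; (e^{\epsilon} - 1)\,\mathrm{FPR} \;\le\; e^{\epsilon} - 1.$$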
arXiv Detail & Related papers (2022-04-12T22:36:56Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- On the Intrinsic Differential Privacy of Bagging [69.70602220716718]
Our experimental results demonstrate that Bagging achieves significantly higher accuracies than state-of-the-art differentially private machine learning methods with the same privacy budgets.
arXiv Detail & Related papers (2020-08-22T14:17:55Z)
- Parametric Bootstrap for Differentially Private Confidence Intervals [8.781431682774484]
We develop a practical and general-purpose approach to construct confidence intervals for differentially private parametric estimation.
We find that the parametric bootstrap is a simple and effective solution.
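A toy sketch of the parametric-bootstrap idea for DP confidence intervals (illustrative assumptions: a Gaussian population on a bounded range, a Laplace-noised clipped mean as the private release, and a known spread parameter, whereas the general recipe would estimate nuisance parameters privately as well): fit the model to the private estimate, then repeatedly simulate both the data and the privacy noise to recover the sampling distribution of the private estimator.

```python
import numpy as np

def dp_mean(x, lo, hi, epsilon, rng):
    """Differentially private mean of values clipped to [lo, hi] via the
    Laplace mechanism (sensitivity (hi - lo) / n)."""
    x = np.clip(x, lo, hi)
    return x.mean() + rng.laplace(0.0, (hi - lo) / (len(x) * epsilon))

def parametric_bootstrap_ci(private_mean, n, lo, hi, epsilon, sigma_guess,
                            n_boot=2000, level=0.95, seed=0):
    """Parametric bootstrap CI: simulate datasets from the fitted model
    (here N(private_mean, sigma_guess^2)), re-apply the same DP mechanism to
    each, and read off quantiles of the simulated private estimates."""
    rng = np.random.default_rng(seed)
    sims = np.array([
        dp_mean(rng.normal(private_mean, sigma_guess, size=n), lo, hi, epsilon, rng)
        for _ in range(n_boot)
    ])
    alpha = 1 - level
    # Basic (pivot) bootstrap interval centered at the private estimate.
    lo_q, hi_q = np.quantile(sims, [alpha / 2, 1 - alpha / 2])
    return 2 * private_mean - hi_q, 2 * private_mean - lo_q

rng = np.random.default_rng(1)
data = rng.normal(0.5, 0.1, size=2000)
est = dp_mean(data, 0.0, 1.0, epsilon=1.0, rng=rng)
print(parametric_bootstrap_ci(est, n=len(data), lo=0.0, hi=1.0, epsilon=1.0, sigma_guess=0.1))
```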
arXiv Detail & Related papers (2020-06-14T00:08:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.