Quantifying identifiability to choose and audit $\epsilon$ in
differentially private deep learning
- URL: http://arxiv.org/abs/2103.02913v2
- Date: Fri, 5 Mar 2021 21:22:28 GMT
- Title: Quantifying identifiability to choose and audit $\epsilon$ in
differentially private deep learning
- Authors: Daniel Bernau, Günther Eibl, Philip W. Grassal, Hannah Keller,
Florian Kerschbaum
- Abstract summary: To use differential privacy in machine learning, data scientists must choose privacy parameters $(\epsilon,\delta)$.
We transform $(\epsilon,\delta)$ to a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset.
We formulate an implementation of this differential privacy adversary that allows data scientists to audit model training and compute empirical identifiability scores and empirical $(\epsilon,\delta)$.
- Score: 15.294433619347082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differential privacy allows bounding the influence that training data records
have on a machine learning model. To use differential privacy in machine
learning, data scientists must choose privacy parameters $(\epsilon,\delta)$.
Choosing meaningful privacy parameters is key since models trained with weak
privacy parameters might result in excessive privacy leakage, while strong
privacy parameters might overly degrade model utility. However, privacy
parameter values are difficult to choose for two main reasons. First, the upper
bound on privacy loss $(\epsilon,\delta)$ might be loose, depending on the
chosen sensitivity and data distribution of practical datasets. Second, legal
requirements and societal norms for anonymization often refer to individual
identifiability, to which $(\epsilon,\delta)$ are only indirectly related.
We transform $(\epsilon,\delta)$ to a bound on the Bayesian posterior belief
of the adversary assumed by differential privacy concerning the presence of any
record in the training dataset. The bound holds for multidimensional queries
under composition, and we show that it can be tight in practice. Furthermore,
we derive an identifiability bound, which relates the adversary assumed in
differential privacy to previous work on membership inference adversaries. We
formulate an implementation of this differential privacy adversary that allows
data scientists to audit model training and compute empirical identifiability
scores and empirical $(\epsilon,\delta)$.
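As a rough illustration of the transformation above, the sketch below maps a given $(\epsilon,\delta)$ to an upper bound on the adversary's posterior belief. It assumes a uniform prior and the simple single-query odds relation (which gives the familiar $e^\epsilon/(1+e^\epsilon)$ for pure $\epsilon$-DP), folding $\delta$ in as a crude additive slack; the paper's actual bound for multidimensional queries under composition is derived differently and can be tighter.

```python
import math

def posterior_belief_bound(eps: float, delta: float = 0.0, prior: float = 0.5) -> float:
    """Illustrative upper bound on the adversary's posterior belief that a
    given record is in the training set.

    Assumptions (not the paper's exact derivation): a single query, a prior
    belief `prior`, the pure-DP odds relation
    posterior_odds <= prior_odds * e^eps, and delta folded in additively.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * math.exp(eps)       # likelihood ratio bounded by e^eps
    bound = posterior_odds / (1.0 + posterior_odds)   # back from odds to a probability
    return min(bound + delta, 1.0)                    # additive delta slack (assumption)

if __name__ == "__main__":
    # Translate a few common privacy budgets into posterior-belief bounds
    # (delta = 1e-5, uniform prior of 0.5).
    for eps in (0.1, 1.0, 3.0, 8.0):
        print(f"eps = {eps}: posterior belief <= {posterior_belief_bound(eps, 1e-5):.3f}")
```

Such a translation is what makes the bound usable for choosing $\epsilon$: a budget of $\epsilon=1$ keeps the illustrative posterior belief near 0.73, whereas $\epsilon=8$ pushes it essentially to 1.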
Related papers
- Calibrating Practical Privacy Risks for Differentially Private Machine Learning [5.363664265121231]
We study approaches that can lower the attack success rate (ASR) to allow for more flexible privacy budget settings in model training.
We find that by selectively suppressing privacy-sensitive features, we can achieve lower ASR values without compromising application-specific data utility.
arXiv Detail & Related papers (2024-10-30T03:52:01Z) - Epsilon*: Privacy Metric for Machine Learning Models [7.461284823977013]
Epsilon* is a new metric for measuring the privacy risk of a single model instance prior to, during, or after deployment of privacy mitigation strategies.
It requires only black-box access to model predictions, does not require training data re-sampling or model re-training, and can be used to measure the privacy risk of models not trained with differential privacy.
arXiv Detail & Related papers (2023-07-21T00:49:07Z) - Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms are tight only under implausible worst-case assumptions (e.g., adversarially crafted datasets).
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets; a minimal sketch of this style of empirical-$\epsilon$ audit appears after this list.
arXiv Detail & Related papers (2023-02-15T21:40:33Z) - Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis
Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
arXiv Detail & Related papers (2022-10-24T23:50:12Z) - Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z) - Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, it faces limitations when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z) - Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
Since an example's privacy parameter is well correlated with its training loss, groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z) - Personalized PATE: Differential Privacy for Machine Learning with
Individual Privacy Guarantees [1.2691047660244335]
We propose three novel methods to support training an ML model with different personalized privacy guarantees within the training data.
Our experiments show that our personalized privacy methods yield higher accuracy models than the non-personalized baseline.
arXiv Detail & Related papers (2022-02-21T20:16:27Z) - Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural networks training, such as gradient clipping and noise addition, affect the robustness of the model.
arXiv Detail & Related papers (2020-12-14T18:59:24Z)
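The empirical $(\epsilon,\delta)$ audit described in the abstract above, and the tight auditing work listed here, typically rely on running the differential privacy adversary many times and converting its error rates into a privacy estimate. One common recipe in the auditing literature uses the hypothesis-testing consequence of $(\epsilon,\delta)$-DP, $\mathrm{FPR} + e^\epsilon\,\mathrm{FNR} \ge 1-\delta$; the sketch below inverts it to obtain an empirical lower bound on $\epsilon$. This is a generic sketch under those assumptions, not the exact procedure of either paper, and it omits the confidence intervals a rigorous audit requires.

```python
import math

def empirical_epsilon(fpr: float, fnr: float, delta: float = 0.0) -> float:
    """Empirical lower bound on epsilon from an audited membership-inference
    adversary's error rates.

    Under (eps, delta)-DP every adversary must satisfy
        FPR + e^eps * FNR >= 1 - delta   and   FNR + e^eps * FPR >= 1 - delta,
    so eps >= ln((1 - delta - FPR) / FNR), and symmetrically; we take the max.
    """
    candidates = [0.0]
    for a, b in ((fpr, fnr), (fnr, fpr)):
        if b > 0.0 and (1.0 - delta - a) > b:
            candidates.append(math.log((1.0 - delta - a) / b))
    return max(candidates)

if __name__ == "__main__":
    # Hypothetical audit outcome: FPR = 0.05, FNR = 0.40 over many attack trials.
    # A rigorous audit would replace these point estimates with, e.g.,
    # Clopper-Pearson confidence bounds before applying the formula.
    print(f"empirical eps >= {empirical_epsilon(0.05, 0.40, delta=1e-5):.2f}")
```

Comparing such an empirical lower bound against the analytical $(\epsilon,\delta)$ guarantee indicates how loose the provable bound is for a particular dataset and training pipeline.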
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.