Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis
Testing: A Lesson From Fano
- URL: http://arxiv.org/abs/2210.13662v2
- Date: Thu, 10 Aug 2023 03:02:21 GMT
- Title: Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis
Testing: A Lesson From Fano
- Authors: Chuan Guo, Alexandre Sablayrolles, Maziar Sanjabi
- Abstract summary: We study data reconstruction attacks for discrete data and analyze them under the framework of multiple hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
- Score: 83.5933307263932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differential privacy (DP) is by far the most widely accepted framework for
mitigating privacy risks in machine learning. However, exactly how small the
privacy parameter $\epsilon$ needs to be to protect against certain privacy
risks in practice is still not well-understood. In this work, we study data
reconstruction attacks for discrete data and analyze them under the framework of
multiple hypothesis testing. We utilize different variants of the celebrated
Fano's inequality to derive upper bounds on the inferential power of a data
reconstruction adversary when the model is trained differentially privately.
Importantly, we show that if the underlying private data takes values from a
set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log
M)$ before the adversary gains significant inferential power. Our analysis
offers theoretical evidence for the empirical effectiveness of DP against data
reconstruction attacks even at relatively large values of $\epsilon$.
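
As a rough illustration of why $\epsilon$ can grow like $\log M$, consider a secret drawn uniformly from $M$ candidate values and a release that satisfies pure $\epsilon$-DP with respect to that secret. The following is a simplified sketch based on a standard posterior argument, not the paper's Fano-based analysis; the function name and the $e^{\epsilon}/M$ form are assumptions of this illustration.

```python
import numpy as np

def reconstruction_success_upper_bound(epsilon: float, M: int) -> float:
    """Upper bound on the probability that a Bayes-optimal adversary exactly
    reconstructs a secret that is uniform over M values, when the release is
    pure epsilon-DP w.r.t. that secret: the posterior of any single value is
    at most e^eps / M (capped at 1)."""
    return min(1.0, np.exp(epsilon) / M)

# Illustrative only: the bound stays small until epsilon is on the order of log(M).
M = 10_000
for eps in [1.0, 0.5 * np.log(M), np.log(M), 2 * np.log(M)]:
    print(f"eps = {eps:6.2f}   success <= {reconstruction_success_upper_bound(eps, M):.4f}")
```

The bound only approaches 1 once $\epsilon$ is on the order of $\log M$, which matches the scaling highlighted in the abstract.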
Related papers
- Why Does Differential Privacy with Large Epsilon Defend Against
Practical Membership Inference Attacks? [19.21246519924815]
For a small privacy parameter $\epsilon$, $\epsilon$-differential privacy (DP) provides a strong worst-case guarantee.
Existing DP theory, however, cannot explain the empirical finding that large values of $\epsilon$ still defend against practical membership inference attacks.
We propose a new privacy notion called practical membership privacy (PMP).
arXiv Detail & Related papers (2024-02-14T19:31:45Z)
- Epsilon*: Privacy Metric for Machine Learning Models [7.461284823977013]
Epsilon* is a new metric for measuring the privacy risk of a single model instance prior to, during, or after deployment of privacy mitigation strategies.
It requires only black-box access to model predictions, does not require training data re-sampling or model re-training, and can be used to measure the privacy risk of models not trained with differential privacy.
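
The summary does not give the Epsilon* formula, so the snippet below is only a generic black-box estimate in the same spirit: it converts a membership-inference attack's operating point (TPR, FPR) into an empirical epsilon via the DP hypothesis-testing inequality $\mathrm{TPR} \le e^{\epsilon}\,\mathrm{FPR} + \delta$. The function name and example numbers are illustrative assumptions, not the paper's definition.

```python
import math

def empirical_epsilon_lower_bound(tpr: float, fpr: float, delta: float = 0.0) -> float:
    """Generic black-box privacy estimate (NOT the Epsilon* definition): any
    (eps, delta)-DP mechanism must satisfy TPR <= e^eps * FPR + delta for every
    attack, so an observed attack certifies eps >= log((TPR - delta) / FPR)."""
    if tpr - delta <= 0:
        return 0.0
    if fpr <= 0:
        return math.inf
    return max(0.0, math.log((tpr - delta) / fpr))

# Example: an attack with 8% TPR at 1% FPR certifies eps >= log(8) ~= 2.08.
print(empirical_epsilon_lower_bound(tpr=0.08, fpr=0.01))
```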
arXiv Detail & Related papers (2023-07-21T00:49:07Z)
- Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, it has limitations when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies that groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
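
A minimal sketch of the intuition, not the paper's actual accountant: in DP-SGD the Gaussian noise has standard deviation $\sigma C$ for clipping bound $C$, so an example whose gradient norm stays below $C$ effectively faces a larger noise multiplier and hence a stronger per-step guarantee, while atypical, high-loss examples get the worst-case multiplier. The function below is an assumed illustration of that ratio.

```python
import numpy as np

def effective_noise_multipliers(grad_norms, clip_norm, sigma):
    """Illustrative sketch (not the paper's accountant): the noise std in DP-SGD
    is sigma * clip_norm, while example i contributes at most
    min(||g_i||, clip_norm) of signal, so its "effective" noise multiplier is
    sigma * clip_norm / min(||g_i||, clip_norm); larger means a stronger guarantee."""
    signal = np.minimum(np.asarray(grad_norms, dtype=float), clip_norm)
    return sigma * clip_norm / np.maximum(signal, 1e-12)

# Well-fit examples with small gradients enjoy a much stronger per-step guarantee.
print(effective_noise_multipliers([0.1, 0.5, 2.0], clip_norm=1.0, sigma=1.0))
# -> [10.  2.  1.]
```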
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Optimal Membership Inference Bounds for Adaptive Composition of Sampled Gaussian Mechanisms [93.44378960676897]
Given a trained model and a data sample, membership-inference (MI) attacks predict whether the sample was in the model's training set.
A common countermeasure against MI attacks is to utilize differential privacy (DP) during model training to mask the presence of individual examples.
In this paper, we derive bounds for the advantage of an adversary mounting an MI attack, and demonstrate tightness for the widely-used Gaussian mechanism.
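
For context, $(\epsilon,\delta)$-DP already implies a generic bound on this advantage through its hypothesis-testing characterization; the paper's contribution is tighter, mechanism-specific bounds. The sketch below computes only the generic bound, which is an assumption of this illustration rather than the paper's result.

```python
import math

def mi_advantage_bound(epsilon: float, delta: float = 0.0) -> float:
    """Generic membership-inference advantage bound implied by (eps, delta)-DP:
        TPR - FPR <= (e^eps - 1 + 2*delta) / (e^eps + 1).
    This is the standard worst-case bound, not the tighter bound derived in the
    paper for adaptive compositions of sampled Gaussian mechanisms."""
    return min(1.0, (math.exp(epsilon) - 1 + 2 * delta) / (math.exp(epsilon) + 1))

for eps in [0.5, 1.0, 4.0, 8.0]:
    print(f"eps = {eps}: advantage <= {mi_advantage_bound(eps, delta=1e-5):.4f}")
```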
arXiv Detail & Related papers (2022-04-12T22:36:56Z)
- Quantifying identifiability to choose and audit $\epsilon$ in differentially private deep learning [15.294433619347082]
To use differential privacy in machine learning, data scientists must choose privacy parameters $(\epsilon,\delta)$.
We transform $(\epsilon,\delta)$ to a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset.
We formulate an implementation of this differential privacy adversary that allows data scientists to audit model training and compute empirical identifiability scores and empirical $(\epsilon,\delta)$.
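
A minimal sketch of the pure-DP version of this transformation (the paper works with the full $(\epsilon,\delta)$ setting): under $\epsilon$-DP, the likelihood ratio between "record present" and "record absent" is at most $e^{\epsilon}$, so the adversary's posterior belief is capped at $e^{\epsilon}\pi / (e^{\epsilon}\pi + 1 - \pi)$ for prior belief $\pi$. The function name and example numbers are illustrative assumptions.

```python
import math

def posterior_belief_bound(epsilon: float, prior: float = 0.5) -> float:
    """Illustrative pure-DP bound (the paper handles (eps, delta)): under eps-DP
    the posterior belief that a given record is in the training set is at most
    e^eps * prior / (e^eps * prior + 1 - prior)."""
    odds = math.exp(epsilon) * prior / (1 - prior)
    return odds / (1 + odds)

# With an uninformed prior of 0.5, eps = 1 caps the adversary's posterior at ~0.73.
print(posterior_belief_bound(1.0, prior=0.5))
```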
arXiv Detail & Related papers (2021-03-04T09:35:58Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Adversary Instantiation: Lower Bounds for Differentially Private Machine Learning [43.6041475698327]
Differentially private (DP) machine learning allows us to train models on private data while limiting data leakage.
In this paper, we evaluate the importance of the adversary capabilities allowed in the privacy analysis of DP training algorithms.
arXiv Detail & Related papers (2021-01-11T18:47:11Z)
- On the Intrinsic Differential Privacy of Bagging [69.70602220716718]
Our experimental results demonstrate that Bagging achieves significantly higher accuracies than state-of-the-art differentially private machine learning methods with the same privacy budgets.
arXiv Detail & Related papers (2020-08-22T14:17:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.