Systematic Evaluation of Privacy Risks of Machine Learning Models
- URL: http://arxiv.org/abs/2003.10595v2
- Date: Wed, 9 Dec 2020 18:56:31 GMT
- Title: Systematic Evaluation of Privacy Risks of Machine Learning Models
- Authors: Liwei Song, Prateek Mittal
- Abstract summary: We show that prior work on membership inference attacks may severely underestimate the privacy risks.
We first propose to benchmark membership inference privacy risks by improving existing non-neural network based inference attacks.
We then introduce a new approach for fine-grained privacy analysis by formulating and deriving a new metric called the privacy risk score.
- Score: 41.017707772150835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are prone to memorizing sensitive data, making them
vulnerable to membership inference attacks in which an adversary aims to guess
if an input sample was used to train the model. In this paper, we show that
prior work on membership inference attacks may severely underestimate the
privacy risks by relying solely on training custom neural network classifiers
to perform attacks and focusing only on the aggregate results over data
samples, such as the attack accuracy. To overcome these limitations, we first
propose to benchmark membership inference privacy risks by improving existing
non-neural network based inference attacks and proposing a new inference attack
method based on a modification of prediction entropy. We also propose
benchmarks for defense mechanisms by accounting for adaptive adversaries with
knowledge of the defense and also accounting for the trade-off between model
accuracy and privacy risks. Using our benchmark attacks, we demonstrate that
existing defense approaches are not as effective as previously reported.
Next, we introduce a new approach for fine-grained privacy analysis by
formulating and deriving a new metric called the privacy risk score. Our
privacy risk score metric measures an individual sample's likelihood of being a
training member, which allows an adversary to identify samples with high
privacy risks and perform attacks with high confidence. We experimentally
validate the effectiveness of the privacy risk score and demonstrate that the
distribution of privacy risk score across individual samples is heterogeneous.
Finally, we perform an in-depth investigation for understanding why certain
samples have high privacy risks, including correlations with model sensitivity,
generalization error, and feature embeddings. Our work emphasizes the
importance of a systematic and rigorous evaluation of privacy risks of machine
learning models.
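For readers who want the paper's two core quantities in code form, below is a minimal NumPy sketch. `modified_entropy` implements the modified prediction entropy from the paper, Mentr(F(x), y) = -(1 - F_y(x)) log F_y(x) - sum_{i != y} F_i(x) log(1 - F_i(x)), where lower values indicate likely training members. `privacy_risk_score` is a simplified, hypothetical histogram-based estimate of the posterior P(member | score) standing in for the paper's per-class estimation from shadow-model score distributions; function names and parameters are illustrative, not the authors' code.

```python
import numpy as np

def modified_entropy(probs, labels, eps=1e-12):
    """Modified prediction entropy (Mentr): lower values suggest membership.
    probs: (n, num_classes) softmax outputs; labels: (n,) true class labels."""
    n = probs.shape[0]
    p_y = probs[np.arange(n), labels]
    log_1m = np.log(np.clip(1.0 - probs, eps, 1.0))
    # -(1 - p_y) * log(p_y) for the true class
    true_term = -(1.0 - p_y) * np.log(np.clip(p_y, eps, 1.0))
    # -sum_{i != y} p_i * log(1 - p_i) for the wrong classes
    wrong_term = -(probs * log_1m).sum(axis=1) + p_y * log_1m[np.arange(n), labels]
    return true_term + wrong_term

def privacy_risk_score(target_scores, shadow_member_scores,
                       shadow_nonmember_scores, n_bins=30, prior_member=0.5):
    """Posterior probability of membership P(member | score), estimated from
    histograms of shadow-model scores (e.g. modified entropy). A simplified
    stand-in for the paper's per-class estimation."""
    lo = min(shadow_member_scores.min(), shadow_nonmember_scores.min())
    hi = max(shadow_member_scores.max(), shadow_nonmember_scores.max())
    bins = np.linspace(lo, hi, n_bins + 1)
    p_in, _ = np.histogram(shadow_member_scores, bins=bins, density=True)
    p_out, _ = np.histogram(shadow_nonmember_scores, bins=bins, density=True)
    idx = np.clip(np.digitize(target_scores, bins) - 1, 0, n_bins - 1)
    num = p_in[idx] * prior_member
    den = num + p_out[idx] * (1.0 - prior_member)
    return np.where(den > 0.0, num / den, prior_member)
```

An attack would typically pick per-class thresholds on Mentr using shadow data; samples whose estimated risk score approaches 1 are the high-confidence, high-risk cases the paper highlights.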
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
However, backdoor attacks can subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z) - Improved Membership Inference Attacks Against Language Classification Models [0.0]
We present a novel framework for running membership inference attacks against classification models.
We show that this approach achieves higher accuracy than either a single attack model or an attack model per class label.
arXiv Detail & Related papers (2023-10-11T06:09:48Z) - On the Privacy Effect of Data Enhancement via the Lens of Memorization [20.63044895680223]
We propose to investigate privacy from a new perspective called memorization.
Through the lens of memorization, we find that previously deployed MIAs produce misleading results as they are less likely to identify samples with higher privacy risks.
We demonstrate that the generalization gap and privacy leakage are less correlated than previous results suggest.
arXiv Detail & Related papers (2022-08-17T13:02:17Z) - Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
arXiv Detail & Related papers (2021-11-18T13:31:22Z) - Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy might, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural networks training, such as gradient clipping and noise addition, affect the robustness of the model.
arXiv Detail & Related papers (2020-12-14T18:59:24Z) - On Primes, Log-Loss Scores and (No) Privacy [8.679020335206753]
In this paper, we prove that access to log-loss scores enables the adversary to infer the membership of any number of datapoints with full accuracy in a single query.
Our approach requires no attack-model training and no side knowledge on the adversary's part.
arXiv Detail & Related papers (2020-09-17T23:35:12Z) - MACE: A Flexible Framework for Membership Privacy Estimation in Generative Models [14.290199072565162]
We propose the first formal framework for membership privacy estimation in generative models.
Compared to previous works, our framework makes more realistic and flexible assumptions.
arXiv Detail & Related papers (2020-09-11T23:15:05Z) - Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of the attack's performance.
As a defense, we consider differential privacy in the form of gradient perturbation during training of the victim model, as well as output perturbation at prediction time (a generic label-only scoring sketch follows this list).
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
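As referenced above, the following is a rough, hypothetical sketch of the general label-only idea behind such sampling-style attacks: repeatedly query the victim model on perturbed copies of a sample and use the stability of the predicted label as a membership signal. The `query_fn` interface, Gaussian perturbation, and aggregation choice are illustrative assumptions and do not reproduce the paper's exact procedure.

```python
import numpy as np

def label_only_membership_score(query_fn, x, y_true, n_queries=50,
                                noise_std=0.05, rng=None):
    """Generic label-only membership signal: query the victim model on
    randomly perturbed copies of x and measure how often the predicted
    label matches the true label. Training members tend to be classified
    correctly more robustly than non-members.

    query_fn(x_batch) -> predicted labels only (no confidence scores);
    all names here are illustrative placeholders, not the paper's API."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = x[None, ...] + rng.normal(0.0, noise_std, size=(n_queries,) + x.shape)
    preds = query_fn(noisy)
    return float(np.mean(preds == y_true))
```

Thresholding this score (calibrated on shadow data) yields a membership decision without ever observing the victim model's confidence scores.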