Do Not Trust Prediction Scores for Membership Inference Attacks
- URL: http://arxiv.org/abs/2111.09076v1
- Date: Wed, 17 Nov 2021 12:39:04 GMT
- Title: Do Not Trust Prediction Scores for Membership Inference Attacks
- Authors: Dominik Hintersdorf, Lukas Struppek, Kristian Kersting
- Abstract summary: Membership inference attacks (MIAs) aim to determine whether a specific sample was used to train a predictive model.
We argue that relying on a model's prediction scores is a fallacy for many modern deep network architectures.
Using generative adversarial networks, we are able to produce a potentially infinite number of samples falsely classified as part of the training data.
- Score: 15.567057178736402
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Membership inference attacks (MIAs) aim to determine whether a specific
sample was used to train a predictive model. Knowing this may indeed lead to a
privacy breach. Arguably, most MIAs, however, make use of the model's
prediction scores - the probability of each output given some input - following
the intuition that the trained model tends to behave differently on its
training data. We argue that this is a fallacy for many modern deep network
architectures, e.g., ReLU type neural networks produce almost always high
prediction scores far away from the training data. Consequently, MIAs will
miserably fail since this behavior leads to high false-positive rates not only
on known domains but also on out-of-distribution data and implicitly acts as a
defense against MIAs. Specifically, using generative adversarial networks, we
are able to produce a potentially infinite number of samples falsely classified
as part of the training data. In other words, the threat of MIAs is
overestimated and less information is leaked than previously assumed. Moreover,
there is actually a trade-off between the overconfidence of classifiers and
their susceptibility to MIAs: the more classifiers know when they do not know,
making low confidence predictions far away from the training data, the more
they reveal the training data.
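This failure mode can be illustrated with a small, self-contained sketch (not the paper's experimental setup; the random two-layer network, the 0.9 threshold, and the Gaussian inputs are placeholder assumptions). A score-based MIA that flags any input whose maximum softmax probability exceeds a threshold will also flag inputs far from the data, because a piecewise-linear ReLU network's logits keep growing with the input norm:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a trained target classifier (hypothetical: a random two-layer
# ReLU network; the paper evaluates real, trained deep networks).
target = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

def score_based_mia(x: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """Classic score-based MIA: declare 'member' if the maximum softmax
    probability (prediction score) exceeds a fixed threshold."""
    with torch.no_grad():
        probs = torch.softmax(target(x), dim=-1)
    return probs.max(dim=-1).values > threshold

near_data = torch.randn(1000, 32)          # inputs at the scale of typical data
far_away = 100.0 * torch.randn(1000, 32)   # inputs far from any training data

# Logits of a ReLU network scale roughly with the input norm, so the softmax
# saturates on the far-away inputs and the attack flags them as members.
print("flagged near data:", score_based_mia(near_data).float().mean().item())
print("flagged far away :", score_based_mia(far_away).float().mean().item())
```

In the paper's setting, GAN-generated samples play the role of the far-away inputs here: they receive high prediction scores and therefore produce false positives for the attack.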
Related papers
- Confidence Is All You Need for MI Attacks [7.743155804758186]
We propose a new method to gauge a data point's membership in a model's training set.
During training, the model is essentially being 'fit' to the training data and may therefore have particular difficulty generalizing to unseen data.
arXiv Detail & Related papers (2023-11-26T18:09:24Z)
- When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks [17.243744418309593]
We propose an efficient MIA method against fairness-enhanced models based on fairness discrepancy results.
We also explore potential strategies for mitigating privacy leakages.
arXiv Detail & Related papers (2023-11-07T10:28:17Z)
- Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction [2.2336243882030025]
Machine learning models are vulnerable to membership inference attacks (MIAs).
This work proposes a defense technique, HAMP, that achieves both strong membership privacy and high accuracy without requiring extra data; a loose sketch of the general idea of enforcing less confident predictions appears after this entry.
arXiv Detail & Related papers (2023-07-04T09:50:33Z)
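The snippet above does not describe HAMP's actual mechanism; as a generic illustration of training a classifier to be less confident, one can add an entropy bonus to the loss so near-one-hot outputs are penalized. The toy model, data, and weight `beta` below are assumptions, not the authors' method:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

def less_confident_loss(logits, labels, beta=0.5):
    """Cross-entropy plus a confidence penalty: subtracting the predictive
    entropy discourages near-one-hot (overconfident) output distributions."""
    ce = F.cross_entropy(logits, labels)
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
    return ce - beta * entropy  # higher entropy (less confidence) is rewarded

# Toy training loop on random data, just to show how the loss is used.
x, y = torch.randn(128, 20), torch.randint(0, 5, (128,))
for _ in range(100):
    opt.zero_grad()
    loss = less_confident_loss(model(x), y)
    loss.backward()
    opt.step()

print("mean max-softmax confidence:",
      F.softmax(model(x), dim=-1).max(dim=-1).values.mean().item())
```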
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
However, FL is vulnerable to poisoning attacks that undermine model integrity through untargeted performance degradation as well as targeted backdoors.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model; a rough density-ratio sketch in this spirit follows the entry.
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
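A bare-bones version of such a density-based membership test (illustrative only, not DOMIAS itself, whose estimators and calibration are not given here) compares the density of the released synthetic data around a query point with the density under a reference sample from the assumed data distribution, flagging regions where the generator appears locally overfit. The 1-D Gaussians below are placeholder data:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical example: synthetic samples released by a generative model and a
# reference sample drawn from the (assumed known) underlying distribution.
synthetic = rng.normal(0.0, 1.0, size=2000)
reference = rng.normal(0.0, 1.2, size=2000)

p_synth = gaussian_kde(synthetic)   # density of the generator's output
p_ref = gaussian_kde(reference)     # density of the real data distribution

def membership_score(x):
    """Density ratio p_synthetic(x) / p_reference(x): values well above 1
    mark regions the generator covers more heavily than the real
    distribution, i.e. candidate local overfitting on training points."""
    return p_synth(x) / p_ref(x)

queries = np.array([0.0, 2.5])
print(dict(zip(queries.tolist(), membership_score(queries).round(2).tolist())))
```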
- TrustGAN: Training safe and trustworthy deep learning models through generative adversarial networks [0.0]
We present TrustGAN, a generative adversarial network pipeline targeting trustness.
The pipeline can accept any given deep learning model which outputs a prediction and a confidence on this prediction.
It is applied here to a target classification model trained on MNIST data to recognise digits from images.
arXiv Detail & Related papers (2022-11-25T09:57:23Z)
- ZigZag: Universal Sampling-free Uncertainty Estimation Through Two-Step Inference [54.17205151960878]
We introduce a sampling-free approach that is generic and easy to deploy.
We produce reliable uncertainty estimates on par with state-of-the-art methods at a significantly lower computational cost.
arXiv Detail & Related papers (2022-11-21T13:23:09Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target; a simple illustrative relaxation is sketched after this entry.
RelaxLoss is applicable to any classification model, with the added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
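One simple way to realize a "more achievable" learning target (illustrative only, not the authors' exact RelaxLoss procedure) is to stop pushing the per-sample cross-entropy below a floor `alpha`, so the model no longer fits its training points arbitrarily tightly and member and non-member losses look more alike. The floor value and toy model below are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

ALPHA = 0.8  # target loss level; losses below this are not optimized further

def relaxed_loss(logits, labels, alpha=ALPHA):
    """Per-sample cross-entropy, clamped from below at alpha: once a sample's
    loss reaches the target level, it stops contributing gradient."""
    ce = F.cross_entropy(logits, labels, reduction="none")
    return torch.clamp(ce, min=alpha).mean()

x, y = torch.randn(256, 20), torch.randint(0, 5, (256,))
for _ in range(200):
    opt.zero_grad()
    relaxed_loss(model(x), y).backward()
    opt.step()

print("mean training cross-entropy:", F.cross_entropy(model(x), y).item())
```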
- Investigating Membership Inference Attacks under Data Dependencies [26.70764798408236]
Training machine learning models on privacy-sensitive data has opened the door to new attacks that can have serious privacy implications.
One such attack, the Membership Inference Attack (MIA), exposes whether or not a particular data point was used to train a model.
We evaluate the defence under the restrictive assumption that all members of the training set, as well as non-members, are independent and identically distributed.
arXiv Detail & Related papers (2020-10-23T00:16:46Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.