Do Not Trust Prediction Scores for Membership Inference Attacks
- URL: http://arxiv.org/abs/2111.09076v1
- Date: Wed, 17 Nov 2021 12:39:04 GMT
- Title: Do Not Trust Prediction Scores for Membership Inference Attacks
- Authors: Dominik Hintersdorf, Lukas Struppek, Kristian Kersting
- Abstract summary: Membership inference attacks (MIAs) aim to determine whether a specific sample was used to train a predictive model.
We argue that relying on a model's prediction scores is a fallacy for many modern deep network architectures.
Using generative adversarial networks, we are able to produce a potentially infinite number of samples falsely classified as part of the training data.
- Score: 15.567057178736402
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Membership inference attacks (MIAs) aim to determine whether a specific
sample was used to train a predictive model. Knowing this may indeed lead to a
privacy breach. Arguably, most MIAs, however, make use of the model's
prediction scores - the probability of each output given some input - following
the intuition that the trained model tends to behave differently on its
training data. We argue that this is a fallacy for many modern deep network
architectures, e.g., ReLU type neural networks produce almost always high
prediction scores far away from the training data. Consequently, MIAs will
miserably fail since this behavior leads to high false-positive rates not only
on known domains but also on out-of-distribution data and implicitly acts as a
defense against MIAs. Specifically, using generative adversarial networks, we
are able to produce a potentially infinite number of samples falsely classified
as part of the training data. In other words, the threat of MIAs is
overestimated and less information is leaked than previously assumed. Moreover,
there is actually a trade-off between the overconfidence of classifiers and
their susceptibility to MIAs: the more classifiers know when they do not know,
making low confidence predictions far away from the training data, the more
they reveal the training data.
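This failure mode can be illustrated with a small, self-contained sketch (not the paper's experimental setup; the random two-layer network, the 0.9 threshold, and the Gaussian inputs are placeholder assumptions). A score-based MIA that flags any input whose maximum softmax probability exceeds a threshold will also flag inputs far from the data, because a piecewise-linear ReLU network's logits keep growing with the input norm:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a trained target classifier (hypothetical: a random two-layer
# ReLU network; the paper evaluates real, trained deep networks).
target = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

def score_based_mia(x: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """Classic score-based MIA: declare 'member' if the maximum softmax
    probability (prediction score) exceeds a fixed threshold."""
    with torch.no_grad():
        probs = torch.softmax(target(x), dim=-1)
    return probs.max(dim=-1).values > threshold

near_data = torch.randn(1000, 32)          # inputs at the scale of typical data
far_away = 100.0 * torch.randn(1000, 32)   # inputs far from any training data

# Logits of a ReLU network scale roughly with the input norm, so the softmax
# saturates on the far-away inputs and the attack flags them as members.
print("flagged near data:", score_based_mia(near_data).float().mean().item())
print("flagged far away :", score_based_mia(far_away).float().mean().item())
```

In the paper's setting, GAN-generated samples play the role of the far-away inputs here: they receive high prediction scores and therefore produce false positives for the attack.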
Related papers
- Confidence Is All You Need for MI Attacks [7.743155804758186]
We propose a new method to gauge a data point's membership in a model's training set.
During training, the model is essentially being 'fit' to the training data and may therefore have particular difficulty generalizing to unseen data.
arXiv Detail & Related papers (2023-11-26T18:09:24Z)
- When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks [17.243744418309593]
We propose an efficient MIA method against fairness-enhanced models based on fairness discrepancy results.
We also explore potential strategies for mitigating privacy leakages.
arXiv Detail & Related papers (2023-11-07T10:28:17Z)
- Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction [2.2336243882030025]
Machine learning models are vulnerable to membership inference attacks (MIAs).
This work proposes a defense technique, HAMP, that achieves both strong membership privacy and high accuracy without requiring extra data; a loose sketch of the general idea of enforcing less confident predictions appears after this entry.
arXiv Detail & Related papers (2023-07-04T09:50:33Z)
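The snippet above does not describe HAMP's actual mechanism; as a generic illustration of training a classifier to be less confident, one can add an entropy bonus to the loss so near-one-hot outputs are penalized. The toy model, data, and weight `beta` below are assumptions, not the authors' method:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

def less_confident_loss(logits, labels, beta=0.5):
    """Cross-entropy plus a confidence penalty: subtracting the predictive
    entropy discourages near-one-hot (overconfident) output distributions."""
    ce = F.cross_entropy(logits, labels)
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
    return ce - beta * entropy  # higher entropy (less confidence) is rewarded

# Toy training loop on random data, just to show how the loss is used.
x, y = torch.randn(128, 20), torch.randint(0, 5, (128,))
for _ in range(100):
    opt.zero_grad()
    loss = less_confident_loss(model(x), y)
    loss.backward()
    opt.step()

print("mean max-softmax confidence:",
      F.softmax(model(x), dim=-1).max(dim=-1).values.mean().item())
```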
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
However, FL is vulnerable to poisoning attacks that undermine model integrity through untargeted performance degradation as well as targeted backdoors.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model; a rough density-ratio sketch in this spirit follows the entry.
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
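A bare-bones version of such a density-based membership test (illustrative only, not DOMIAS itself, whose estimators and calibration are not given here) compares the density of the released synthetic data around a query point with the density under a reference sample from the assumed data distribution, flagging regions where the generator appears locally overfit. The 1-D Gaussians below are placeholder data:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical example: synthetic samples released by a generative model and a
# reference sample drawn from the (assumed known) underlying distribution.
synthetic = rng.normal(0.0, 1.0, size=2000)
reference = rng.normal(0.0, 1.2, size=2000)

p_synth = gaussian_kde(synthetic)   # density of the generator's output
p_ref = gaussian_kde(reference)     # density of the real data distribution

def membership_score(x):
    """Density ratio p_synthetic(x) / p_reference(x): values well above 1
    mark regions the generator covers more heavily than the real
    distribution, i.e. candidate local overfitting on training points."""
    return p_synth(x) / p_ref(x)

queries = np.array([0.0, 2.5])
print(dict(zip(queries.tolist(), membership_score(queries).round(2).tolist())))
```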
- TrustGAN: Training safe and trustworthy deep learning models through generative adversarial networks [0.0]
We present TrustGAN, a generative adversarial network pipeline targeting trustness.
The pipeline can accept any given deep learning model which outputs a prediction and a confidence on this prediction.
It is applied here to a target classification model trained on MNIST data to recognise digits from images.
arXiv Detail & Related papers (2022-11-25T09:57:23Z)
- ZigZag: Universal Sampling-free Uncertainty Estimation Through Two-Step Inference [54.17205151960878]
We introduce a sampling-free approach that is generic and easy to deploy.
We produce reliable uncertainty estimates on par with state-of-the-art methods at a significantly lower computational cost.
arXiv Detail & Related papers (2022-11-21T13:23:09Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target; a simple illustrative relaxation is sketched after this entry.
RelaxLoss is applicable to any classification model, with the added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
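One simple way to realize a "more achievable" learning target (illustrative only, not the authors' exact RelaxLoss procedure) is to stop pushing the per-sample cross-entropy below a floor `alpha`, so the model no longer fits its training points arbitrarily tightly and member and non-member losses look more alike. The floor value and toy model below are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

ALPHA = 0.8  # target loss level; losses below this are not optimized further

def relaxed_loss(logits, labels, alpha=ALPHA):
    """Per-sample cross-entropy, clamped from below at alpha: once a sample's
    loss reaches the target level, it stops contributing gradient."""
    ce = F.cross_entropy(logits, labels, reduction="none")
    return torch.clamp(ce, min=alpha).mean()

x, y = torch.randn(256, 20), torch.randint(0, 5, (256,))
for _ in range(200):
    opt.zero_grad()
    relaxed_loss(model(x), y).backward()
    opt.step()

print("mean training cross-entropy:", F.cross_entropy(model(x), y).item())
```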
- Investigating Membership Inference Attacks under Data Dependencies [26.70764798408236]
Training machine learning models on privacy-sensitive data has opened the door to new attacks that can have serious privacy implications.
One such attack, the Membership Inference Attack (MIA), exposes whether or not a particular data point was used to train a model.
We evaluate the defence under the restrictive assumption that all members of the training set, as well as non-members, are independent and identically distributed.
arXiv Detail & Related papers (2020-10-23T00:16:46Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.