On the Difficulty of Membership Inference Attacks
- URL: http://arxiv.org/abs/2005.13702v3
- Date: Mon, 22 Mar 2021 20:17:22 GMT
- Title: On the Difficulty of Membership Inference Attacks
- Authors: Shahbaz Rezaei and Xin Liu
- Abstract summary: Recent studies propose membership inference (MI) attacks on deep models.
Despite their apparent success, these studies only report accuracy, precision, and recall of the positive class (member class).
We show that the way MI attack performance has been reported is often misleading, because these attacks suffer from a high false positive rate, or false alarm rate (FAR), which has not been reported.
- Score: 11.172550334631921
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies propose membership inference (MI) attacks on deep models,
where the goal is to infer if a sample has been used in the training process.
Despite their apparent success, these studies only report accuracy, precision,
and recall of the positive class (member class). Hence, the performance of
these attacks on the negative class (non-member class) has not been clearly
reported. In this paper, we show that the way MI attack performance has been
reported is often misleading, because these attacks suffer from a high false
positive rate, or false alarm rate (FAR), that has not been reported. FAR shows
how often the attack model mislabels non-training (non-member) samples as
training (member) ones. The high FAR makes MI attacks fundamentally
impractical, which is particularly significant for tasks such as membership
inference where the majority of samples in reality belong to the negative
(non-training) class. Moreover, we show that current MI attack models can
identify the membership of misclassified samples, which constitute only a very
small portion of training samples, with at best mediocre accuracy.
We analyze several new features that have not been comprehensively explored
for membership inference before, including the distance to the decision
boundary and gradient norms, and conclude that deep models' responses are
mostly similar between training and non-training samples. We conduct several
experiments on image classification tasks, including MNIST, CIFAR-10,
CIFAR-100, and ImageNet, using various model architectures, including LeNet,
AlexNet, ResNet, etc. We show that
the current state-of-the-art MI attacks cannot achieve high accuracy and low
FAR at the same time, even when the attacker is given several advantages.
The source code is available at https://github.com/shrezaei/MI-Attack.
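To make the reporting point concrete, here is a minimal sketch, in Python with made-up numbers (not the code from the repository above), of how FAR can be computed alongside the usually reported metrics:

```python
import numpy as np

# Hypothetical attack predictions on a 1:9 member/non-member mix.
# 1 = predicted member, 0 = predicted non-member; all numbers are synthetic.
rng = np.random.default_rng(0)
y_true = np.concatenate([np.ones(100, dtype=int), np.zeros(900, dtype=int)])
# Simulate an attack that flags 70% of members and 30% of non-members as members.
y_pred = (rng.random(1000) < np.where(y_true == 1, 0.7, 0.3)).astype(int)

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
tn = np.sum((y_pred == 0) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)        # member-class recall, commonly reported
far = fp / (fp + tn)           # false alarm rate, the metric the paper adds
print(f"acc={accuracy:.2f}  precision={precision:.2f}  recall={recall:.2f}  FAR={far:.2f}")
```

With this mix the attack reaches a recall of about 0.7, yet most of its "member" flags are false alarms; hiding FAR hides exactly this failure mode.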
Related papers
- Membership Inference Attacks on Diffusion Models via Quantile Regression [30.30033625685376]
We demonstrate a privacy vulnerability of diffusion models through a membership inference (MI) attack.
Our proposed MI attack learns quantile regression models that predict (a quantile of) the distribution of reconstruction loss on examples not used in training.
We show that our attack outperforms the prior state-of-the-art attack while being substantially less computationally expensive.
arXiv Detail & Related papers (2023-12-08T16:21:24Z)
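The quantile-regression test summarized in the entry above can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not that paper's implementation: it presumes we already have a feature vector and a reconstruction loss per example, uses scikit-learn's quantile loss for the regressor, and runs on synthetic data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-ins: features and reconstruction losses for examples known NOT to
# be in the diffusion model's training set, plus the examples whose membership we test.
rng = np.random.default_rng(0)
nonmember_feats = rng.normal(size=(1000, 16))
nonmember_losses = rng.gamma(2.0, 1.0, size=1000)
target_feats = rng.normal(size=(8, 16))
target_losses = rng.gamma(2.0, 1.0, size=8)

# Fit a regressor for the alpha-quantile of the non-member loss distribution,
# conditioned on the example's features; alpha caps the false positive rate.
alpha = 0.05
qreg = GradientBoostingRegressor(loss="quantile", alpha=alpha, n_estimators=100)
qreg.fit(nonmember_feats, nonmember_losses)

# Declare "member" when the observed reconstruction loss falls below the predicted
# alpha-quantile for comparable non-members (members tend to be reconstructed better).
is_member = target_losses < qreg.predict(target_feats)
print(is_member)
```

Calibrating the per-example threshold on non-member losses is what keeps the test's false positive rate near alpha by construction.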
- MIST: Defending Against Membership Inference Attacks Through Membership-Invariant Subspace Training [20.303439793769854]
Membership inference (MI) attacks are a major privacy concern when using private data to train machine learning (ML) models.
We introduce a novel Membership-Invariant Subspace Training (MIST) method to defend against MI attacks.
arXiv Detail & Related papers (2023-11-02T01:25:49Z)
- Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks [72.03945355787776]
We advocate MDP, a lightweight, pluggable, and effective defense for PLMs as few-shot learners.
We show analytically that MDP creates an interesting dilemma for the attacker to choose between attack effectiveness and detection evasiveness.
arXiv Detail & Related papers (2023-09-23T04:41:55Z)
- On the Discredibility of Membership Inference Attacks [11.172550334631921]
Membership inference attacks are proposed to determine if a sample was part of the training set or not.
We show that MI models frequently misclassify non-member samples that neighbor a member sample as members.
We argue that current membership inference attacks can identify memorized subpopulations, but they cannot reliably identify which exact sample in the subpopulation was used during the training.
arXiv Detail & Related papers (2022-12-06T01:48:27Z)
- NeFSAC: Neurally Filtered Minimal Samples [90.55214606751453]
NeFSAC is an efficient algorithm for neural filtering of motion-inconsistent and poorly-conditioned minimal samples.
NeFSAC can be plugged into any existing RANSAC-based pipeline.
We tested NeFSAC on more than 100k image pairs from three publicly available real-world datasets.
arXiv Detail & Related papers (2022-07-16T08:02:05Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- An Efficient Subpopulation-based Membership Inference Attack [11.172550334631921]
We introduce a fundamentally different MI attack approach which obviates the need to train hundreds of shadow models.
We achieve the state-of-the-art membership inference accuracy while significantly reducing the training cost.
arXiv Detail & Related papers (2022-03-04T00:52:06Z)
- "What's in the box?!": Deflecting Adversarial Attacks by Randomly Deploying Adversarially-Disjoint Models [71.91835408379602]
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
arXiv Detail & Related papers (2021-02-09T20:07:13Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- Adversarial examples are useful too! [47.64219291655723]
I propose a new method to tell whether a model has been subject to a backdoor attack.
The idea is to generate adversarial examples, targeted or untargeted, using conventional attacks such as FGSM.
It is possible to visually locate the perturbed regions and unveil the attack.
arXiv Detail & Related papers (2020-05-13T01:38:56Z)
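The "Adversarial examples are useful too!" entry above relies on conventional attacks such as FGSM, which is easy to sketch. The snippet below is a generic one-step untargeted FGSM with a placeholder model and dummy inputs, my own illustration rather than that paper's code:

```python
import torch
import torch.nn.functional as F

# Placeholder classifier and input; in the described setting the model would be the
# possibly-backdoored network under inspection and x a clean test image.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28)            # dummy MNIST-sized image
y = torch.tensor([3])                   # dummy label

def fgsm_untargeted(model, x, y, eps=0.1):
    """One-step untargeted FGSM: move x in the direction that increases the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    grad = x_adv.grad.detach()
    x_adv = (x_adv + eps * grad.sign()).clamp(0.0, 1.0).detach()
    return x_adv, grad

x_adv, grad = fgsm_untargeted(model, x, y)
# Input-gradient magnitude: where the loss is most sensitive to the input. The entry
# above suggests that inspecting such adversarial examples and the regions they
# exploit can visually reveal a backdoor trigger.
saliency = grad.abs().squeeze()
print(x_adv.shape, saliency.shape)
```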
- Membership Inference Attacks and Defenses in Classification Models [19.498313593713043]
We study the membership inference (MI) attack against classifiers.
We find that a model's vulnerability to MI attacks is tightly related to the generalization gap.
We propose a defense against MI attacks that aims to close the gap by intentionally reducing the training accuracy.
arXiv Detail & Related papers (2020-02-27T12:35:36Z)
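To make the generalization-gap connection in the last entry concrete, here is a toy sketch of the classic "gap attack" baseline, which predicts "member" exactly when the model classifies a sample correctly; on a balanced member/non-member split its accuracy is roughly 1/2 + (training accuracy - test accuracy)/2. The correctness indicators below are synthetic placeholders, not results from a real model:

```python
import numpy as np

# Hypothetical per-sample correctness indicators for a trained classifier
# (True = classified correctly), on its training set and on held-out data.
rng = np.random.default_rng(0)
train_correct = rng.random(5000) < 0.99      # ~99% training accuracy (overfit model)
test_correct = rng.random(5000) < 0.80       # ~80% test accuracy

gap = train_correct.mean() - test_correct.mean()

# "Gap attack" baseline: guess "member" iff the sample is classified correctly.
# On this balanced member/non-member mix, its accuracy is ~0.5 + gap/2.
pred_member_on_members = train_correct          # correct -> predicted member
pred_member_on_nonmembers = test_correct        # correct -> false alarm
attack_acc = 0.5 * (pred_member_on_members.mean() + (1 - pred_member_on_nonmembers.mean()))
far = pred_member_on_nonmembers.mean()          # false alarm rate of this baseline

print(f"generalization gap = {gap:.3f}")
print(f"gap-attack accuracy = {attack_acc:.3f}, FAR = {far:.3f}")
```

For this simple baseline the FAR equals the model's test accuracy, so it raises alarms on most non-members even while being barely better than chance, which ties back to the main paper's argument about unreported false alarms.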