Membership Inference Attacks and Defenses in Classification Models
- URL: http://arxiv.org/abs/2002.12062v3
- Date: Wed, 3 Feb 2021 01:43:20 GMT
- Title: Membership Inference Attacks and Defenses in Classification Models
- Authors: Jiacheng Li, Ninghui Li, Bruno Ribeiro
- Abstract summary: We study the membership inference (MI) attack against classifiers.
We find that a model's vulnerability to MI attacks is tightly related to the generalization gap.
We propose a defense against MI attacks that aims to close the gap by intentionally reducing the training accuracy.
- Score: 19.498313593713043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the membership inference (MI) attack against classifiers, where the
attacker's goal is to determine whether a data instance was used for training
the classifier. Through systematic cataloging of existing MI attacks and
extensive experimental evaluations of them, we find that a model's
vulnerability to MI attacks is tightly related to the generalization gap -- the
difference between training accuracy and test accuracy. We then propose a
defense against MI attacks that aims to close the gap by intentionally reducing
the training accuracy. More specifically, the training process attempts to
match the training and validation accuracies, by means of a new {\em set
regularizer} using the Maximum Mean Discrepancy between the softmax output
empirical distributions of the training and validation sets. Our experimental
results show that combining this approach with another simple defense (mix-up
training) significantly improves state-of-the-art defense against MI attacks,
with minimal impact on testing accuracy.
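The two ingredients of the defense can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an RBF kernel for the Maximum Mean Discrepancy, NumPy arrays of softmax outputs, and the function names, `sigma`, and `alpha` values are illustrative choices.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF kernel values between two sets of softmax vectors.
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd2(p, q, sigma=1.0):
    # Biased empirical estimate of the squared Maximum Mean Discrepancy
    # between the empirical distributions of sample sets p and q.
    return (rbf_kernel(p, p, sigma).mean()
            - 2.0 * rbf_kernel(p, q, sigma).mean()
            + rbf_kernel(q, q, sigma).mean())

def mixup_batch(x, y, alpha=1.0, rng=None):
    # Mix-up: convex combinations of random example pairs and their labels.
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    return lam * x + (1 - lam) * x[perm], lam * y + (1 - lam) * y[perm]

# In training, the total loss would combine the usual cross-entropy with
# the set regularizer, e.g.:
#   loss = cross_entropy + lam_reg * mmd2(softmax_train, softmax_val)
```

`mmd2` shrinks toward zero as the softmax output distributions of the training and validation sets come to match, which is precisely the condition the set regularizer pushes the model toward.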
Related papers
- Perturbation-Invariant Adversarial Training for Neural Ranking Models:
Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
- MIST: Defending Against Membership Inference Attacks Through Membership-Invariant Subspace Training [20.303439793769854]
Membership Inference (MI) attacks are a major privacy concern when using private data to train machine learning (ML) models.
We introduce a novel Membership-Invariant Subspace Training (MIST) method to defend against MI attacks.
arXiv Detail & Related papers (2023-11-02T01:25:49Z)
- Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks [72.03945355787776]
We advocate MDP, a lightweight, pluggable, and effective defense for PLMs as few-shot learners.
We show analytically that MDP creates an interesting dilemma for the attacker to choose between attack effectiveness and detection evasiveness.
arXiv Detail & Related papers (2023-09-23T04:41:55Z)
- Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction [2.2336243882030025]
Machine learning models are vulnerable to membership inference attacks (MIAs)
This work proposes a defense technique, HAMP, that can achieve both strong membership privacy and high accuracy, without requiring extra data.
arXiv Detail & Related papers (2023-07-04T09:50:33Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Membership Inference Attacks by Exploiting Loss Trajectory [19.900473800648243]
We propose a new attack method that exploits the membership information from the whole training process of the target model.
Our attack achieves at least a 6× higher true-positive rate at a low false-positive rate of 0.1% than existing methods.
arXiv Detail & Related papers (2022-08-31T16:02:26Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
- Improving Robustness to Model Inversion Attacks via Mutual Information Regularization [12.079281416410227]
This paper studies defense mechanisms against model inversion (MI) attacks.
MI is a type of privacy attack that aims to infer information about the training data distribution, given access to a target machine learning model.
We propose the Mutual Information Regularization based Defense (MID) against MI attacks.
arXiv Detail & Related papers (2020-09-11T06:02:44Z)
- On the Difficulty of Membership Inference Attacks [11.172550334631921]
Recent studies propose membership inference (MI) attacks on deep models.
Despite their apparent success, these studies only report the accuracy, precision, and recall of the positive (member) class.
We show that the way MI attack performance has been reported is often misleading: these attacks suffer from a high false positive rate, or false alarm rate (FAR), which has gone unreported.
arXiv Detail & Related papers (2020-05-27T23:09:17Z)
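The false-alarm concern raised in the last entry can be made concrete with a small sketch. This is a generic confidence-thresholding attack for illustration, not any listed paper's method; the threshold value is an assumption.

```python
import numpy as np

def threshold_attack(top_confidences, threshold=0.9):
    # Predict "member" whenever the model's top softmax confidence exceeds
    # the threshold (training examples tend to be fit more confidently).
    return top_confidences >= threshold

def false_alarm_rate(predictions, is_member):
    # FAR: fraction of true non-members that the attack flags as members.
    non_members = ~is_member
    return predictions[non_members].mean()
```

Reporting only member-class precision and recall hides the FAR: an attack can look accurate on members while also flagging a large fraction of non-members.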
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.