Membership Inference Attacks and Defenses in Classification Models
- URL: http://arxiv.org/abs/2002.12062v3
- Date: Wed, 3 Feb 2021 01:43:20 GMT
- Title: Membership Inference Attacks and Defenses in Classification Models
- Authors: Jiacheng Li, Ninghui Li, Bruno Ribeiro
- Abstract summary: We study the membership inference (MI) attack against classifiers.
We find that a model's vulnerability to MI attacks is tightly related to the generalization gap.
We propose a defense against MI attacks that aims to close the gap by intentionally reducing the training accuracy.
- Score: 19.498313593713043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the membership inference (MI) attack against classifiers, where the
attacker's goal is to determine whether a data instance was used for training
the classifier. Through systematic cataloging of existing MI attacks and
extensive experimental evaluations of them, we find that a model's
vulnerability to MI attacks is tightly related to the generalization gap -- the
difference between training accuracy and test accuracy. We then propose a
defense against MI attacks that aims to close the gap by intentionally reducing
the training accuracy. More specifically, the training process attempts to
match the training and validation accuracies, by means of a new {\em set
regularizer} using the Maximum Mean Discrepancy between the softmax output
empirical distributions of the training and validation sets. Our experimental
results show that combining this approach with another simple defense (mix-up
training) significantly improves state-of-the-art defense against MI attacks,
with minimal impact on testing accuracy.
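The two ingredients of the defense can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an RBF kernel for the Maximum Mean Discrepancy, NumPy arrays of softmax outputs, and the function names, `sigma`, and `alpha` values are illustrative choices.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF kernel values between two sets of softmax vectors.
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd2(p, q, sigma=1.0):
    # Biased empirical estimate of the squared Maximum Mean Discrepancy
    # between the empirical distributions of sample sets p and q.
    return (rbf_kernel(p, p, sigma).mean()
            - 2.0 * rbf_kernel(p, q, sigma).mean()
            + rbf_kernel(q, q, sigma).mean())

def mixup_batch(x, y, alpha=1.0, rng=None):
    # Mix-up: convex combinations of random example pairs and their labels.
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    return lam * x + (1 - lam) * x[perm], lam * y + (1 - lam) * y[perm]

# In training, the total loss would combine the usual cross-entropy with
# the set regularizer, e.g.:
#   loss = cross_entropy + lam_reg * mmd2(softmax_train, softmax_val)
```

`mmd2` shrinks toward zero as the softmax output distributions of the training and validation sets come to match, which is precisely the condition the set regularizer pushes the model toward.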
Related papers
- Perturbation-Invariant Adversarial Training for Neural Ranking Models:
Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
- MIST: Defending Against Membership Inference Attacks Through Membership-Invariant Subspace Training [20.303439793769854]
Membership Inference (MI) attacks are a major privacy concern when using private data to train machine learning (ML) models.
We introduce a novel Membership-Invariant Subspace Training (MIST) method to defend against MI attacks.
arXiv Detail & Related papers (2023-11-02T01:25:49Z)
- Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks [72.03945355787776]
We advocate MDP, a lightweight, pluggable, and effective defense for PLMs as few-shot learners.
We show analytically that MDP creates an interesting dilemma for the attacker to choose between attack effectiveness and detection evasiveness.
arXiv Detail & Related papers (2023-09-23T04:41:55Z)
- Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction [2.2336243882030025]
Machine learning models are vulnerable to membership inference attacks (MIAs)
This work proposes a defense technique, HAMP, that can achieve both strong membership privacy and high accuracy, without requiring extra data.
arXiv Detail & Related papers (2023-07-04T09:50:33Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Membership Inference Attacks by Exploiting Loss Trajectory [19.900473800648243]
We propose a new attack method that exploits the membership information from the whole training process of the target model.
Our attack achieves at least a 6× higher true-positive rate at a low false-positive rate of 0.1% than existing methods.
arXiv Detail & Related papers (2022-08-31T16:02:26Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
- Improving Robustness to Model Inversion Attacks via Mutual Information Regularization [12.079281416410227]
This paper studies defense mechanisms against model inversion (MI) attacks.
MI is a type of privacy attack that aims to infer information about the training data distribution, given access to a target machine learning model.
We propose the Mutual Information Regularization based Defense (MID) against MI attacks.
arXiv Detail & Related papers (2020-09-11T06:02:44Z)
- On the Difficulty of Membership Inference Attacks [11.172550334631921]
Recent studies propose membership inference (MI) attacks on deep models.
Despite their apparent success, these studies only report the accuracy, precision, and recall of the positive (member) class.
We show that the way MI attack performance has been reported is often misleading: these attacks suffer from a high false positive rate, or false alarm rate (FAR), which has gone unreported.
arXiv Detail & Related papers (2020-05-27T23:09:17Z)
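The false-alarm concern raised in the last entry can be made concrete with a small sketch. This is a generic confidence-thresholding attack for illustration, not any listed paper's method; the threshold value is an assumption.

```python
import numpy as np

def threshold_attack(top_confidences, threshold=0.9):
    # Predict "member" whenever the model's top softmax confidence exceeds
    # the threshold (training examples tend to be fit more confidently).
    return top_confidences >= threshold

def false_alarm_rate(predictions, is_member):
    # FAR: fraction of true non-members that the attack flags as members.
    non_members = ~is_member
    return predictions[non_members].mean()
```

Reporting only member-class precision and recall hides the FAR: an attack can look accurate on members while also flagging a large fraction of non-members.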
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.