MIST: Defending Against Membership Inference Attacks Through Membership-Invariant Subspace Training
- URL: http://arxiv.org/abs/2311.00919v2
- Date: Wed, 29 May 2024 05:34:09 GMT
- Title: MIST: Defending Against Membership Inference Attacks Through Membership-Invariant Subspace Training
- Authors: Jiacheng Li, Ninghui Li, Bruno Ribeiro
- Abstract summary: Membership Inference (MI) attacks are a major privacy concern when using private data to train machine learning (ML) models.
We introduce a novel Membership-Invariant Subspace Training (MIST) method to defend against MI attacks.
- Score: 20.303439793769854
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Membership Inference (MI) attacks, the adversary tries to determine whether an instance was used to train a machine learning (ML) model. MI attacks are a major privacy concern when private data are used to train ML models. Most MI attacks in the literature exploit the fact that ML models are trained to fit the training data well and thus have very low loss on training instances. Most defenses against MI attacks therefore try to make the model fit the training data less well; doing so, however, generally reduces accuracy. We observe that training instances have different degrees of vulnerability to MI attacks: most instances would have low loss even if they were not included in training, so the model can fit them well without MI concerns. An effective defense only needs to (possibly implicitly) identify the instances that are vulnerable to MI attacks and avoid overfitting them. A major challenge is how to achieve this within an efficient training process. Leveraging two distinct recent advances in representation learning, counterfactually-invariant representations and subspace learning methods, we introduce a novel Membership-Invariant Subspace Training (MIST) method to defend against MI attacks. MIST avoids overfitting the vulnerable instances without significant impact on other instances. We have conducted extensive experimental studies comparing MIST with various other state-of-the-art (SOTA) MI defenses against several SOTA MI attacks, and find that MIST outperforms the other defenses while causing minimal reduction in testing accuracy.
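To make the abstract's premise concrete (trained models reach very low loss on their training instances, which most MI attacks exploit), here is a minimal loss-threshold membership inference sketch in Python. It is illustrative only: the function name and the threshold `tau` are assumptions, and it is neither one of the attacks evaluated in the paper nor the MIST defense itself.

```python
import torch
import torch.nn.functional as F

def loss_threshold_mi_attack(model, x, y, tau=0.5):
    """Minimal loss-threshold membership inference attack (illustrative sketch).

    Predicts "member" when the per-instance cross-entropy loss falls below a
    threshold tau, exploiting the fact that trained models tend to have very
    low loss on their training instances. The threshold tau is a hypothetical
    hyperparameter, e.g. calibrated on a shadow/reference dataset.
    """
    model.eval()
    with torch.no_grad():
        logits = model(x.unsqueeze(0))                  # shape: (1, num_classes)
        loss = F.cross_entropy(logits, y.unsqueeze(0))  # per-instance loss
    return loss.item() < tau                            # True => predicted member
```

A defense in the spirit the abstract describes would leave alone the instances whose loss would be low even without training and avoid overfitting the rest; MIST itself achieves this through counterfactually-invariant representations and subspace learning rather than an explicit per-instance threshold.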
Related papers
- Defending against Model Inversion Attacks via Random Erasing [24.04876860999608]
We present a new method to defend against Model Inversion (MI) attacks.
Our idea is based on a novel insight on Random Erasing (RE).
We show that RE can lead to substantial degradation in MI reconstruction quality and attack accuracy.
arXiv Detail & Related papers (2024-09-02T08:37:17Z)
- Model Inversion Robustness: Can Transfer Learning Help? [27.883074562565877]
Model Inversion (MI) attacks aim to reconstruct private training data by abusing access to machine learning models.
We propose Transfer Learning-based Defense against Model Inversion (TL-DMI) to render MI-robust models.
Our method achieves state-of-the-art (SOTA) MI robustness without bells and whistles.
arXiv Detail & Related papers (2024-05-09T07:24:28Z)
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks [72.03945355787776]
We advocate MDP, a lightweight, pluggable, and effective defense for PLMs as few-shot learners.
We show analytically that MDP creates an interesting dilemma for the attacker to choose between attack effectiveness and detection evasiveness.
arXiv Detail & Related papers (2023-09-23T04:41:55Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations to model predictions, which harms benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, which undermine model integrity through both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
We propose MESAS, the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
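As a rough illustration of the relaxed-loss idea summarized in the RelaxLoss entry above (stop pushing per-instance loss below an already-achievable target), here is a hedged sketch; the target value `alpha` and the gating rule are assumptions, not RelaxLoss's exact objective.

```python
import torch
import torch.nn.functional as F

def relaxed_loss(logits, targets, alpha=0.3):
    """Illustrative relaxed training loss (not the exact RelaxLoss objective).

    Instances whose cross-entropy is already at or below the target `alpha`
    contribute no further gradient, so the model is not pushed toward the
    near-zero training loss that membership inference attacks exploit.
    """
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    relaxed = torch.where(per_sample > alpha, per_sample,
                          per_sample.detach())  # at/below target: no gradient
    return relaxed.mean()
```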
- Membership Inference Attack Using Self Influence Functions [43.10140199124212]
Membership inference (MI) attacks aim to determine if a specific data sample was used to train a machine learning model.
We present a novel MI attack that employs influence functions, specifically the samples' self-influence scores, to perform the MI prediction.
Our attack method achieves new state-of-the-art results for both training with and without data augmentations.
arXiv Detail & Related papers (2022-05-26T23:52:26Z)
- How Does Data Augmentation Affect Privacy in Machine Learning? [94.52721115660626]
We propose new MI attacks to utilize the information of augmented data.
We establish the optimal membership inference when the model is trained with augmented data.
arXiv Detail & Related papers (2020-07-21T02:21:10Z)
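The entry above suggests that augmented views of a sample leak additional membership information. A hedged sketch of that idea is to score a candidate by its average loss over augmentations; the `augmentations` list, the averaging rule, and the threshold `tau` are illustrative assumptions, not the optimal attack established in the paper.

```python
import torch
import torch.nn.functional as F

def augmentation_aware_mi_attack(model, x, y, augmentations, tau=0.5):
    """Illustrative MI attack that scores a sample by its average loss over
    augmented views (e.g. flips, crops). Not the paper's optimal attack."""
    model.eval()
    losses = []
    with torch.no_grad():
        for augment in augmentations:             # each augment: tensor -> tensor
            logits = model(augment(x).unsqueeze(0))
            losses.append(F.cross_entropy(logits, y.unsqueeze(0)).item())
    return sum(losses) / len(losses) < tau        # True => predicted member
```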
- On the Difficulty of Membership Inference Attacks [11.172550334631921]
Recent studies propose membership inference (MI) attacks on deep models.
Despite their apparent success, these studies only report accuracy, precision, and recall of the positive (member) class.
We show that the way MI attack performance has been reported is often misleading, because the attacks suffer from a high false positive rate, or false alarm rate (FAR), that has not been reported.
arXiv Detail & Related papers (2020-05-27T23:09:17Z)
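To make the reporting point above concrete, here is a short sketch of how the false alarm rate (FAR) complements precision and recall for binary member/non-member predictions; the array names are illustrative.

```python
import numpy as np

def mi_attack_metrics(pred_member, is_member):
    """Precision, recall, and false alarm rate (FAR) for binary MI predictions.

    pred_member, is_member: boolean arrays where True means "member".
    FAR is the fraction of true non-members wrongly flagged as members,
    which precision and recall on the member class alone can hide.
    """
    pred_member, is_member = np.asarray(pred_member), np.asarray(is_member)
    tp = np.sum(pred_member & is_member)
    fp = np.sum(pred_member & ~is_member)
    fn = np.sum(~pred_member & is_member)
    tn = np.sum(~pred_member & ~is_member)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, recall, far
```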
- Membership Inference Attacks and Defenses in Classification Models [19.498313593713043]
We study the membership inference (MI) attack against classifiers.
We find that a model's vulnerability to MI attacks is tightly related to the generalization gap.
We propose a defense against MI attacks that aims to close the gap by intentionally reducing the training accuracy.
arXiv Detail & Related papers (2020-02-27T12:35:36Z)
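One simple way to limit training accuracy and shrink the generalization gap, in the spirit of the gap-closing defense above, is to stop training once train accuracy exceeds validation accuracy by a budget. The `max_gap` budget and the loop below are assumptions, not the paper's specific mechanism.

```python
import torch

def accuracy(model, loader, device="cpu"):
    """Fraction of correctly classified examples in `loader`."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / max(total, 1)

def train_with_gap_budget(model, opt, loss_fn, train_loader, val_loader,
                          max_gap=0.05, epochs=50, device="cpu"):
    """Illustrative gap-aware training (hypothetical, not the paper's exact
    defense): stop once train accuracy exceeds validation accuracy by more
    than `max_gap`, keeping the generalization gap small."""
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        gap = accuracy(model, train_loader, device) - accuracy(model, val_loader, device)
        if gap > max_gap:
            break  # generalization gap too wide: stop fitting further
    return model
```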
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.