Overconfidence is a Dangerous Thing: Mitigating Membership Inference
Attacks by Enforcing Less Confident Prediction
- URL: http://arxiv.org/abs/2307.01610v1
- Date: Tue, 4 Jul 2023 09:50:33 GMT
- Title: Overconfidence is a Dangerous Thing: Mitigating Membership Inference
Attacks by Enforcing Less Confident Prediction
- Authors: Zitao Chen, Karthik Pattabiraman
- Abstract summary: Machine learning models are vulnerable to membership inference attacks (MIAs).
This work proposes a defense technique, HAMP, that can achieve both strong membership privacy and high accuracy, without requiring extra data.
- Score: 2.2336243882030025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) models are vulnerable to membership inference attacks
(MIAs), which determine whether a given input is used for training the target
model. While there have been many efforts to mitigate MIAs, they often suffer
from limited privacy protection, large accuracy drop, and/or requiring
additional data that may be difficult to acquire. This work proposes a defense
technique, HAMP, that can achieve both strong membership privacy and high
accuracy, without requiring extra data. To mitigate MIAs in different forms, we
observe that they can be unified as they all exploit the ML model's
overconfidence in predicting training samples through different proxies. This
motivates our design to enforce less confident prediction by the model, hence
forcing the model to behave similarly on the training and testing samples. HAMP
consists of a novel training framework with high-entropy soft labels and an
entropy-based regularizer to constrain the model's prediction while still
achieving high accuracy. To further reduce privacy risk, HAMP uniformly
modifies all the prediction outputs to become low-confidence outputs while
preserving the accuracy, which effectively obscures the differences between the
prediction on members and non-members. We conduct extensive evaluation on five
benchmark datasets, and show that HAMP provides consistently high accuracy and
strong membership privacy. Our comparison with seven state-of-the-art defenses
shows that HAMP achieves a better privacy-utility trade-off than those
techniques.
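To make the described mechanism concrete, below is a minimal PyTorch sketch of the three ingredients the abstract names: high-entropy soft labels, an entropy-based regularizer during training, and uniform low-confidence post-processing of the prediction outputs. The hyperparameters (`smooth`, `beta`, `floor`) and the exact formulas are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch of HAMP's three ingredients as described in the abstract.
# Hyperparameters and formulas are assumptions for illustration only.
import torch
import torch.nn.functional as F


def high_entropy_soft_labels(targets: torch.Tensor, num_classes: int,
                             smooth: float = 0.7) -> torch.Tensor:
    """Give `smooth` probability mass to the true class and spread the rest
    uniformly, producing deliberately high-entropy training targets."""
    off_value = (1.0 - smooth) / (num_classes - 1)
    soft = torch.full((targets.size(0), num_classes), off_value,
                      device=targets.device)
    soft.scatter_(1, targets.unsqueeze(1), smooth)
    return soft


def hamp_style_loss(logits: torch.Tensor, targets: torch.Tensor,
                    num_classes: int, beta: float = 0.01) -> torch.Tensor:
    """Cross-entropy against the soft labels minus an entropy bonus:
    subtracting the prediction entropy pushes the model toward less
    confident (higher-entropy) outputs."""
    soft = high_entropy_soft_labels(targets, num_classes)
    log_probs = F.log_softmax(logits, dim=1)
    ce = -(soft * log_probs).sum(dim=1).mean()
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=1).mean()
    return ce - beta * entropy


@torch.no_grad()
def low_confidence_outputs(logits: torch.Tensor, floor: float = 0.9) -> torch.Tensor:
    """Post-process every prediction into a near-uniform score vector that
    keeps the argmax class, so top-1 accuracy is preserved while members
    and non-members look alike to a score-based attacker."""
    num_classes = logits.size(1)
    scores = torch.full_like(logits, floor / num_classes)
    top = logits.argmax(dim=1, keepdim=True)
    scores.scatter_add_(1, top, torch.full_like(top, 1.0 - floor,
                                                dtype=logits.dtype))
    return scores
```

During training, `hamp_style_loss` would stand in for the usual cross-entropy; at serving time, `low_confidence_outputs(model(x))` would be returned in place of the raw softmax scores.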
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Confidence Aware Learning for Reliable Face Anti-spoofing [52.23271636362843]
We propose a Confidence Aware Face Anti-spoofing (CA-FAS) model that is aware of its own capability boundary and estimates its confidence during the prediction of each sample.
Experiments show that the proposed CA-FAS can effectively recognize samples with low prediction confidence.
arXiv Detail & Related papers (2024-11-02T14:29:02Z)
- Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about the reliability of neural ranking models (NRMs) and hinders their widespread deployment.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
- Diffence: Fencing Membership Privacy With Diffusion Models [14.633898825111828]
Deep learning models are vulnerable to membership inference attacks (MIAs).
We introduce a novel defense framework against MIAs by leveraging generative models.
Our defense, called DIFFENCE, operates pre-inference, unlike prior defenses that are applied either at training time or post-inference.
arXiv Detail & Related papers (2023-12-07T20:45:09Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
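The "more achievable learning target" can be pictured as not driving the training loss all the way to zero, so members are not fit much more tightly than non-members. The sketch below uses a flooding-style objective as an illustrative stand-in; it is not RelaxLoss's actual algorithm, and `alpha` is an assumed hyperparameter.

```python
# Flooding-style relaxed objective (illustrative stand-in, not RelaxLoss itself):
# keep the training loss near a nonzero target alpha instead of minimizing it to zero.
import torch
import torch.nn.functional as F


def relaxed_loss(logits: torch.Tensor, targets: torch.Tensor,
                 alpha: float = 0.5) -> torch.Tensor:
    ce = F.cross_entropy(logits, targets)
    # Below alpha the gradient flips sign, pushing the loss back up toward
    # the target instead of further toward zero.
    return (ce - alpha).abs() + alpha
```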
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Bounding Membership Inference [28.64031194463754]
We provide a tighter bound on the accuracy of any MI adversary when a training algorithm provides $\epsilon$-DP.
Our scheme enables users of $\epsilon$-DP to employ looser DP guarantees when training their models while still limiting the success of any MI adversary.
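For reference, pure $\epsilon$-DP already caps the accuracy of any membership test against a balanced prior; this is the classical hypothesis-testing baseline that the tighter bound above improves on, not the paper's own result:

```latex
% Classical baseline under pure \epsilon-DP with a balanced membership prior
% (the paper above derives a tighter bound than this):
\Pr[\text{MI adversary guesses correctly}] \le \frac{e^{\epsilon}}{1 + e^{\epsilon}},
\qquad
\text{membership advantage} \le \frac{e^{\epsilon} - 1}{e^{\epsilon} + 1}.
```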
arXiv Detail & Related papers (2022-02-24T17:54:15Z)
- Do Not Trust Prediction Scores for Membership Inference Attacks [15.567057178736402]
Membership inference attacks (MIAs) aim to determine whether a specific sample was used to train a predictive model.
We argue that relying on prediction scores for this is a fallacy for many modern deep network architectures.
We are able to produce a potentially infinite number of samples falsely classified as part of the training data.
arXiv Detail & Related papers (2021-11-17T12:39:04Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Trade-offs between membership privacy & adversarially robust learning [13.37805637358556]
We identify settings where standard models will overfit to a larger extent in comparison to robust models.
The degree of overfitting naturally depends on the amount of data available for training.
arXiv Detail & Related papers (2020-06-08T14:20:12Z)
- Membership Inference Attacks and Defenses in Classification Models [19.498313593713043]
We study the membership inference (MI) attack against classifiers.
We find that a model's vulnerability to MI attacks is tightly related to the generalization gap.
We propose a defense against MI attacks that aims to close the gap by intentionally reducing the training accuracy.
arXiv Detail & Related papers (2020-02-27T12:35:36Z)