On the Effectiveness of Regularization Against Membership Inference Attacks
- URL: http://arxiv.org/abs/2006.05336v1
- Date: Tue, 9 Jun 2020 15:17:21 GMT
- Title: On the Effectiveness of Regularization Against Membership Inference Attacks
- Authors: Yigitcan Kaya, Sanghyun Hong, Tudor Dumitras
- Abstract summary: Deep learning models often raise privacy concerns as they leak information about their training data.
This enables an adversary to determine whether a data point was in a model's training set by conducting a membership inference attack (MIA)
While many regularization mechanisms exist, their effectiveness against MIAs has not been studied systematically.
- Score: 26.137849584503222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models often raise privacy concerns as they leak information
about their training data. This enables an adversary to determine whether a
data point was in a model's training set by conducting a membership inference
attack (MIA). Prior work has conjectured that regularization techniques, which
combat overfitting, may also mitigate the leakage. While many regularization
mechanisms exist, their effectiveness against MIAs has not been studied
systematically, and the resulting privacy properties are not well understood.
We explore the lower bound for information leakage that practical attacks can
achieve. First, we evaluate the effectiveness of 8 mechanisms in mitigating two
recent MIAs, on three standard image classification tasks. We find that certain
mechanisms, such as label smoothing, may inadvertently help MIAs. Second, we
investigate the potential of improving the resilience to MIAs by combining
complementary mechanisms. Finally, we quantify the opportunity of future MIAs
to compromise privacy by designing a white-box 'distance-to-confident' (DtC)
metric, based on adversarial sample crafting. Our metric reveals that, even
when existing MIAs fail, the training samples may remain distinguishable from
test samples. This suggests that regularization mechanisms can provide a false
sense of privacy, even when they appear effective against existing MIAs.
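
To make the attack setting concrete, below is a minimal sketch of a confidence-thresholding MIA of the kind that regularization is meant to blunt. It is an illustration only, not one of the two attacks evaluated in the paper; the `predict_proba` interface and the fixed threshold are assumptions introduced here.

```python
import numpy as np

def confidence_threshold_mia(predict_proba, samples, labels, threshold=0.9):
    """Toy membership inference: guess "member" whenever the model's confidence
    in the true label exceeds a fixed threshold.

    predict_proba -- callable mapping a batch of inputs to softmax probabilities
                     of shape (N, num_classes); a hypothetical interface.
    samples, labels -- candidate records and their ground-truth labels.
    """
    probs = np.asarray(predict_proba(samples))
    true_class_conf = probs[np.arange(len(labels)), labels]
    return true_class_conf >= threshold  # boolean membership guesses

# Usage idea: calibrate `threshold` on shadow models trained on similar data,
# then report attack accuracy on a balanced set of members and non-members.
```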
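The 'distance-to-confident' (DtC) metric is described in the abstract only at a high level (white-box, based on adversarial sample crafting). The sketch below shows one plausible way such a distance could be measured: perturb an input until the model becomes confident and record how far it moved. The step size, confidence target, and stopping rule are assumptions and may differ from the paper's definition.

```python
import torch
import torch.nn.functional as F

def distance_to_confident(model, x, tau=0.99, step=0.01, max_iters=200):
    """Sketch of a DtC-style measurement: nudge the input toward a confident
    prediction with sign-gradient steps and report the L2 distance it moved.
    The paper's exact loss, norm, and stopping rule may differ."""
    model.eval()
    x0 = x.clone().detach()                       # x: a single-input batch, e.g. (1, C, H, W) in [0, 1]
    x_adv = x0.clone().requires_grad_(True)
    for _ in range(max_iters):
        probs = F.softmax(model(x_adv), dim=1)
        conf, pred = probs.max(dim=1)
        if conf.item() >= tau:                    # already in the confident region
            break
        loss = -torch.log(probs[0, pred.item()])  # push confidence in the predicted class up
        if x_adv.grad is not None:
            x_adv.grad.zero_()
        loss.backward()
        with torch.no_grad():
            x_adv -= step * x_adv.grad.sign()
            x_adv.clamp_(0.0, 1.0)                # assumes inputs normalized to [0, 1]
    return torch.norm((x_adv.detach() - x0).flatten(), p=2).item()

# Intuition from the abstract: even when a defended model's confidences look
# flat, members and non-members may still show systematically different distances.
```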
Related papers
- Detecting Training Data of Large Language Models via Expectation Maximization [62.28028046993391]
Membership inference attacks (MIAs) aim to determine whether a specific instance was part of a target model's training data.
Applying MIAs to large language models (LLMs) presents unique challenges due to the massive scale of pre-training data and the ambiguous nature of membership.
We introduce EM-MIA, a novel MIA method for LLMs that iteratively refines membership scores and prefix scores via an expectation-maximization algorithm.
arXiv Detail & Related papers (2024-10-10T03:31:16Z) - Evaluating Membership Inference Attacks and Defenses in Federated
Learning [23.080346952364884]
Membership Inference Attacks (MIAs) pose a growing threat to privacy preservation in federated learning.
This paper conducts an evaluation of existing MIAs and corresponding defense strategies.
arXiv Detail & Related papers (2024-02-09T09:58:35Z) - Perturbation-Invariant Adversarial Training for Neural Ranking Models:
Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about the reliability of neural ranking models (NRMs) and hinders their widespread deployment.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z) - MIA-BAD: An Approach for Enhancing Membership Inference Attack and its
Mitigation with Federated Learning [6.510488168434277]
The membership inference attack (MIA) is a popular paradigm for compromising the privacy of a machine learning (ML) model.
We propose an enhanced Membership Inference Attack with the Batch-wise generated Attack dataset (MIA-BAD)
We show how training an ML model through federated learning (FL) has some distinct advantages, and we investigate how the threat introduced by the proposed MIA-BAD approach can be mitigated with FL.
arXiv Detail & Related papers (2023-11-28T06:51:26Z) - Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration [32.15773300068426]
Membership Inference Attacks (MIAs) aim to infer whether a target data record has been utilized for model training or not.
We propose a Membership Inference Attack based on Self-calibrated Probabilistic Variation (SPV-MIA)
Specifically, since memorization in LLMs is inevitable during the training process and occurs before overfitting, we introduce a more reliable membership signal.
arXiv Detail & Related papers (2023-11-10T13:55:05Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric
Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
Our proposed defense, MESAS, is the first to remain robust against strong adaptive adversaries; it is effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Improving Adversarial Robustness via Mutual Information Estimation [144.33170440878519]
Deep neural networks (DNNs) are found to be vulnerable to adversarial noise.
In this paper, we investigate the dependence between outputs of the target model and input adversarial samples from the perspective of information theory.
We propose to enhance adversarial robustness by maximizing the natural mutual information (MI) and minimizing the adversarial MI during the training process.
arXiv Detail & Related papers (2022-07-25T13:45:11Z) - RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs (a simplified loss-relaxation sketch in this spirit appears after this list).
arXiv Detail & Related papers (2022-07-12T19:34:47Z) - Sampling Attacks: Amplification of Membership Inference Attacks by
Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, is able to work under the severe restriction of having no access to the victim model's scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and the adversary can recover up to 100% of its performance (see the label-only sketch after this list).
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z) - Membership Inference Attacks and Defenses in Classification Models [19.498313593713043]
We study the membership inference (MI) attack against classifiers.
We find that a model's vulnerability to MI attacks is tightly related to the generalization gap.
We propose a defense against MI attacks that aims to close the gap by intentionally reducing the training accuracy.
arXiv Detail & Related papers (2020-02-27T12:35:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.