MIA-BAD: An Approach for Enhancing Membership Inference Attack and its
Mitigation with Federated Learning
- URL: http://arxiv.org/abs/2312.00051v1
- Date: Tue, 28 Nov 2023 06:51:26 GMT
- Title: MIA-BAD: An Approach for Enhancing Membership Inference Attack and its
Mitigation with Federated Learning
- Authors: Soumya Banerjee, Sandip Roy, Sayyed Farid Ahamed, Devin Quinn, Marc
Vucovich, Dhruv Nandakumar, Kevin Choi, Abdul Rahman, Edward Bowen, and
Sachin Shetty
- Abstract summary: The membership inference attack (MIA) is a popular paradigm for compromising the privacy of a machine learning (ML) model.
We propose an enhanced Membership Inference Attack with the Batch-wise generated Attack Dataset (MIA-BAD).
We show how training an ML model through FL has some distinct advantages and investigate how the threat introduced by the proposed MIA-BAD approach can be mitigated with FL approaches.
- Score: 6.510488168434277
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The membership inference attack (MIA) is a popular paradigm for compromising
the privacy of a machine learning (ML) model. MIA exploits the natural
inclination of ML models to overfit to their training data. MIAs are trained to
distinguish between prediction confidences on training and testing data to infer
membership information. Federated Learning (FL) is a privacy-preserving ML
paradigm that enables multiple clients to train a unified model without
disclosing their private data. In this paper, we propose an enhanced Membership
Inference Attack with the Batch-wise generated Attack Dataset (MIA-BAD), a
modification to the MIA approach. We find that the MIA is more accurate
when the attack dataset is generated batch-wise. This quantitatively reduces the
size of the attack dataset while qualitatively improving it. We show how training
an ML model through FL has some distinct advantages and investigate how the
threat introduced by the proposed MIA-BAD approach can be mitigated with FL
approaches. Finally, we demonstrate the qualitative effects of the proposed
MIA-BAD methodology by conducting extensive experiments with various target
datasets, variable numbers of federated clients, and training batch sizes.
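The batch-wise idea lends itself to a short illustration. The sketch below reflects only a reading of the abstract, not the authors' implementation: per-sample prediction-confidence vectors from the target model are aggregated into one attack record per batch, which shrinks the attack dataset while smoothing noisy per-sample confidences, and a simple binary classifier is then trained to separate member batches from non-member batches. All identifiers and the synthetic Dirichlet-distributed confidences are assumptions made for illustration.

```python
# Minimal sketch of a batch-wise attack-dataset construction for a
# confidence-based membership inference attack. Illustrative only; the
# identifiers and synthetic data below are NOT from the MIA-BAD paper.
import numpy as np
from sklearn.linear_model import LogisticRegression


def batchwise_attack_dataset(confidences, label, batch_size):
    """Aggregate per-sample confidence vectors into one attack record per
    batch (here: the batch mean), labelled member (1) or non-member (0)."""
    records, labels = [], []
    for start in range(0, len(confidences), batch_size):
        batch = confidences[start:start + batch_size]
        records.append(batch.mean(axis=0))      # one record per batch
        labels.append(label)
    return np.array(records), np.array(labels)


# Stand-in data: softmax outputs of a 3-class target model on known members
# and non-members (in practice obtained via shadow models or held-out data).
rng = np.random.default_rng(0)
member_conf = rng.dirichlet(alpha=[8, 1, 1], size=1024)     # peaked: overfit-like
nonmember_conf = rng.dirichlet(alpha=[3, 2, 2], size=1024)  # flatter

Xm, ym = batchwise_attack_dataset(member_conf, label=1, batch_size=16)
Xn, yn = batchwise_attack_dataset(nonmember_conf, label=0, batch_size=16)
X, y = np.vstack([Xm, Xn]), np.concatenate([ym, yn])

# Binary attack model: predicts membership from the aggregated confidences.
attack_model = LogisticRegression(max_iter=1000).fit(X, y)
print("attack accuracy on its own records:", attack_model.score(X, y))
```

Under FL, such an attack must target the aggregated global model rather than any single client's locally trained model; the paper studies how this federated setting can mitigate the MIA-BAD threat.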
Related papers
- Detecting Training Data of Large Language Models via Expectation Maximization [62.28028046993391]
Membership inference attacks (MIAs) aim to determine whether a specific instance was part of a target model's training data.
Applying MIAs to large language models (LLMs) presents unique challenges due to the massive scale of pre-training data and the ambiguous nature of membership.
We introduce EM-MIA, a novel MIA method for LLMs that iteratively refines membership scores and prefix scores via an expectation-maximization algorithm.
arXiv Detail & Related papers (2024-10-10T03:31:16Z)
- Order of Magnitude Speedups for LLM Membership Inference [5.124111136127848]
Large Language Models (LLMs) promise to revolutionize computing broadly, but their complexity and extensive training data also expose privacy vulnerabilities.
One of the simplest privacy risks associated with LLMs is their susceptibility to membership inference attacks (MIAs).
We propose a low-cost MIA that leverages an ensemble of small quantile regression models to determine whether a document belongs to the model's training set (a toy sketch of this idea appears after the related-papers list).
arXiv Detail & Related papers (2024-09-22T16:18:14Z)
- Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
arXiv Detail & Related papers (2024-02-12T17:52:05Z)
- Evaluating Membership Inference Attacks and Defenses in Federated Learning [23.080346952364884]
Membership Inference Attacks (MIAs) pose a growing threat to privacy preservation in federated learning.
This paper conducts an evaluation of existing MIAs and corresponding defense strategies.
arXiv Detail & Related papers (2024-02-09T09:58:35Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks, as well as the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Do Gradient Inversion Attacks Make Federated Learning Unsafe? [70.0231254112197]
Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.
Recent works on the inversion of deep neural networks from model gradients raised concerns about the security of FL in preventing the leakage of training data.
In this work, we show that the attacks presented in the literature are impractical in real FL use cases and provide a new baseline attack.
arXiv Detail & Related papers (2022-02-14T18:33:12Z)
- FAT: Federated Adversarial Training [5.287156503763459]
Federated learning (FL) is one of the most important paradigms addressing privacy and data governance issues in machine learning (ML).
We take the first known steps towards federated adversarial training (FAT), combining the two to reduce the threat of evasion during inference while preserving data privacy during training.
arXiv Detail & Related papers (2020-12-03T09:47:47Z)
- How Does Data Augmentation Affect Privacy in Machine Learning? [94.52721115660626]
We propose new MI attacks that exploit the information contained in augmented data.
We establish the optimal membership inference when the model is trained with augmented data.
arXiv Detail & Related papers (2020-07-21T02:21:10Z)
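As referenced above, the quantile-regression MIA from "Order of Magnitude Speedups for LLM Membership Inference" can be sketched in a few lines. This is a toy illustration of the general idea rather than the authors' pipeline: an ensemble of quantile regressors is fit only on scores from known non-members, and a candidate document is flagged as a member when its score under the target model falls below the predicted low quantile. The features, scores, and thresholds below are synthetic assumptions.

```python
# Toy quantile-threshold membership test (illustrative only; not the method
# from the cited paper). Document features and losses are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Placeholder document features and target-model scores (e.g. per-token loss)
# for known NON-members; an overfit model tends to give members lower loss.
feats_nonmember = rng.normal(size=(2000, 8))
loss_nonmember = 2.0 + 0.3 * feats_nonmember[:, 0] + rng.normal(0.0, 0.4, 2000)

# Small ensemble of quantile regressors, each predicting a low quantile of the
# non-member loss conditioned on the document features.
ensemble = [
    GradientBoostingRegressor(loss="quantile", alpha=0.05,
                              n_estimators=50, random_state=seed)
    .fit(feats_nonmember, loss_nonmember)
    for seed in range(5)
]


def is_member(features, observed_loss):
    """Flag a document as a member if its target-model loss is below the
    ensemble's predicted low quantile of non-member loss."""
    threshold = np.mean([m.predict(features[None, :])[0] for m in ensemble])
    return observed_loss < threshold


# Example query on a synthetic candidate document.
candidate = rng.normal(size=8)
print(is_member(candidate, observed_loss=1.2))
```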