Why Train More? Effective and Efficient Membership Inference via
Memorization
- URL: http://arxiv.org/abs/2310.08015v1
- Date: Thu, 12 Oct 2023 03:29:53 GMT
- Title: Why Train More? Effective and Efficient Membership Inference via
Memorization
- Authors: Jihye Choi, Shruti Tople, Varun Chandrasekaran, Somesh Jha
- Abstract summary: Membership Inference Attacks aim to identify specific data samples within the private training dataset of machine learning models.
By strategically choosing the samples, MI adversaries can maximize their attack success while minimizing the number of shadow models.
- Score: 34.13594460560715
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Membership Inference Attacks (MIAs) aim to identify specific data samples
within the private training dataset of machine learning models, leading to
serious privacy violations and other sophisticated threats. Many practical
black-box MIAs require query access to the data distribution (the same
distribution where the private data is drawn) to train shadow models. By doing
so, the adversary obtains models trained "with" or "without" samples drawn from
the distribution, and analyzes the characteristics of the samples under
consideration. The adversary is often required to train hundreds of shadow
models to extract the signals needed for MIAs; this constitutes the main
computational overhead of MIAs. In this paper, we propose that by strategically
choosing the samples, MI adversaries can maximize their attack success while
minimizing the number of shadow models. First, our motivational experiments
suggest memorization as the key property explaining disparate sample
vulnerability to MIAs. We formalize this through a theoretical bound that
connects MI advantage with memorization. Second, we show sample complexity
bounds that connect the number of shadow models needed for MIAs with
memorization. Lastly, we confirm our theoretical arguments with comprehensive
experiments; by utilizing samples with high memorization scores, the adversary
can (a) significantly improve its efficacy regardless of the MIA used, and (b)
reduce the number of shadow models by nearly two orders of magnitude compared
to state-of-the-art approaches.
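At a high level, the pipeline the abstract describes -- estimate how memorized each candidate sample is, attack only the most memorized ones, and thereby get by with far fewer shadow models -- can be sketched in a few lines. The sketch below is a minimal illustration under our own assumptions: the confidence-gap memorization proxy, the Gaussian likelihood-ratio scoring, and all function names are placeholders, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm

def memorization_scores(conf_in: np.ndarray, conf_out: np.ndarray) -> np.ndarray:
    """Feldman-style memorization proxy per candidate sample.

    conf_in[i, j]  : correct-label confidence of sample j under the i-th
                     shadow model trained WITH sample j.
    conf_out[i, j] : the same under the i-th shadow model trained WITHOUT j.
    Memorization is approximated by the gap in expected confidence.
    """
    return conf_in.mean(axis=0) - conf_out.mean(axis=0)

def lira_style_scores(target_conf: np.ndarray, conf_out: np.ndarray) -> np.ndarray:
    """Offline likelihood-ratio-style membership score (higher => more likely member).

    Fits a Gaussian to each sample's OUT-confidences and measures how unlikely
    the victim model's confidence is under that non-member distribution.
    """
    mu = conf_out.mean(axis=0)
    sigma = conf_out.std(axis=0) + 1e-6
    return 1.0 - norm.cdf(target_conf, loc=mu, scale=sigma)

def memorization_guided_mia(target_conf, conf_in, conf_out, top_fraction=0.1):
    """Restrict the attack to the most-memorized candidates."""
    mem = memorization_scores(conf_in, conf_out)
    k = max(1, int(top_fraction * len(mem)))
    selected = np.argsort(mem)[-k:]                  # highest-memorization samples
    return selected, lira_style_scores(target_conf[selected], conf_out[:, selected])

if __name__ == "__main__":
    # Synthetic stand-ins for shadow/victim confidences, just to make the sketch runnable.
    rng = np.random.default_rng(0)
    n_shadow, n_candidates = 8, 1000                 # few shadow models, many candidates
    conf_out = rng.beta(2.0, 2.0, size=(n_shadow, n_candidates))
    conf_in = np.clip(conf_out + rng.beta(2.0, 8.0, size=conf_out.shape), 0.0, 1.0)
    target_conf = rng.beta(5.0, 2.0, size=n_candidates)
    idx, scores = memorization_guided_mia(target_conf, conf_in, conf_out)
    print(f"attacking {len(idx)} high-memorization samples; mean MI score = {scores.mean():.3f}")
```

In this framing, the savings come from the selection step: once the attack is restricted to highly memorized samples, the per-sample Gaussian fit needs only a handful of OUT shadow models.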
Related papers
- Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs [61.04246774006429]
We introduce a black-box prompt optimization method that uses an attacker LLM agent to uncover higher levels of memorization in a victim agent.
We observe that our instruction-based prompts generate outputs with 23.7% higher overlap with training data compared to the baseline prefix-suffix measurements.
Our findings show that instruction-tuned models can expose pre-training data as much as their base models, if not more, and that using instructions proposed by other LLMs can open a new avenue for automated attacks.
arXiv Detail & Related papers (2024-03-05T19:32:01Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks, as well as the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability [70.72426887518517]
Out-of-distribution (OOD) detection is an indispensable aspect of secure AI when deploying machine learning models in real-world applications.
We propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capabilities of the well-trained model with ID data.
Our method utilizes a mask to identify the memorized atypical samples, and then finetunes the model or prunes it with the introduced mask to forget them.
arXiv Detail & Related papers (2023-06-06T14:23:34Z)
- Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks [86.55317144826179]
Previous methods always leverage transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC).
SAC successfully defends against various model stealing attacks, including those based on adversarial training or transfer learning.
arXiv Detail & Related papers (2022-10-21T02:07:50Z)
- Improving Adversarial Robustness via Mutual Information Estimation [144.33170440878519]
Deep neural networks (DNNs) are found to be vulnerable to adversarial noise.
In this paper, we investigate the dependence between outputs of the target model and input adversarial samples from the perspective of information theory.
We propose to enhance adversarial robustness by maximizing the natural mutual information and minimizing the adversarial mutual information during training.
arXiv Detail & Related papers (2022-07-25T13:45:11Z)
- An Efficient Subpopulation-based Membership Inference Attack [11.172550334631921]
We introduce a fundamentally different MI attack approach which obviates the need to train hundreds of shadow models.
We achieve the state-of-the-art membership inference accuracy while significantly reducing the training cost.
arXiv Detail & Related papers (2022-03-04T00:52:06Z)
- Reconstruction-Based Membership Inference Attacks are Easier on Difficult Problems [36.13835940345486]
We show that models with higher dimensional input and output are more vulnerable to membership inference attacks.
We propose using a novel predictability score that can be computed for each sample, and its computation does not require a training set.
Our membership error, obtained by subtracting the predictability score from the reconstruction error, is shown to achieve high MIA accuracy on an extensive number of benchmarks.
arXiv Detail & Related papers (2021-02-15T18:57:22Z)
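The "Reconstruction-Based Membership Inference Attacks" entry above combines two per-sample quantities into a single decision statistic: membership error = reconstruction error - predictability score. A minimal sketch of that rule, assuming the two scores are supplied by the attacker and that lower residual error indicates membership (the names below are placeholders, not the paper's API):

```python
import numpy as np

def membership_error(reconstruction_error: np.ndarray,
                     predictability_score: np.ndarray) -> np.ndarray:
    """Membership error = reconstruction error - predictability score.

    Subtracting the (training-set-free) predictability score removes the part of
    the reconstruction error explained by the sample simply being easy,
    so low residual values point to likely members.
    """
    return reconstruction_error - predictability_score

def predict_members(reconstruction_error, predictability_score, threshold=0.0):
    # Flag samples whose residual error falls below the threshold as members.
    return membership_error(reconstruction_error, predictability_score) < threshold
```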
- Practical Blind Membership Inference Attack via Differential Comparisons [22.582872789369752]
Membership inference (MI) attacks affect user privacy by inferring whether given data samples have been used to train a target learning model.
BlindMI probes the target model and extracts membership semantics via a novel approach, called differential comparison.
BlindMI was evaluated by comparing it with state-of-the-art MI attack algorithms.
arXiv Detail & Related papers (2021-01-05T04:07:15Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- On the Difficulty of Membership Inference Attacks [11.172550334631921]
Recent studies propose membership inference (MI) attacks on deep models.
Despite their apparent success, these studies only report the accuracy, precision, and recall of the positive (member) class.
We show that the way MI attack performance has been reported is often misleading, because the attacks suffer from a high false positive rate, or false alarm rate (FAR), that goes unreported.
arXiv Detail & Related papers (2020-05-27T23:09:17Z)
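The last entry argues that reporting only accuracy, precision, and recall of the member class hides how often non-members are falsely flagged. A small helper that reports the false alarm rate alongside the usual metrics, as a sketch (the membership predictions are assumed to come from whatever MI attack is being evaluated):

```python
import numpy as np

def mi_attack_report(is_member: np.ndarray, predicted_member: np.ndarray) -> dict:
    """Report member-class metrics together with the false alarm rate (FAR)."""
    is_member = is_member.astype(bool)
    predicted_member = predicted_member.astype(bool)
    tp = np.sum(predicted_member & is_member)
    fp = np.sum(predicted_member & ~is_member)
    fn = np.sum(~predicted_member & is_member)
    tn = np.sum(~predicted_member & ~is_member)
    return {
        "accuracy":  (tp + tn) / len(is_member),
        "precision": tp / max(tp + fp, 1),
        "recall":    tp / max(tp + fn, 1),
        # FAR: fraction of non-members wrongly flagged as members -- the figure
        # the entry above says is often left out of MI attack evaluations.
        "far":       fp / max(fp + tn, 1),
    }
```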