Membership Inference Attack Should Move On to Distributional Statistics for Distilled Generative Models
- URL: http://arxiv.org/abs/2502.02970v1
- Date: Wed, 05 Feb 2025 08:11:23 GMT
- Title: Membership Inference Attack Should Move On to Distributional Statistics for Distilled Generative Models
- Authors: Muxing Li, Zesheng Ye, Yixuan Li, Andy Song, Guangquan Zhang, Feng Liu
- Abstract summary: Membership inference attacks (MIAs) determine whether certain data instances were used to train a model.
This paper reveals an oversight in existing MIAs against distilled generative models.
We introduce a set-based MIA framework that measures relative distributional discrepancies between student-generated datasets and potential member/non-member datasets.
- Score: 31.834967019893227
- License:
- Abstract: Membership inference attacks (MIAs) determine whether certain data instances were used to train a model by exploiting the differences in how the model responds to seen versus unseen instances. This capability makes MIAs important in assessing privacy leakage within modern generative AI systems. However, this paper reveals an oversight in existing MIAs against distilled generative models: attackers can no longer detect a teacher model's training instances individually when targeting the distilled student model, because the student learns from teacher-generated data rather than the teacher's original member data, preventing direct instance-level memorization. Nevertheless, we find that student-generated samples exhibit a significantly stronger distributional alignment with the teacher's member data than with non-member data. This leads us to posit that MIAs on distilled generative models should shift from instance-level to distribution-level statistics. We thereby introduce a set-based MIA framework that measures relative distributional discrepancies between student-generated datasets and potential member/non-member datasets. Empirically, distributional statistics reliably distinguish a teacher's member data from non-member data through the distilled model. Finally, we discuss scenarios in which our setup faces limitations.
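The abstract leaves the concrete distributional statistic open; below is a minimal sketch of the set-based idea, assuming maximum mean discrepancy (MMD) with an RBF kernel as the statistic. All function names and the choice of MMD are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a set-based, distribution-level MIA signal.
# Assumption: maximum mean discrepancy (MMD) with an RBF kernel stands in
# for the paper's distributional statistic, which may differ.
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Pairwise RBF kernel matrix between rows of x and rows of y.
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    # Biased estimate of squared MMD between sample sets x and y.
    kxx = rbf_kernel(x, x, bandwidth).mean()
    kyy = rbf_kernel(y, y, bandwidth).mean()
    kxy = rbf_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2 * kxy

def set_membership_score(student_samples, candidate_set, reference_set, bandwidth=1.0):
    # Relative discrepancy: the candidate set looks "member" when the
    # student's generations sit closer to it than to the reference set.
    d_candidate = mmd2(student_samples, candidate_set, bandwidth)
    d_reference = mmd2(student_samples, reference_set, bandwidth)
    return d_reference - d_candidate  # larger => more member-like

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    member = rng.normal(0.0, 1.0, size=(200, 8))      # stands in for teacher member data
    non_member = rng.normal(1.5, 1.0, size=(200, 8))  # a shifted non-member set
    student = rng.normal(0.1, 1.0, size=(200, 8))     # student generations track member data
    print(set_membership_score(student, member, non_member))  # positive
    print(set_membership_score(student, non_member, member))  # negative
```

Because the score is a difference of two discrepancies, it is relative by construction, matching the abstract's framing, and needs no absolute per-set threshold.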
Related papers
- Self-Comparison for Dataset-Level Membership Inference in Large (Vision-)Language Models [73.94175015918059]
We propose a dataset-level membership inference method based on Self-Comparison.
Our method does not require access to ground-truth member data or non-member data in identical distribution.
arXiv Detail & Related papers (2024-10-16T23:05:59Z)
- Blind Baselines Beat Membership Inference Attacks for Foundation Models [24.010279957557252]
Membership inference (MI) attacks try to determine if a data sample was used to train a machine learning model.
For foundation models trained on unknown Web data, MI attacks can be used to detect copyrighted training materials, measure test set contamination, or audit machine unlearning.
We show that evaluations of MI attacks for foundation models are flawed, because they sample members and non-members from different distributions.
arXiv Detail & Related papers (2024-06-23T19:40:11Z)
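A quick way to probe the flaw flagged in the entry above: a baseline that never queries the target model should score near chance on a sound evaluation. A minimal sketch, assuming text data; TF-IDF plus logistic regression is an illustrative stand-in for the paper's blind baselines.

```python
# Sketch of a "blind" membership baseline: it never queries the target model,
# so cross-validated AUC well above 0.5 means the evaluation's member and
# non-member sets come from distinguishable distributions. TF-IDF plus
# logistic regression is an illustrative choice, not the paper's exact baseline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def blind_baseline_auc(member_texts, non_member_texts):
    texts = member_texts + non_member_texts
    labels = [1] * len(member_texts) + [0] * len(non_member_texts)
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    # AUC near 0.5: the split looks i.i.d.; well above 0.5: the MIA
    # evaluation itself is separable without any model access.
    return cross_val_score(clf, texts, labels, cv=5, scoring="roc_auc").mean()
```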
- Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
arXiv Detail & Related papers (2024-02-12T17:52:05Z)
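The instance-level attacks such evaluations cover largely reduce to thresholding the model's loss. A minimal sketch with Hugging Face transformers; the checkpoint is a placeholder for a Pile-trained model and not necessarily one the paper uses.

```python
# Sketch of the classic loss-threshold MIA: predict "member" when the
# model's per-token loss on a text is low. The checkpoint is a placeholder;
# any causal LM works.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-160m"  # illustrative Pile-trained model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def membership_score(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean per-token cross-entropy
    return -loss.item()  # higher => more member-like

# Attack: flag texts whose score exceeds a threshold fit on known non-members.
```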
- Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration [32.15773300068426]
Membership Inference Attacks aim to infer whether a target data record has been utilized for model training.
We propose a Membership Inference Attack based on Self-calibrated Probabilistic Variation (SPV-MIA)
arXiv Detail & Related papers (2023-11-10T13:55:05Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
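The text-similarity signal from the summarization study above can be sketched as follows, assuming unigram-overlap F1 (ROUGE-1-style) as the similarity and a caller-supplied summarizer; both are illustrative assumptions rather than the paper's exact setup.

```python
# Sketch of the text-similarity MI signal for summarization models: if the
# model was trained on a (document, summary) pair, its generated summary
# tends to be closer to the reference summary. Unigram-overlap F1 stands in
# for whatever similarity metric the paper uses.
from collections import Counter

def unigram_f1(candidate, reference):
    c, r = Counter(candidate.lower().split()), Counter(reference.lower().split())
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def membership_signal(generate_summary, document, reference_summary):
    # generate_summary: callable wrapping the target summarizer (assumed interface).
    return unigram_f1(generate_summary(document), reference_summary)
```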
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
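The local-overfitting idea behind DOMIAS can be sketched as a density ratio: records far more likely under the synthetic data's density than under a reference density are flagged as members. Gaussian KDE below is an illustrative density estimator, not necessarily the paper's.

```python
# Sketch of a density-ratio membership score in the spirit of DOMIAS:
# score a record by how much more likely it is under the synthetic-data
# density than under a reference density, which highlights regions the
# generator has locally overfit.
import numpy as np
from scipy.stats import gaussian_kde

def density_ratio_scores(synthetic, reference, candidates):
    # Rows are records; scipy's gaussian_kde expects shape (dims, n_samples).
    p_synth = gaussian_kde(synthetic.T)
    p_ref = gaussian_kde(reference.T)
    eps = 1e-12  # guard against division by zero in sparse regions
    return p_synth(candidates.T) / (p_ref(candidates.T) + eps)  # higher => member
```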
- Investigating Membership Inference Attacks under Data Dependencies [26.70764798408236]
Training machine learning models on privacy-sensitive data has opened the door to new attacks that can have serious privacy implications.
One such attack, the Membership Inference Attack (MIA), exposes whether or not a particular data point was used to train a model.
We evaluate the defence under the restrictive assumption that all members of the training set, as well as non-members, are independent and identically distributed.
arXiv Detail & Related papers (2020-10-23T00:16:46Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- How Does Data Augmentation Affect Privacy in Machine Learning? [94.52721115660626]
We propose new MI attacks to utilize the information of augmented data.
We establish the optimal membership inference when the model is trained with augmented data.
arXiv Detail & Related papers (2020-07-21T02:21:10Z)
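The augmented-data result above suggests aggregating the loss over a record's augmentations rather than scoring the raw record alone. A minimal sketch, assuming generic loss_fn and augmentation callables; both interfaces are hypothetical.

```python
# Sketch of an augmentation-aware MI score: when a model is trained on
# augmented copies of each record, aggregating losses over the same
# augmentations gives a stronger membership signal than the raw loss alone.
# `loss_fn` and `augmentations` are assumed interfaces, not a specific API.
import numpy as np

def augmented_membership_score(loss_fn, x, augmentations):
    # loss_fn(x) -> model loss on one input; augmentations: list of callables.
    losses = [loss_fn(aug(x)) for aug in augmentations]
    return -float(np.mean(losses))  # higher => more member-like
```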