A Probabilistic Fluctuation based Membership Inference Attack for Diffusion Models
- URL: http://arxiv.org/abs/2308.12143v4
- Date: Tue, 25 Jun 2024 12:34:46 GMT
- Title: A Probabilistic Fluctuation based Membership Inference Attack for Diffusion Models
- Authors: Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang
- Abstract summary: Membership Inference Attack (MIA) identifies whether a record exists in a machine learning model's training set by querying the model.
We propose a Probabilistic Fluctuation Assessing Membership Inference Attack (PFAMI).
PFAMI can improve the attack success rate (ASR) by about 27.9% when compared with the best baseline.
- Score: 32.15773300068426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Membership Inference Attack (MIA) identifies whether a record exists in a machine learning model's training set by querying the model. MIAs on the classic classification models have been well-studied, and recent works have started to explore how to transplant MIA onto generative models. Our investigation indicates that existing MIAs designed for generative models mainly depend on the overfitting in target models. However, overfitting can be avoided by employing various regularization techniques, whereas existing MIAs demonstrate poor performance in practice. Unlike overfitting, memorization is essential for deep learning models to attain optimal performance, making it a more prevalent phenomenon. Memorization in generative models leads to an increasing trend in the probability distribution of generating records around the member record. Therefore, we propose a Probabilistic Fluctuation Assessing Membership Inference Attack (PFAMI), a black-box MIA that infers memberships by detecting these trends via analyzing the overall probabilistic fluctuations around given records. We conduct extensive experiments across multiple generative models and datasets, which demonstrate PFAMI can improve the attack success rate (ASR) by about 27.9% when compared with the best baseline.
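The intuition in the abstract, that memorization makes a member record a local peak of the model's generation probability, can be sketched on a toy problem. The following is only an illustrative sketch, not the authors' implementation: `toy_density` is a hypothetical 2-D stand-in for the target model's estimated probability (a smooth base distribution plus narrow bumps at memorized records), and `fluctuation_score`, the Gaussian perturbation radius, the neighbor count, and the decision threshold are all assumptions chosen for the example.

```python
import numpy as np


def toy_density(x, members, bump_width=0.03, bump_weight=0.3):
    """Hypothetical stand-in for the target model's probability p(x) in 2-D.

    A smooth standard-normal base plus narrow Gaussian bumps centered on the
    memorized member records (an assumption made for this illustration).
    """
    x = np.atleast_2d(x)
    base = np.exp(-0.5 * np.sum(x ** 2, axis=1)) / (2 * np.pi)
    # Pairwise squared distances between query points and member records.
    d2 = np.sum((x[:, None, :] - members[None, :, :]) ** 2, axis=-1)
    bumps = np.exp(-0.5 * d2 / bump_width ** 2) / (2 * np.pi * bump_width ** 2)
    return (1 - bump_weight) * base + bump_weight * bumps.mean(axis=1)


def fluctuation_score(x, members, rng, n_neighbors=64, radius=0.15):
    """Probability at x minus the mean probability of perturbed copies of x.

    A large positive value means x sits on a local peak of the density.
    """
    neighbors = x + rng.normal(scale=radius, size=(n_neighbors, x.shape[0]))
    return toy_density(x, members)[0] - toy_density(neighbors, members).mean()


rng = np.random.default_rng(0)
members = rng.normal(size=(50, 2))      # records the toy model "memorized"
non_members = rng.normal(size=(50, 2))  # fresh records from the same distribution

member_scores = np.array([fluctuation_score(x, members, rng) for x in members])
non_member_scores = np.array(
    [fluctuation_score(x, members, rng) for x in non_members]
)

# Assumed threshold for this toy setup; a real attack would have to calibrate it.
threshold = 0.5
accuracy = 0.5 * (
    np.mean(member_scores > threshold) + np.mean(non_member_scores <= threshold)
)
print(f"balanced accuracy on the toy data: {accuracy:.2f}")
```

In this toy setting both groups come from the same distribution, so a record's location alone does not reveal membership; the score separates the groups only because members sit on local peaks. How probabilities are estimated and how records are perturbed for real diffusion models is specified in the paper and is not reproduced here.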
Related papers
- On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations.
We propose an autoregressive sampling approach that significantly improves performance in forecasting.
We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z)
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z)
- Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models [12.386823277082746]
Membership inference attacks (MIAs) on diffusion models have emerged as potential evidence of unauthorized data usage.
Our study delves into the evaluation of state-of-the-art MIAs on diffusion models and reveals critical flaws and overly optimistic performance estimates.
We introduce CopyMark, a more realistic MIA benchmark that distinguishes itself through the support for pre-trained diffusion models, unbiased datasets, and fair evaluation pipelines.
arXiv Detail & Related papers (2024-10-04T17:46:06Z)
- Model Will Tell: Training Membership Inference for Diffusion Models [15.16244745642374]
The Training Membership Inference (TMI) task aims to determine whether a specific sample has been used in the training process of a target model.
In this paper, we explore a novel perspective for the TMI task by leveraging the intrinsic generative priors within the diffusion model.
arXiv Detail & Related papers (2024-03-13T12:52:37Z)
- Fairness Feedback Loops: Training on Synthetic Data Amplifies Bias [47.79659355705916]
Model-induced distribution shifts (MIDS) occur as previous model outputs pollute new model training sets over generations of models.
We introduce a framework that allows us to track multiple MIDS over many generations, finding that they can lead to loss in performance, fairness, and minoritized group representation.
Despite these negative consequences, we identify how models might be used for positive, intentional interventions in their data ecosystems.
arXiv Detail & Related papers (2024-03-12T17:48:08Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks [17.243744418309593]
We propose an efficient MIA method against fairness-enhanced models based on fairness discrepancy results.
We also explore potential strategies for mitigating privacy leakages.
arXiv Detail & Related papers (2023-11-07T10:28:17Z)
- On Memorization in Diffusion Models [46.656797890144105]
We show that memorization behaviors tend to occur on smaller-sized datasets.
We quantify the impact of the influential factors on these memorization behaviors in terms of effective model memorization (EMM).
Our study holds practical significance for diffusion model users and offers clues to theoretical research in deep generative models.
arXiv Detail & Related papers (2023-10-04T09:04:20Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.