Towards More Realistic Membership Inference Attacks on Large Diffusion Models
- URL: http://arxiv.org/abs/2306.12983v2
- Date: Thu, 16 Nov 2023 14:57:58 GMT
- Title: Towards More Realistic Membership Inference Attacks on Large Diffusion Models
- Authors: Jan Dubiński, Antoni Kowalczuk, Stanisław Pawlak, Przemysław Rokita, Tomasz Trzciński, Paweł Morawiecki
- Abstract summary: Generative diffusion models, including Stable Diffusion and Midjourney, can generate visually appealing, diverse, and high-resolution images for various applications.
These models are trained on billions of internet-sourced images, raising significant concerns about the potential unauthorized use of copyright-protected images.
In this paper, we examine whether it is possible to determine if a specific image was used in the training set, a problem known in the cybersecurity community and referred to as a membership inference attack.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative diffusion models, including Stable Diffusion and Midjourney, can
generate visually appealing, diverse, and high-resolution images for various
applications. These models are trained on billions of internet-sourced images,
raising significant concerns about the potential unauthorized use of
copyright-protected images. In this paper, we examine whether it is possible to
determine if a specific image was used in the training set, a problem known in
the cybersecurity community and referred to as a membership inference attack.
Our focus is on Stable Diffusion, and we address the challenge of designing a
fair evaluation framework to answer this membership question. We propose a
methodology to establish a fair evaluation setup and apply it to Stable
Diffusion, enabling potential extensions to other generative models. Utilizing
this evaluation setup, we execute membership attacks (both known and newly
introduced). Our research reveals that previously proposed evaluation setups do
not provide a full understanding of the effectiveness of membership inference
attacks. We conclude that the membership inference attack remains a significant
challenge for large diffusion models (often deployed as black-box systems),
indicating that related privacy and copyright issues will persist in the
foreseeable future.
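The core idea behind many membership inference attacks can be illustrated with a simple threshold test: models tend to fit their training images better than unseen images, so a per-example statistic such as the denoising loss is often lower for members. The sketch below is illustrative only (the loss values and threshold are hypothetical, not taken from the paper):

```python
def infer_membership(losses, threshold):
    """Flag an example as a training-set member when its per-example
    loss falls below the threshold: models typically fit training
    (member) images better, so members incur lower loss."""
    return [loss < threshold for loss in losses]

# Hypothetical per-image denoising losses from a diffusion model:
member_losses = [0.11, 0.09, 0.14]     # images seen during training
nonmember_losses = [0.31, 0.27, 0.35]  # held-out images
preds = infer_membership(member_losses + nonmember_losses, threshold=0.2)
print(preds)  # [True, True, True, False, False, False]
```

In a black-box setting the attacker cannot read the loss directly and must estimate such a statistic from queries alone, which is part of why the paper finds the attack remains hard against deployed systems.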
Related papers
- Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models [58.065255696601604]
We use the compositional property of diffusion models, which allows multiple prompts to be leveraged in a single image generation.
We argue that it is essential to consider all possible approaches to image generation with diffusion models that can be employed by an adversary.
arXiv Detail & Related papers (2024-04-21T16:35:16Z)
- Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors [56.82596340418697]
We propose a simple yet effective framework comprising a pre-trained Stable Diffusion (SD) model containing rich generative priors, a unified head (U-head) capable of integrating hierarchical representations, and an adapted expert providing discriminative priors.
Comprehensive investigations unveil potential characteristics of Vermouth, such as varying granularity of perception concealed in latent variables at distinct time steps and various U-net stages.
The promising results demonstrate the potential of diffusion models as formidable learners, establishing their significance in furnishing informative and robust visual representations.
arXiv Detail & Related papers (2024-01-29T10:36:57Z)
- Revealing Vulnerabilities in Stable Diffusion via Targeted Attacks [41.531913152661296]
We formulate the problem of targeted adversarial attack on Stable Diffusion and propose a framework to generate adversarial prompts.
Specifically, we design a gradient-based embedding optimization method to craft reliable adversarial prompts that guide Stable Diffusion to generate specific images.
After obtaining successful adversarial prompts, we reveal the mechanisms that cause the vulnerability of the model.
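In spirit, gradient-based embedding optimization amounts to iterative gradient ascent on an attack objective in the prompt-embedding space. The objective, learning rate, and `grad_fn` below are illustrative stand-ins, not the paper's actual formulation:

```python
import numpy as np

def optimize_adversarial_embedding(embedding, grad_fn, steps=100, lr=0.1):
    """Iteratively adjust a prompt embedding by gradient ascent so it
    maximizes a differentiable attack objective; grad_fn returns the
    gradient of that objective with respect to the embedding."""
    e = np.asarray(embedding, dtype=float).copy()
    for _ in range(steps):
        e += lr * grad_fn(e)  # ascend the attack objective
    return e

# Toy objective -0.5 * ||e - target||^2, maximized when e == target:
target = np.array([1.0, -2.0, 0.5])
adv = optimize_adversarial_embedding(np.zeros(3), lambda e: target - e)
# adv converges toward target as the step count grows
```

In the real attack the gradient would come from backpropagating a similarity loss through the text encoder and diffusion model rather than from a closed-form toy objective.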
arXiv Detail & Related papers (2024-01-16T12:15:39Z)
- Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z)
- Privacy Threats in Stable Diffusion Models [0.7366405857677227]
This paper introduces a novel approach to membership inference attacks (MIA) targeting Stable Diffusion computer vision models.
MIAs aim to extract sensitive information about a model's training data, posing significant privacy concerns.
We devise a black-box MIA that only needs to query the victim model repeatedly.
arXiv Detail & Related papers (2023-11-15T20:31:40Z)
- Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models? [52.238883592674696]
Ring-A-Bell is a model-agnostic red-teaming tool for T2I diffusion models.
It identifies problematic prompts for diffusion models with the corresponding generation of inappropriate content.
Our results show that Ring-A-Bell, by manipulating safe prompting benchmarks, can transform prompts that were originally regarded as safe to evade existing safety mechanisms.
arXiv Detail & Related papers (2023-10-16T02:11:20Z)
- Data Forensics in Diffusion Models: A Systematic Analysis of Membership Privacy [62.16582309504159]
We develop a systematic analysis of membership inference attacks on diffusion models and propose novel attack methods tailored to each attack scenario.
Our approach exploits easily obtainable quantities and is highly effective, achieving near-perfect attack performance (>0.9 AUCROC) in realistic scenarios.
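The AUCROC figure above measures the probability that a randomly chosen member receives a higher attack score than a randomly chosen non-member, so 0.5 is chance and values near 1.0 indicate a near-perfect attack. A minimal pairwise computation (illustrative, not the paper's evaluation code):

```python
def auc_roc(member_scores, nonmember_scores):
    """AUCROC equals the probability that a randomly drawn member gets
    a higher attack score than a randomly drawn non-member (ties count
    one half), computed here over all member/non-member pairs."""
    wins = sum(1.0 if m > n else 0.5 if m == n else 0.0
               for m in member_scores for n in nonmember_scores)
    return wins / (len(member_scores) * len(nonmember_scores))

# A strong attack scores all members above all non-members:
print(auc_roc([0.9, 0.8, 0.7], [0.2, 0.3, 0.1]))  # 1.0
```

This pairwise definition is equivalent to the area under the ROC curve traced out by sweeping the decision threshold.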
arXiv Detail & Related papers (2023-02-15T17:37:49Z)
- Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.