Make the Most of Everything: Further Considerations on Disrupting Diffusion-based Customization
- URL: http://arxiv.org/abs/2503.13945v1
- Date: Tue, 18 Mar 2025 06:22:03 GMT
- Title: Make the Most of Everything: Further Considerations on Disrupting Diffusion-based Customization
- Authors: Long Tang, Dengpan Ye, Sirun Chen, Xiuwen Shi, Yunna Lv, Ziyi Liu,
- Abstract summary: We propose Dual Anti-Diffusion (DADiff), a two-stage adversarial attack targeting diffusion customization.<n> Experimental results on various mainstream facial datasets demonstrate 10%-30% improvements in cross-prompt, keyword mismatch, cross-model, and cross-mechanism anti-customization.
- Score: 11.704329867109237
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The fine-tuning technique for text-to-image diffusion models facilitates image customization but risks privacy breaches and opinion manipulation. Current research focuses on prompt- or image-level adversarial attacks for anti-customization, yet it overlooks the correlation between these two levels and the relationship between internal modules and inputs. This hinders anti-customization performance in practical threat scenarios. We propose Dual Anti-Diffusion (DADiff), a two-stage adversarial attack targeting diffusion customization, which, for the first time, integrates the adversarial prompt-level attack into the generation process of image-level adversarial examples. In stage 1, we generate prompt-level adversarial vectors to guide the subsequent image-level attack. In stage 2, besides conducting the end-to-end attack on the UNet model, we disrupt its self- and cross-attention modules, aiming to break the correlations between image pixels and align the cross-attention results computed using instance prompts and adversarial prompt vectors within the images. Furthermore, we introduce a local random timestep gradient ensemble strategy, which updates adversarial perturbations by integrating random gradients from multiple segmented timesets. Experimental results on various mainstream facial datasets demonstrate 10%-30% improvements in cross-prompt, keyword mismatch, cross-model, and cross-mechanism anti-customization with DADiff compared to existing methods.
Related papers
- Towards Invisible Backdoor Attack on Text-to-Image Diffusion Model [70.03122709795122]
Backdoor attacks targeting text-to-image diffusion models have advanced rapidly.
Current backdoor samples often exhibit two key abnormalities compared to benign samples.
We propose a novel Invisible Backdoor Attack (IBA) to enhance the stealthiness of backdoor samples.
arXiv Detail & Related papers (2025-03-22T10:41:46Z) - PB-UAP: Hybrid Universal Adversarial Attack For Image Segmentation [15.702469692874816]
We propose a novel universal adversarial attack method designed for segmentation models.<n>Our method achieves high attack success rates surpassing the state-of-the-art methods, and exhibits strong transferability across different models.
arXiv Detail & Related papers (2024-12-21T14:46:01Z) - Boosting Imperceptibility of Stable Diffusion-based Adversarial Examples Generation with Momentum [13.305800254250789]
We propose a novel framework, Stable Diffusion-based Momentum Integrated Adversarial Examples (SD-MIAE)
It generates adversarial examples that can effectively mislead neural network classifiers while maintaining visual imperceptibility and preserving the semantic similarity to the original class label.
Experimental results demonstrate that SD-MIAE achieves a high misclassification rate of 79%, improving by 35% over the state-of-the-art method.
arXiv Detail & Related papers (2024-10-17T01:22:11Z) - AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with $>99%$ detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - Self-Supervised Representation Learning for Adversarial Attack Detection [6.528181610035978]
Supervised learning-based adversarial attack detection methods rely on a large number of labeled data.
We propose a self-supervised representation learning framework for the adversarial attack detection task to address this drawback.
arXiv Detail & Related papers (2024-07-05T09:37:16Z) - Disrupting Diffusion: Token-Level Attention Erasure Attack against Diffusion-based Customization [19.635385099376066]
malicious users have misused diffusion-based customization methods like DreamBooth to create fake images.
In this paper, we propose DisDiff, a novel adversarial attack method to disrupt the diffusion model outputs.
arXiv Detail & Related papers (2024-05-31T02:45:31Z) - Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z) - IRAD: Implicit Representation-driven Image Resampling against Adversarial Attacks [16.577595936609665]
We introduce a novel approach to counter adversarial attacks, namely, image resampling.
Image resampling transforms a discrete image into a new one, simulating the process of scene recapturing or rerendering as specified by a geometrical transformation.
We show that our method significantly enhances the adversarial robustness of diverse deep models against various attacks while maintaining high accuracy on clean images.
arXiv Detail & Related papers (2023-10-18T11:19:32Z) - Improving Adversarial Robustness of Masked Autoencoders via Test-time
Frequency-domain Prompting [133.55037976429088]
We investigate the adversarial robustness of vision transformers equipped with BERT pretraining (e.g., BEiT, MAE)
A surprising observation is that MAE has significantly worse adversarial robustness than other BERT pretraining methods.
We propose a simple yet effective way to boost the adversarial robustness of MAE.
arXiv Detail & Related papers (2023-08-20T16:27:17Z) - Improving Adversarial Transferability via Intermediate-level
Perturbation Decay [79.07074710460012]
We develop a novel intermediate-level method that crafts adversarial examples within a single stage of optimization.
Experimental results show that it outperforms state-of-the-arts by large margins in attacking various victim models.
arXiv Detail & Related papers (2023-04-26T09:49:55Z) - Understanding Adversarial Examples from the Mutual Influence of Images
and Perturbations [83.60161052867534]
We analyze adversarial examples by disentangling the clean images and adversarial perturbations, and analyze their influence on each other.
Our results suggest a new perspective towards the relationship between images and universal perturbations.
We are the first to achieve the challenging task of a targeted universal attack without utilizing original training data.
arXiv Detail & Related papers (2020-07-13T05:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.