Diffusion Models for Imperceptible and Transferable Adversarial Attack
- URL: http://arxiv.org/abs/2305.08192v2
- Date: Thu, 30 Nov 2023 14:40:54 GMT
- Title: Diffusion Models for Imperceptible and Transferable Adversarial Attack
- Authors: Jianqi Chen, Hao Chen, Keyan Chen, Yilan Zhang, Zhengxia Zou, Zhenwei Shi
- Abstract summary: We propose a novel imperceptible and transferable attack by leveraging both the generative and discriminative power of diffusion models.
Our proposed method, DiffAttack, is the first that introduces diffusion models into the adversarial attack field.
- Score: 23.991194050494396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many existing adversarial attacks generate $L_p$-norm perturbations on image
RGB space. Despite some achievements in transferability and attack success
rate, the crafted adversarial examples are easily perceived by human eyes.
Towards visual imperceptibility, some recent works explore unrestricted attacks
without $L_p$-norm constraints, yet they lack transferability when attacking
black-box models. In this work, we propose a novel imperceptible and
transferable attack by leveraging both the generative and discriminative power
of diffusion models. Specifically, instead of direct manipulation in pixel
space, we craft perturbations in the latent space of diffusion models. Combined
with well-designed content-preserving structures, we can generate
human-insensitive perturbations embedded with semantic clues. For better
transferability, we further "deceive" the diffusion model which can be viewed
as an implicit recognition surrogate, by distracting its attention away from
the target regions. To our knowledge, our proposed method, DiffAttack, is the
first that introduces diffusion models into the adversarial attack field.
Extensive experiments on various model structures, datasets, and defense
methods have demonstrated the superiority of our attack over the existing
attack methods.
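To make the mechanism above more concrete, the following is a minimal PyTorch sketch of latent-space adversarial optimization in the spirit of DiffAttack; it is not the authors' implementation. The callables `decode` (latent to image), `classifier` (surrogate model), and `attention_maps` (cross-attention extractor) are assumed placeholders, and the loss weights are illustrative.

```python
import torch
import torch.nn.functional as F

def diffattack_sketch(z_init, decode, classifier, attention_maps, label,
                      steps=50, lr=0.01, w_attn=1.0, w_fid=10.0):
    """Minimal sketch (not the official DiffAttack): optimize a diffusion
    latent so the decoded image fools a surrogate classifier while the
    diffusion model's cross-attention is spread away from the object."""
    x_ref = decode(z_init).detach()                 # clean reconstruction for fidelity
    z = z_init.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)

    for _ in range(steps):
        x_adv = decode(z)                           # latent -> image (assumed differentiable)
        logits = classifier(x_adv)

        # Untargeted attack: maximize cross-entropy w.r.t. the true label.
        attack_loss = -F.cross_entropy(logits, label)

        # "Distract" the implicit recognition surrogate: penalize sharply
        # peaked cross-attention over the target regions.
        attn = attention_maps(z)                    # assumed helper, e.g. (heads, H, W)
        distract_loss = attn.var()

        # Content preservation for imperceptibility.
        fidelity_loss = F.mse_loss(x_adv, x_ref)

        loss = attack_loss + w_attn * distract_loss + w_fid * fidelity_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    return decode(z).detach()
```

The three terms mirror the abstract: fooling the surrogate classifier, distracting the diffusion model's attention, and preserving content for imperceptibility.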
Related papers
- Pixel is a Barrier: Diffusion Models Are More Adversarially Robust Than We Think [14.583181596370386]
Adversarial examples for diffusion models are widely used as solutions for safety concerns.
This may mislead us into thinking that diffusion models are as vulnerable to adversarial attacks as most deep models.
In this paper, we present novel findings: even though gradient-based white-box attacks can be used to attack latent diffusion models (LDMs), they fail to attack pixel-space diffusion models (PDMs).
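As a rough, hedged illustration of the kind of gradient-based white-box attack discussed here (not this paper's code), the sketch below runs a PGD-style loop that pushes an image's latent, under an assumed differentiable LDM `encoder`, away from the clean latent within an $L_\infty$ budget; the paper's finding is that analogous attacks do not carry over to pixel-space diffusion models.

```python
import torch

def pgd_against_ldm_encoder(x, encoder, eps=8/255, alpha=1/255, steps=40):
    """Hedged sketch of a white-box attack on a latent diffusion model:
    maximize the distance between the adversarial latent and the clean latent
    within an L_inf budget. `encoder` is an assumed image->latent module."""
    z_clean = encoder(x).detach()
    delta = torch.zeros_like(x, requires_grad=True)

    for _ in range(steps):
        loss = (encoder(x + delta) - z_clean).pow(2).mean()  # latent distortion
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()                # gradient ascent step
            delta.clamp_(-eps, eps)                           # stay in the L_inf ball
            delta.data = (x + delta).clamp(0, 1) - x          # keep the image valid
        delta.grad.zero_()

    return (x + delta).detach()
```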
arXiv Detail & Related papers (2024-04-20T08:28:43Z) - Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent
Diffusion Model [61.53213964333474]
We propose a unified framework, Adv-Diffusion, that can generate imperceptible adversarial identity perturbations in the latent space rather than the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - Improving Adversarial Transferability by Stable Diffusion [36.97548018603747]
Deep neural networks (DNNs) are susceptible to adversarial examples, which introduce imperceptible perturbations to benign samples, deceiving predictions.
We introduce a novel attack method called Stable Diffusion Attack Method (SDAM), which incorporates samples generated by Stable Diffusion to augment input images.
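A hedged sketch of how such diffusion-based input augmentation could plug into a momentum FGSM transfer attack is given below; it is not the paper's implementation. `aug_images` is an assumed, precomputed list of Stable Diffusion variants of the input, and the mixing weight and step sizes are illustrative.

```python
import torch
import torch.nn.functional as F

def sdam_style_attack(x, y, model, aug_images, eps=8/255, alpha=2/255,
                      steps=10, mix=0.2, momentum=1.0):
    """Hedged sketch of a Stable-Diffusion-augmented transfer attack: average
    gradients over copies of the input mixed with precomputed diffusion
    variants, then take momentum FGSM steps. Not the SDAM implementation."""
    delta = torch.zeros_like(x)
    g = torch.zeros_like(x)

    for _ in range(steps):
        grad_sum = torch.zeros_like(x)
        for aug in aug_images:                        # assumed Stable Diffusion variants of x
            x_mix = ((1 - mix) * (x + delta) + mix * aug).requires_grad_(True)
            loss = F.cross_entropy(model(x_mix), y)
            grad_sum += torch.autograd.grad(loss, x_mix)[0]
        grad = grad_sum / len(aug_images)

        g = momentum * g + grad / grad.abs().mean()   # momentum accumulation
        delta = (delta + alpha * g.sign()).clamp(-eps, eps)
        delta = (x + delta).clamp(0, 1) - x           # keep the image in [0, 1]

    return (x + delta).detach()
```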
arXiv Detail & Related papers (2023-11-18T09:10:07Z) - Semantic Adversarial Attacks via Diffusion Models [30.169827029761702]
Semantic adversarial attacks focus on changing semantic attributes of clean examples, such as color, context, and features.
We propose a framework to quickly generate a semantic adversarial attack by leveraging recent diffusion models.
Our approaches achieve approximately 100% attack success rate in multiple settings, with a best FID of 36.61.
arXiv Detail & Related papers (2023-09-14T02:57:48Z) - Data Forensics in Diffusion Models: A Systematic Analysis of Membership
Privacy [62.16582309504159]
We develop a systematic analysis of membership inference attacks on diffusion models and propose novel attack methods tailored to each attack scenario.
Our approach exploits easily obtainable quantities and is highly effective, achieving near-perfect attack performance (>0.9 AUCROC) in realistic scenarios.
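For intuition only, here is a hedged sketch of a threshold-style membership inference attack on a diffusion model; the paper's tailored attacks are more involved. It scores a sample by its average denoising error at a fixed timestep, since training members tend to be denoised more accurately; the `model(x_noisy, t)` noise-prediction interface and the unscaled noising step are assumptions.

```python
import torch

def denoising_error(model, x, t, noise):
    """Assumed interface: `model(x_noisy, t)` predicts the added noise.
    The noising step is simplified; a real scheduler scales x and noise by t."""
    x_noisy = x + noise
    pred = model(x_noisy, t)
    return (pred - noise).pow(2).flatten(1).mean(dim=1)   # per-sample MSE

def membership_score(model, x, t=100, n_trials=4):
    """Average denoising error over a few noise draws; threshold the negated
    error as a membership score (higher => more likely a training member)."""
    errs = []
    with torch.no_grad():
        for _ in range(n_trials):
            noise = torch.randn_like(x)
            t_batch = torch.full((x.shape[0],), t, dtype=torch.long, device=x.device)
            errs.append(denoising_error(model, x, t_batch, noise))
    return -torch.stack(errs).mean(dim=0)
```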
arXiv Detail & Related papers (2023-02-15T17:37:49Z) - Towards Understanding and Boosting Adversarial Transferability from a
Distribution Perspective [80.02256726279451]
Adversarial attacks against deep neural networks (DNNs) have received broad attention in recent years.
We propose a novel method that crafts adversarial examples by manipulating the distribution of the image.
Our method can significantly improve the transferability of the crafted attacks and achieves state-of-the-art performance in both untargeted and targeted scenarios.
arXiv Detail & Related papers (2022-10-09T09:58:51Z) - Frequency Domain Model Augmentation for Adversarial Attack [91.36850162147678]
For black-box attacks, the gap between the substitute model and the victim model is usually large.
We propose a novel spectrum simulation attack to craft more transferable adversarial examples against both normally trained and defense models.
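The sketch below illustrates the spectrum-simulation idea in a hedged form: each gradient step averages over copies of the input that have been randomly rescaled in the frequency domain, which acts as an implicit model augmentation. The FFT-based transform stands in for the DCT typically used in such attacks, and all hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F

def spectrum_augment(x, rho=0.5, sigma=16/255):
    """Stand-in spectrum transform: add noise, rescale each frequency by a
    random factor in the FFT domain (the usual formulation uses a DCT),
    and transform back to the image domain."""
    x_noisy = x + sigma * torch.randn_like(x)
    spec = torch.fft.fft2(x_noisy)
    mask = 1 + rho * (2 * torch.rand_like(x) - 1)      # random per-frequency scaling
    return torch.fft.ifft2(spec * mask).real.clamp(0, 1)

def spectrum_simulation_attack(x, y, model, eps=8/255, alpha=2/255,
                               steps=10, n_spectra=8):
    """Hedged sketch: average gradients over several spectrum-augmented copies
    of the input per FGSM step to simulate a diverse set of substitute models."""
    delta = torch.zeros_like(x)
    for _ in range(steps):
        grad = torch.zeros_like(x)
        for _ in range(n_spectra):
            x_aug = spectrum_augment(x + delta).requires_grad_(True)
            loss = F.cross_entropy(model(x_aug), y)
            grad += torch.autograd.grad(loss, x_aug)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = (x + delta).clamp(0, 1) - x            # keep the adversarial image valid
    return (x + delta).detach()
```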
arXiv Detail & Related papers (2022-07-12T08:26:21Z) - Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on the widely-used dataset demonstrate the effectiveness of our attack method with a 12.85% higher success rate of transfer attack compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z) - Boosting the Transferability of Video Adversarial Examples via Temporal
Translation [82.0745476838865]
Adversarial examples are transferable, which makes them feasible for black-box attacks in real-world applications.
We introduce a temporal translation attack method, which optimizes the adversarial perturbations over a set of temporally translated video clips.
Experiments on the Kinetics-400 dataset and the UCF-101 dataset demonstrate that our method can significantly boost the transferability of video adversarial examples.
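A hedged sketch of a temporal-translation-style attack is shown below (not the paper's code): gradients are averaged over temporally rolled copies of the clip so the perturbation does not overfit a single frame ordering. The `(batch, frames, C, H, W)` layout and the classifier interface are assumptions.

```python
import torch
import torch.nn.functional as F

def temporal_translation_attack(video, y, model, shifts=(-2, -1, 0, 1, 2),
                                eps=8/255, alpha=2/255, steps=10):
    """Hedged sketch of a temporal-translation-style video attack: average
    gradients over temporally rolled copies of the clip so the perturbation
    does not overfit one frame ordering. `video` is assumed (B, T, C, H, W)."""
    delta = torch.zeros_like(video)
    for _ in range(steps):
        grad = torch.zeros_like(video)
        for s in shifts:
            v_shift = torch.roll(video + delta, shifts=s, dims=1).requires_grad_(True)
            loss = F.cross_entropy(model(v_shift), y)
            g = torch.autograd.grad(loss, v_shift)[0]
            grad += torch.roll(g, shifts=-s, dims=1)   # undo the shift before accumulating
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = (video + delta).clamp(0, 1) - video    # keep frames in [0, 1]
    return (video + delta).detach()
```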
arXiv Detail & Related papers (2021-10-18T07:52:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.