Semantic Adversarial Attacks via Diffusion Models
- URL: http://arxiv.org/abs/2309.07398v1
- Date: Thu, 14 Sep 2023 02:57:48 GMT
- Title: Semantic Adversarial Attacks via Diffusion Models
- Authors: Chenan Wang, Jinhao Duan, Chaowei Xiao, Edward Kim, Matthew Stamm,
Kaidi Xu
- Abstract summary: Semantic adversarial attacks focus on changing semantic attributes of clean examples, such as color, context, and features.
We propose a framework to quickly generate a semantic adversarial attack by leveraging recent diffusion models.
Our approaches achieve an attack success rate of approximately 100% in multiple settings, with a best FID of 36.61.
- Score: 30.169827029761702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional adversarial attacks concentrate on manipulating clean examples in
the pixel space by adding adversarial perturbations. By contrast, semantic
adversarial attacks focus on changing semantic attributes of clean examples,
such as color, context, and features, which are more feasible in the real
world. In this paper, we propose a framework to quickly generate a semantic
adversarial attack by leveraging recent diffusion models since semantic
information is included in the latent space of well-trained diffusion models.
This framework has two variants: 1) the Semantic Transformation
(ST) approach fine-tunes the latent space of the generated image and/or the
diffusion model itself; 2) the Latent Masking (LM) approach masks the latent
space with another target image and local backpropagation-based interpretation
methods. Additionally, the ST approach can be applied in either white-box or
black-box settings. Extensive experiments are conducted on CelebA-HQ and AFHQ
datasets, and our framework demonstrates great fidelity, generalizability, and
transferability compared to other baselines. Our approaches achieve an attack
success rate of approximately 100% in multiple settings, with a best FID of
36.61. Code is available at
https://github.com/steven202/semantic_adv_via_dm.
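The two variants can be pictured with a minimal sketch. The code below is a hypothetical illustration of the Semantic Transformation (ST) idea only, not the authors' implementation (see the linked repository for that): a latent code is fine-tuned with gradient steps until the image reconstructed by a frozen diffusion decoder is misclassified. The names `decode`, `classifier`, and `z_init` are placeholders.

```python
# Hypothetical sketch of the Semantic Transformation (ST) idea: fine-tune a
# latent code so the image reconstructed by a frozen diffusion decoder fools a
# classifier. `decode`, `classifier`, and `z_init` are placeholder names, not
# the authors' API.
import torch
import torch.nn.functional as F

def st_attack(decode, classifier, z_init, true_label, steps=50, lr=1e-2):
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x = decode(z)                      # image generated from the latent
        logits = classifier(x)
        if logits.argmax(dim=1).ne(true_label).all():
            break                          # every sample is now misclassified
        loss = -F.cross_entropy(logits, true_label)  # push away from true label
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decode(z).detach()
```

The Latent Masking (LM) variant would instead blend the latent with that of a target image under a mask derived from a backpropagation-based interpretation method, rather than taking gradient steps on the latent itself.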
Related papers
- Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation [54.96563068182733]
We propose Modality Adaptation with text-to-image Diffusion Models (MADM) for the semantic segmentation task.
MADM utilizes text-to-image diffusion models pre-trained on extensive image-text pairs to enhance the model's cross-modality capabilities.
We show that MADM achieves state-of-the-art adaptation performance across various modality tasks, including images to depth, infrared, and event modalities.
arXiv Detail & Related papers (2024-10-29T03:49:40Z) - Sparse Repellency for Shielded Generation in Text-to-image Diffusion Models [29.083402085790016]
We propose a method that coaxes the sampled trajectories of pretrained diffusion models to land on images that fall outside of a reference set.
We achieve this by adding repellency terms to the diffusion SDE throughout the generation trajectory (a rough sketch of this idea appears after this list).
We show that adding SPELL to popular diffusion models improves their diversity while impacting their FID only marginally, and performs comparatively better than other recent training-free diversity methods.
arXiv Detail & Related papers (2024-10-08T13:26:32Z) - Imperceptible Face Forgery Attack via Adversarial Semantic Mask [59.23247545399068]
We propose an Adversarial Semantic Mask Attack framework (ASMA) which can generate adversarial examples with good transferability and invisibility.
Specifically, we propose a novel adversarial semantic mask generative model, which can constrain generated perturbations in local semantic regions for good stealthiness.
arXiv Detail & Related papers (2024-06-16T10:38:11Z) - Struggle with Adversarial Defense? Try Diffusion [8.274506117450628]
Adversarial attacks induce misclassification by introducing subtle perturbations.
Diffusion-based adversarial training often encounters convergence challenges and high computational expense.
We propose the Truth Maximization Diffusion Classifier (TMDC) to overcome these issues.
arXiv Detail & Related papers (2024-04-12T06:52:40Z) - Breaking the Black-Box: Confidence-Guided Model Inversion Attack for
Distribution Shift [0.46040036610482665]
Model inversion attacks (MIAs) seek to infer the private training data of a target classifier by generating synthetic images that reflect the characteristics of the target class.
Previous studies have relied on full access to the target model, which is not practical in real-world scenarios.
This paper proposes a Confidence-Guided Model Inversion attack method called CG-MI.
arXiv Detail & Related papers (2024-02-28T03:47:17Z) - Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent
Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - Diffusion Models for Imperceptible and Transferable Adversarial Attack [23.991194050494396]
We propose a novel imperceptible and transferable attack by leveraging both the generative and discriminative power of diffusion models.
Our proposed method, DiffAttack, is the first that introduces diffusion models into the adversarial attack field.
arXiv Detail & Related papers (2023-05-14T16:02:36Z) - Local Black-box Adversarial Attacks: A Query Efficient Approach [64.98246858117476]
Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios.
We propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks.
We conduct extensive experiments to show that our framework can significantly improve the query efficiency during black-box perturbing with a high attack success rate.
arXiv Detail & Related papers (2021-01-04T15:32:16Z) - Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm -- a black-box attack towards mainstream normally trained and defense models.
We significantly improve the success rate by 9.2% for defense models and 3.7% for normally trained models on average.
arXiv Detail & Related papers (2020-07-14T01:50:22Z) - Perturbing Across the Feature Hierarchy to Improve Standard and Strict
Blackbox Attack Transferability [100.91186458516941]
We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.
We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance.
We analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
arXiv Detail & Related papers (2020-04-29T16:00:13Z)
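As referenced in the Sparse Repellency (SPELL) entry above, the repellency idea can be pictured as an extra term added to the score used during sampling, active only when the trajectory comes within a fixed radius of a reference image. The sketch below is a hypothetical illustration under those assumptions, not the SPELL implementation; `score_model`, `refs`, `radius`, and `weight` are invented names.

```python
# Hypothetical sketch of a sparse repellency term: the score used by a
# Langevin-style sampler is augmented with a push away from any reference
# image the current iterate gets too close to. Not the SPELL code.
import torch

def repelled_score(score_model, x_t, t, refs, radius=1.0, weight=1.0):
    s = score_model(x_t, t)                              # standard score estimate
    for r in refs:                                       # refs: reference image tensors
        diff = x_t - r
        dist = diff.flatten(1).norm(dim=1).clamp_min(1e-8)
        dist = dist.view(-1, *([1] * (x_t.dim() - 1)))   # broadcast over image dims
        close = (dist < radius).float()                  # only repel inside the radius
        s = s + close * weight * diff / dist             # unit vector away from r
    return s

# In an update x <- x + step * repelled_score(...) + noise, the extra term
# steers samples away from the reference set and vanishes elsewhere.
```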
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.