Semantic Adversarial Attacks via Diffusion Models
- URL: http://arxiv.org/abs/2309.07398v1
- Date: Thu, 14 Sep 2023 02:57:48 GMT
- Title: Semantic Adversarial Attacks via Diffusion Models
- Authors: Chenan Wang, Jinhao Duan, Chaowei Xiao, Edward Kim, Matthew Stamm,
Kaidi Xu
- Abstract summary: Semantic adversarial attacks focus on changing semantic attributes of clean examples, such as color, context, and features.
We propose a framework to quickly generate a semantic adversarial attack by leveraging recent diffusion models.
Our approaches achieve an attack success rate of approximately 100% in multiple settings, with a best FID of 36.61.
- Score: 30.169827029761702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional adversarial attacks concentrate on manipulating clean examples in
the pixel space by adding adversarial perturbations. By contrast, semantic
adversarial attacks focus on changing semantic attributes of clean examples,
such as color, context, and features, which are more feasible in the real
world. In this paper, we propose a framework to quickly generate a semantic
adversarial attack by leveraging recent diffusion models since semantic
information is included in the latent space of well-trained diffusion models.
This framework has two variants: 1) the Semantic Transformation
(ST) approach fine-tunes the latent space of the generated image and/or the
diffusion model itself; 2) the Latent Masking (LM) approach masks the latent
space with another target image and local backpropagation-based interpretation
methods. Additionally, the ST approach can be applied in either white-box or
black-box settings. Extensive experiments are conducted on CelebA-HQ and AFHQ
datasets, and our framework demonstrates great fidelity, generalizability, and
transferability compared to other baselines. Our approaches achieve an attack
success rate of approximately 100% in multiple settings, with a best FID of
36.61. Code is available at
https://github.com/steven202/semantic_adv_via_dm.
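The two variants can be pictured with a minimal sketch. The code below is a hypothetical illustration of the Semantic Transformation (ST) idea only, not the authors' implementation (see the linked repository for that): a latent code is fine-tuned with gradient steps until the image reconstructed by a frozen diffusion decoder is misclassified. The names `decode`, `classifier`, and `z_init` are placeholders.

```python
# Hypothetical sketch of the Semantic Transformation (ST) idea: fine-tune a
# latent code so the image reconstructed by a frozen diffusion decoder fools a
# classifier. `decode`, `classifier`, and `z_init` are placeholder names, not
# the authors' API.
import torch
import torch.nn.functional as F

def st_attack(decode, classifier, z_init, true_label, steps=50, lr=1e-2):
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x = decode(z)                      # image generated from the latent
        logits = classifier(x)
        if logits.argmax(dim=1).ne(true_label).all():
            break                          # every sample is now misclassified
        loss = -F.cross_entropy(logits, true_label)  # push away from true label
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decode(z).detach()
```

The Latent Masking (LM) variant would instead blend the latent with that of a target image under a mask derived from a backpropagation-based interpretation method, rather than taking gradient steps on the latent itself.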
Related papers
- Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation [54.96563068182733]
We propose Modality Adaptation with text-to-image Diffusion Models (MADM) for the semantic segmentation task.
MADM utilizes text-to-image diffusion models pre-trained on extensive image-text pairs to enhance the model's cross-modality capabilities.
We show that MADM achieves state-of-the-art adaptation performance across various modality tasks, including images to depth, infrared, and event modalities.
arXiv Detail & Related papers (2024-10-29T03:49:40Z) - Sparse Repellency for Shielded Generation in Text-to-image Diffusion Models [29.083402085790016]
We propose a method that coaxes the sampled trajectories of pretrained diffusion models to land on images that fall outside of a reference set.
We achieve this by adding repellency terms to the diffusion SDE throughout the generation trajectory (a rough sketch of this idea appears after this list).
We show that adding SPELL to popular diffusion models improves their diversity while impacting their FID only marginally, and performs comparatively better than other recent training-free diversity methods.
arXiv Detail & Related papers (2024-10-08T13:26:32Z) - Imperceptible Face Forgery Attack via Adversarial Semantic Mask [59.23247545399068]
We propose an Adversarial Semantic Mask Attack framework (ASMA) which can generate adversarial examples with good transferability and invisibility.
Specifically, we propose a novel adversarial semantic mask generative model, which can constrain generated perturbations in local semantic regions for good stealthiness.
arXiv Detail & Related papers (2024-06-16T10:38:11Z) - Struggle with Adversarial Defense? Try Diffusion [8.274506117450628]
Adversarial attacks induce misclassification by introducing subtle perturbations.
Diffusion-based adversarial training often encounters convergence challenges and high computational expense.
We propose the Truth Maximization Diffusion Classifier (TMDC) to overcome these issues.
arXiv Detail & Related papers (2024-04-12T06:52:40Z) - Breaking the Black-Box: Confidence-Guided Model Inversion Attack for
Distribution Shift [0.46040036610482665]
Model inversion attacks (MIAs) seek to infer the private training data of a target classifier by generating synthetic images that reflect the characteristics of the target class.
Previous studies have relied on full access to the target model, which is not practical in real-world scenarios.
This paper proposes a Confidence-Guided Model Inversion attack method called CG-MI.
arXiv Detail & Related papers (2024-02-28T03:47:17Z) - Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent
Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - Diffusion Models for Imperceptible and Transferable Adversarial Attack [23.991194050494396]
We propose a novel imperceptible and transferable attack by leveraging both the generative and discriminative power of diffusion models.
Our proposed method, DiffAttack, is the first that introduces diffusion models into the adversarial attack field.
arXiv Detail & Related papers (2023-05-14T16:02:36Z) - Local Black-box Adversarial Attacks: A Query Efficient Approach [64.98246858117476]
Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios.
We propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks.
We conduct extensive experiments to show that our framework can significantly improve the query efficiency during black-box perturbing with a high attack success rate.
arXiv Detail & Related papers (2021-01-04T15:32:16Z) - Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm -- a black-box attack towards mainstream normally trained and defense models.
We significantly improve the success rate by 9.2% for defense models and 3.7% for normally trained models on average.
arXiv Detail & Related papers (2020-07-14T01:50:22Z) - Perturbing Across the Feature Hierarchy to Improve Standard and Strict
Blackbox Attack Transferability [100.91186458516941]
We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.
We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance.
We analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
arXiv Detail & Related papers (2020-04-29T16:00:13Z)
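As referenced in the Sparse Repellency (SPELL) entry above, the repellency idea can be pictured as an extra term added to the score used during sampling, active only when the trajectory comes within a fixed radius of a reference image. The sketch below is a hypothetical illustration under those assumptions, not the SPELL implementation; `score_model`, `refs`, `radius`, and `weight` are invented names.

```python
# Hypothetical sketch of a sparse repellency term: the score used by a
# Langevin-style sampler is augmented with a push away from any reference
# image the current iterate gets too close to. Not the SPELL code.
import torch

def repelled_score(score_model, x_t, t, refs, radius=1.0, weight=1.0):
    s = score_model(x_t, t)                              # standard score estimate
    for r in refs:                                       # refs: reference image tensors
        diff = x_t - r
        dist = diff.flatten(1).norm(dim=1).clamp_min(1e-8)
        dist = dist.view(-1, *([1] * (x_t.dim() - 1)))   # broadcast over image dims
        close = (dist < radius).float()                  # only repel inside the radius
        s = s + close * weight * diff / dist             # unit vector away from r
    return s

# In an update x <- x + step * repelled_score(...) + noise, the extra term
# steers samples away from the reference set and vanishes elsewhere.
```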
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.