Rethinking Target Label Conditioning in Adversarial Attacks: A 2D Tensor-Guided Generative Approach
- URL: http://arxiv.org/abs/2504.14137v1
- Date: Sat, 19 Apr 2025 02:08:48 GMT
- Title: Rethinking Target Label Conditioning in Adversarial Attacks: A 2D Tensor-Guided Generative Approach
- Authors: Hangyu Liu, Bo Peng, Pengxiang Ding, Donglin Wang
- Abstract summary: Multi-target adversarial attacks have garnered significant attention due to their ability to generate adversarial images for multiple target classes simultaneously. To address this gap, we first identify and validate that the semantic feature quality and quantity are critical factors affecting the transferability of targeted attacks. We propose the 2D-TGAF framework, which leverages the powerful generative capabilities of diffusion models to encode target labels into two-dimensional semantic tensors.
- Score: 26.259289475583522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compared to single-target adversarial attacks, multi-target attacks have garnered significant attention due to their ability to generate adversarial images for multiple target classes simultaneously. Existing generative approaches for multi-target attacks mainly analyze the effect of using target labels on noise generation from a theoretical perspective, lacking practical validation and comprehensive summarization. To address this gap, we first identify and validate that semantic feature quality and quantity are critical factors affecting the transferability of targeted attacks: 1) Feature quality refers to the structural and detailed completeness of the implanted target features, as deficiencies may result in the loss of key discriminative information; 2) Feature quantity refers to the spatial sufficiency of the implanted target features, as inadequacy limits the victim model's attention to these features. Based on these findings, we propose the 2D Tensor-Guided Adversarial Fusion (2D-TGAF) framework, which leverages the powerful generative capabilities of diffusion models to encode target labels into two-dimensional semantic tensors for guiding adversarial noise generation. Additionally, we design a novel masking strategy tailored to the training process, ensuring that parts of the generated noise retain complete semantic information about the target class. Extensive experiments on the standard ImageNet dataset demonstrate that 2D-TGAF consistently surpasses state-of-the-art methods in attack success rates, both on normally trained models and across various defense mechanisms.
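The abstract describes 2D-TGAF only at a high level. As a rough illustration of the two ideas it names, the minimal PyTorch sketch below conditions an adversarial-noise generator on a 2D semantic tensor for the target class and applies a training-time mask so that a sub-region of the perturbation must carry complete target semantics on its own. All names (NoiseGenerator, random_block_mask, the surrogate model f) and the specific loss terms are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch, not the authors' code: target-tensor-conditioned noise
# generation plus a training-time mask that preserves complete target
# semantics in part of the perturbation.
import torch
import torch.nn as nn


class NoiseGenerator(nn.Module):
    """Toy generator: fuses the input image with a 2D target tensor and
    emits an L-infinity-bounded perturbation."""

    def __init__(self, channels: int = 3, cond_channels: int = 1, eps: float = 16 / 255):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(channels + cond_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # cond: (B, 1, H, W) 2D semantic tensor for the target class,
        # e.g. prepared offline from a class-conditional diffusion model.
        return torch.tanh(self.net(torch.cat([x, cond], dim=1))) * self.eps


def random_block_mask(shape, keep_frac: float = 0.5) -> torch.Tensor:
    """Illustrative masking strategy: keep one random contiguous block of the
    perturbation so that region alone must encode the target semantics."""
    b, _, h, w = shape
    mask = torch.zeros(b, 1, h, w)
    bh, bw = int(h * keep_frac), int(w * keep_frac)
    for i in range(b):
        top = torch.randint(0, h - bh + 1, (1,)).item()
        left = torch.randint(0, w - bw + 1, (1,)).item()
        mask[i, :, top:top + bh, left:left + bw] = 1.0
    return mask


def training_step(gen, f, x, cond, y_t, optimizer):
    """One update: push both the full and the masked adversarial image
    toward the target label y_t on a surrogate classifier f."""
    delta = gen(x, cond)
    mask = random_block_mask(delta.shape).to(x.device)
    x_full = torch.clamp(x + delta, 0, 1)
    x_masked = torch.clamp(x + delta * mask, 0, 1)
    loss = (nn.functional.cross_entropy(f(x_full), y_t)
            + nn.functional.cross_entropy(f(x_masked), y_t))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The two-term loss is one plausible way to realize "parts of the generated noise retain complete semantic information about the target class": even a cropped portion of the perturbation is trained to flip the surrogate's prediction to the target.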
Related papers
- AnywhereDoor: Multi-Target Backdoor Attacks on Object Detection [9.539021752700823]
AnywhereDoor is a multi-target backdoor attack for object detection. It allows adversaries to make objects disappear, fabricate new ones or mislabel them, either across all object classes or specific ones. It improves attack success rates by 26% compared to adaptations of existing methods for such flexible control.
arXiv Detail & Related papers (2025-03-09T09:24:24Z)
- Twin Trigger Generative Networks for Backdoor Attacks against Object Detection [14.578800906364414]
Object detectors, which are widely used in real-world applications, are vulnerable to backdoor attacks.
Most research on backdoor attacks has focused on image classification, with limited investigation into object detection.
We propose novel twin trigger generative networks to generate invisible triggers for implanting backdoors into models during training, and visible triggers for steady activation during inference.
arXiv Detail & Related papers (2024-11-23T03:46:45Z)
- Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defense mainly focuses on the known attacks, but the adversarial robustness to the unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID)
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Model X-ray: Detecting Backdoored Models via Decision Boundary [62.675297418960355]
Backdoor attacks pose a significant security vulnerability for deep neural networks (DNNs)
We propose Model X-ray, a novel backdoor detection approach based on the analysis of illustrated two-dimensional (2D) decision boundaries.
Our approach includes two strategies focused on the decision areas dominated by clean samples and the concentration of label distribution.
arXiv Detail & Related papers (2024-02-27T12:42:07Z)
- Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement [68.31147013783387]
We observe that the attention mechanism is vulnerable to patch-based adversarial attacks.
In this paper, we propose a Robust Attention Mechanism (RAM) to improve the robustness of the semantic segmentation model.
arXiv Detail & Related papers (2024-01-03T13:58:35Z)
- Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions.
We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
- LEAT: Towards Robust Deepfake Disruption in Real-World Scenarios via Latent Ensemble Attack [11.764601181046496]
Deepfakes, malicious visual contents created by generative models, pose an increasingly harmful threat to society.
To proactively mitigate deepfake damages, recent studies have employed adversarial perturbation to disrupt deepfake model outputs.
We propose a simple yet effective disruption method called Latent Ensemble ATtack (LEAT), which attacks the independent latent encoding process.
arXiv Detail & Related papers (2023-07-04T07:00:37Z)
- Double Targeted Universal Adversarial Perturbations [83.60161052867534]
We introduce double targeted universal adversarial perturbations (DT-UAPs) to bridge the gap between instance-discriminative, image-dependent perturbations and generic universal perturbations.
We show the effectiveness of the proposed DTA algorithm on a wide range of datasets and also demonstrate its potential as a physical attack.
arXiv Detail & Related papers (2020-10-07T09:08:51Z)
- GAP++: Learning to generate target-conditioned adversarial examples [28.894143619182426]
Adversarial examples are perturbed inputs that pose a serious threat to machine learning models.
We propose a more general-purpose framework which infers target-conditioned perturbations dependent on both input image and target label.
Our method achieves superior performance with single target attack models and obtains high fooling rates with small perturbation norms.
arXiv Detail & Related papers (2020-06-09T07:49:49Z)