Breaking Dataset Boundaries: Class-Agnostic Targeted Adversarial Attacks
- URL: http://arxiv.org/abs/2505.20782v1
- Date: Tue, 27 May 2025 06:39:29 GMT
- Title: Breaking Dataset Boundaries: Class-Agnostic Targeted Adversarial Attacks
- Authors: Taïga Gonçalves, Tomo Miyazaki, Shinichiro Omachi
- Abstract summary: Cross-Domain Multi-Targeted Attack (CD-MTA) is a method for generating adversarial examples that mislead image classifiers toward any target class, including those not seen during training. It replaces class-level supervision with an image-based conditional input and introduces class-agnostic losses that align the perturbed and target images in the feature space.
- Score: 3.2771631221674333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Cross-Domain Multi-Targeted Attack (CD-MTA), a method for generating adversarial examples that mislead image classifiers toward any target class, including those not seen during training. Traditional targeted attacks are limited to one class per model, requiring expensive retraining for each target. Multi-targeted attacks address this by introducing a perturbation generator with a conditional input to specify the target class. However, existing methods are constrained to classes observed during training and require access to the black-box model's training data--introducing a form of data leakage that undermines realistic evaluation in practical black-box scenarios. We identify overreliance on class embeddings as a key limitation, leading to overfitting and poor generalization to unseen classes. To address this, CD-MTA replaces class-level supervision with an image-based conditional input and introduces class-agnostic losses that align the perturbed and target images in the feature space. This design removes dependence on class semantics, thereby enabling generalization to unseen classes across datasets. Experiments on ImageNet and seven other datasets show that CD-MTA outperforms prior multi-targeted attacks in both standard and cross-domain settings--without accessing the black-box model's training data.
Related papers
- Benchmarking Unified Face Attack Detection via Hierarchical Prompt Tuning [58.16354555208417]
  Presentation Attack Detection and Face Forgery Detection are designed to protect face data from physical media-based Presentation Attacks and digital editing-based DeepFakes, respectively.
  Separate training of these two models makes them vulnerable to unknown attacks and burdens deployment environments.
  We present a novel Visual-Language Model-based Hierarchical Prompt Tuning Framework (HiPTune) that adaptively explores multiple classification criteria from different semantic spaces.
  arXiv Detail & Related papers (2025-05-19T16:35:45Z)
- Learning to Learn Transferable Generative Attack for Person Re-Identification [17.26567195924685]
  Existing attacks merely consider cross-dataset and cross-model transferability, ignoring the cross-test capability to perturb models trained in different domains.
  To powerfully examine the robustness of real-world re-id models, the Meta Transferable Generative Attack (MTGA) method is proposed.
  Our MTGA outperforms the SOTA methods by 21.5% and 11.3% on mean mAP drop rate, respectively.
  arXiv Detail & Related papers (2024-09-06T11:57:17Z)
- CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks [52.29186466633699]
  Transferable targeted adversarial attacks aim to mislead models into outputting adversary-specified predictions in black-box scenarios.
  Single-target generative attacks train a generator for each target class to generate highly transferable perturbations.
  CLIP-guided Generative Network with Cross-attention modules (CGNC) to enhance multi-target attacks.
  arXiv Detail & Related papers (2024-07-14T12:30:32Z)
- Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model [14.834360664780709]
  Model inversion attacks (MIAs) aim to recover private data from inaccessible training sets of deep learning models.
  This paper develops a novel MIA method, leveraging a conditional diffusion model (CDM) to recover representative samples under the target label.
  Experimental results show that this method can generate similar and accurate samples to the target label, outperforming generators of previous approaches.
  arXiv Detail & Related papers (2023-07-17T12:14:24Z)
- Pseudo Label-Guided Model Inversion Attack via Conditional Generative Adversarial Network [102.21368201494909]
  Model inversion (MI) attacks have raised increasing concerns about privacy.
  Recent MI attacks leverage a generative adversarial network (GAN) as an image prior to narrow the search space.
  We propose the Pseudo Label-Guided MI (PLG-MI) attack via a conditional GAN (cGAN).
  arXiv Detail & Related papers (2023-02-20T07:29:34Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
  Black-box adversarial attacks present a realistic threat to action recognition systems.
  We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
  Our method achieves 8% and 12% higher deception rates compared to state-of-the-art query-based and transfer-based attacks.
  arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Label-Only Model Inversion Attacks via Boundary Repulsion [12.374249336222906]
  We introduce an algorithm to invert private training data using only the target model's predicted labels.
  Using the example of face recognition, we show that the images reconstructed by BREP-MI successfully reproduce the semantics of the private training data.
  arXiv Detail & Related papers (2022-03-03T18:57:57Z)
- Can Targeted Adversarial Examples Transfer When the Source and Target Models Have No Label Space Overlap? [36.96777303738315]
  We design blackbox transfer-based targeted adversarial attacks for an environment where the attacker's source model and the target blackbox model may have disjoint label spaces and training datasets.
  Our methodology begins with the construction of a class correspondence matrix between the whitebox and blackbox label sets.
  We show that our transfer attacks serve as powerful adversarial priors when integrated with query-based methods.
  arXiv Detail & Related papers (2021-03-17T21:21:44Z)
- Towards Class-Oriented Poisoning Attacks Against Neural Networks [1.14219428942199]
  Poisoning attacks on machine learning systems compromise the model performance by deliberately injecting malicious samples into the training dataset.
  We propose a class-oriented poisoning attack that is capable of forcing the corrupted model to predict in two specific ways.
  To maximize the adversarial effect as well as reduce the computational complexity of poisoned data generation, we propose a gradient-based framework.
  arXiv Detail & Related papers (2020-07-31T19:27:37Z)
- Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification [64.37745443119942]
  This paper jointly enforces visual and temporal consistency in the combination of a local one-hot classification and a global multi-class classification.
  Experimental results on three large-scale ReID datasets demonstrate the superiority of the proposed method in both unsupervised and unsupervised domain adaptive ReID tasks.
  arXiv Detail & Related papers (2020-07-21T14:31:27Z)
- Perturbing Across the Feature Hierarchy to Improve Standard and Strict Blackbox Attack Transferability [100.91186458516941]
  We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.
  We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance.
  We analyze why the proposed methods outperform existing attack strategies and show an extension of the method to the case when limited queries to the blackbox model are allowed.
  arXiv Detail & Related papers (2020-04-29T16:00:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.