CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks
- URL: http://arxiv.org/abs/2407.10179v3
- Date: Wed, 2 Oct 2024 20:16:06 GMT
- Title: CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks
- Authors: Hao Fang, Jiawei Kong, Bin Chen, Tao Dai, Hao Wu, Shu-Tao Xia
- Abstract summary: Transferable targeted adversarial attacks aim to mislead models into outputting adversary-specified predictions in black-box scenarios.
Recent \textit{single-target} generative attacks train a generator for each target class to generate highly transferable perturbations.
We design a \textbf{C}LIP-guided \textbf{G}enerative \textbf{N}etwork with \textbf{C}ross-attention modules (CGNC) to enhance multi-target attacks.
- Score: 52.29186466633699
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transferable targeted adversarial attacks aim to mislead models into outputting adversary-specified predictions in black-box scenarios. Recent studies have introduced \textit{single-target} generative attacks that train a generator for each target class to generate highly transferable perturbations, resulting in substantial computational overhead when handling multiple classes. \textit{Multi-target} attacks address this by training only one class-conditional generator for multiple classes. However, the generator simply uses class labels as conditions, failing to leverage the rich semantic information of the target class. To this end, we design a \textbf{C}LIP-guided \textbf{G}enerative \textbf{N}etwork with \textbf{C}ross-attention modules (CGNC) to enhance multi-target attacks by incorporating textual knowledge of CLIP into the generator. Extensive experiments demonstrate that CGNC yields significant improvements over previous multi-target generative attacks, e.g., a 21.46\% improvement in success rate from ResNet-152 to DenseNet-121. Moreover, we propose a masked fine-tuning mechanism to further strengthen our method in attacking a single class, which surpasses existing single-target methods.
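As a rough illustration of the core idea, the hypothetical PyTorch sketch below conditions a perturbation generator on CLIP text features through cross-attention; the module layout, dimensions, and the toy encoder/decoder are illustrative assumptions, not CGNC's actual architecture.

```python
# Hypothetical sketch: a perturbation generator conditioned on CLIP text
# features via cross-attention. Dimensions, module layout, and the toy
# encoder/decoder are illustrative, not CGNC's actual architecture.
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Generator feature-map queries attend over CLIP text tokens (keys/values)."""
    def __init__(self, feat_dim: int, text_dim: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, n_heads, kdim=text_dim,
                                          vdim=text_dim, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)

    def forward(self, feats: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) generator features; text: (B, T, text_dim)
        b, c, h, w = feats.shape
        q = feats.flatten(2).transpose(1, 2)       # (B, H*W, C)
        out, _ = self.attn(q, text, text)          # inject textual knowledge
        q = self.norm(q + out)                     # residual + norm
        return q.transpose(1, 2).reshape(b, c, h, w)

class ConditionalGenerator(nn.Module):
    """Toy class-conditional perturbation generator with an L-inf bound."""
    def __init__(self, text_dim: int = 512, eps: float = 16 / 255):
        super().__init__()
        self.eps = eps
        self.encode = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
        self.xattn = CrossAttentionBlock(64, text_dim)
        self.decode = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, x: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        h = self.xattn(self.encode(x), text)
        delta = torch.tanh(self.decode(h)) * self.eps  # bounded perturbation
        return (x + delta).clamp(0, 1)

# Usage (pooled CLIP embedding treated as a single token, shape (B, 1, 512)):
#   text = clip_text_features.unsqueeze(1)
#   adv = ConditionalGenerator()(images, text)
```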
Related papers
- MALT Powers Up Adversarial Attacks [11.713127192480101]
We present a novel adversarial targeting method, \textit{MALT} (Mesoscopic Almost Linearity Targeting), based on medium-scale almost-linearity assumptions.
Our attack outperforms the current state-of-the-art AutoAttack on the standard benchmark datasets CIFAR-100 and ImageNet; a rough reconstruction of the targeting idea follows.
arXiv Detail & Related papers (2024-07-02T13:02:12Z)
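One plausible reading of the almost-linearity idea: under a locally linear model, the distance to class k's decision boundary is roughly the logit gap divided by the norm of the gradient of that gap, so candidate targets can be re-ranked by this estimate instead of by raw logits. The sketch below is an illustrative reconstruction, not the paper's code.

```python
# Hedged sketch: re-rank candidate target classes by an almost-linear
# estimate of the distance to each decision boundary, not by raw logits.
import torch

def rank_targets_almost_linear(model, x, n_candidates=5):
    # x: a single image (C, H, W)
    x = x.clone().requires_grad_(True)
    logits = model(x.unsqueeze(0)).squeeze(0)
    top = logits.argmax()
    # candidate targets: highest non-predicted logits
    cand = logits.topk(n_candidates + 1).indices
    cand = [k for k in cand.tolist() if k != top.item()][:n_candidates]
    scores = {}
    for k in cand:
        gap = logits[top] - logits[k]
        grad = torch.autograd.grad(gap, x, retain_graph=True)[0]
        # linearized distance to the boundary between `top` and class k
        scores[k] = (gap / grad.norm().clamp_min(1e-12)).item()
    return sorted(scores, key=scores.get)  # nearest boundary first
```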
- Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks attract significant interest in black-box applications.
Existing works essentially optimize a single-level objective directly w.r.t. the surrogate model.
We propose a bilevel optimization paradigm that explicitly reformulates the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker; a minimal sketch follows.
arXiv Detail & Related papers (2024-06-04T07:45:27Z)
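One way such a UL/LL split could look, sketched under simplifying assumptions: the lower level crafts a perturbation against the surrogate, while the upper level scores transfer on a pseudo-victim and nudges the initialization. The sign-gradient proxy for the hypergradient is an assumption, not the paper's bilevel solver.

```python
# Hedged sketch of a bilevel transfer attack: LL attacks the surrogate,
# UL evaluates on a held-out "pseudo-victim" and updates the init.
import torch
import torch.nn.functional as F

def bilevel_attack(surrogate, pseudo_victim, x, y, eps=8/255,
                   ll_steps=10, ul_steps=5, alpha=2/255, beta=1/255):
    init = torch.zeros_like(x)                    # UL variable: initialization
    for _ in range(ul_steps):
        delta = init.clone().detach().requires_grad_(True)
        for _ in range(ll_steps):                 # LL: attack the surrogate
            loss = F.cross_entropy(surrogate(x + delta), y)
            g, = torch.autograd.grad(loss, delta)
            delta = (delta + alpha * g.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_(True)
        # UL: score transfer on the pseudo-victim; the gradient w.r.t. the
        # final delta is used as a cheap proxy for the true hypergradient.
        ul_loss = F.cross_entropy(pseudo_victim(x + delta), y)
        g_init, = torch.autograd.grad(ul_loss, delta)
        init = (init + beta * g_init.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1).detach()
```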
- Learning diverse attacks on large language models for robust red-teaming and safety tuning [126.32539952157083]
Red-teaming, or identifying prompts that elicit harmful responses, is a critical step in ensuring the safe deployment of large language models.
We show that even with explicit regularization to favor novelty and diversity, existing approaches suffer from mode collapse or fail to generate effective attacks.
We propose to use GFlowNet fine-tuning, followed by a secondary smoothing phase, to train the attacker model to generate diverse and effective attack prompts.
arXiv Detail & Related papers (2024-05-28T19:16:17Z)
- Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks [62.34019142949628]
Typographic Attacks, which involve pasting misleading text onto an image, have been shown to harm the performance of Vision-Language Models like CLIP.
We introduce two novel and more effective \textit{Self-Generated} attacks, which prompt the LVLM to generate an attack against itself.
Using our benchmark, we uncover that Self-Generated attacks pose a significant threat, reducing LVLMs' classification performance by up to 33% (a minimal reproduction sketch follows).
arXiv Detail & Related papers (2024-02-01T14:41:20Z)
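A basic typographic attack is easy to reproduce. The sketch below pastes a fixed misleading label onto an image and checks CLIP's zero-shot prediction; the paper instead has the LVLM generate the attack text. It assumes the `open_clip` package is available.

```python
# Hedged sketch of a basic typographic attack: paste a misleading class
# name onto the image and compare CLIP's zero-shot prediction before/after.
import torch
from PIL import Image, ImageDraw
import open_clip  # assumption: open_clip is installed

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="openai")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

def typographic_attack(img: Image.Image, attack_text: str) -> Image.Image:
    img = img.copy()
    ImageDraw.Draw(img).text((10, 10), attack_text, fill="white")
    return img

@torch.no_grad()
def zero_shot_predict(img: Image.Image, class_names: list[str]) -> str:
    image = preprocess(img).unsqueeze(0)
    text = tokenizer([f"a photo of a {c}" for c in class_names])
    img_f = model.encode_image(image)
    txt_f = model.encode_text(text)
    sims = (img_f / img_f.norm()) @ (txt_f / txt_f.norm(dim=-1, keepdim=True)).T
    return class_names[sims.argmax().item()]

# e.g. zero_shot_predict(typographic_attack(img, "cat"), ["dog", "cat"])
```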
- Enhancing targeted transferability via feature space fine-tuning [21.131915084053894]
Adversarial examples (AEs) have been extensively studied due to their potential for privacy protection and for inspiring robust neural networks.
We propose fine-tuning an AE crafted by existing simple iterative attacks to make it transferable across unknown models; a hedged sketch of the idea follows.
arXiv Detail & Related papers (2024-01-05T09:46:42Z)
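One way feature-space fine-tuning could look: starting from an AE produced by a baseline targeted attack, push its mid-layer features toward those of a target-class reference image. The layer choice and loss below are illustrative assumptions, not the paper's recipe.

```python
# Hedged sketch of feature-space fine-tuning for an existing AE.
import torch

def finetune_in_feature_space(model, feat_layer, x_clean, x_adv, x_target_ref,
                              eps=16/255, steps=10, alpha=1/255):
    feats = {}
    hook = feat_layer.register_forward_hook(
        lambda mod, inp, out: feats.__setitem__("out", out))
    with torch.no_grad():
        model(x_target_ref)                  # reference target-class features
        target_feat = feats["out"].detach()
    delta = (x_adv - x_clean).detach().requires_grad_(True)
    for _ in range(steps):
        model(x_clean + delta)               # hook refills feats["out"]
        loss = (feats["out"] - target_feat).pow(2).mean()
        g, = torch.autograd.grad(loss, delta)
        delta = (delta - alpha * g.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    hook.remove()
    return (x_clean + delta).clamp(0, 1).detach()
```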
- Transferable Attack for Semantic Segmentation [59.17710830038692]
We study transferable adversarial attacks on semantic segmentation, and observe that adversarial examples generated from a source model fail to attack the target models.
We propose an ensemble attack for semantic segmentation to achieve more effective attacks with higher transferability; a minimal sketch follows.
arXiv Detail & Related papers (2023-07-31T11:05:55Z)
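A minimal sketch of the ensemble idea, assuming all source models share the input space and label set: average per-pixel cross-entropy over the ensemble and ascend the shared gradient. The paper's actual ensemble strategy may differ.

```python
# Hedged sketch of an ensemble attack on segmentation models.
import torch
import torch.nn.functional as F

def ensemble_seg_attack(models, x, seg_labels, eps=8/255, steps=20, alpha=2/255):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        # each model outputs per-pixel class logits of shape (B, C, H, W);
        # seg_labels has shape (B, H, W)
        loss = sum(F.cross_entropy(m(x + delta), seg_labels) for m in models)
        loss = loss / len(models)
        g, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * g.sign()).clamp(-eps, eps)  # untargeted ascent
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()
```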
- GAMA: Generative Adversarial Multi-Object Scene Attacks [48.33120361498787]
This paper presents the first approach of using generative models for adversarial attacks on multi-object scenes.
We call this attack approach Generative Adversarial Multi-object scene Attacks (GAMA).
arXiv Detail & Related papers (2022-09-20T06:40:54Z)
- On Generating Transferable Targeted Perturbations [102.3506210331038]
We propose a new generative approach for highly transferable targeted perturbations.
Our approach matches the perturbed image distribution with that of the target class, leading to high targeted transferability rates; a sketch of one possible matching loss follows.
arXiv Detail & Related papers (2021-03-26T17:55:28Z)
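One way to realize distribution matching, sketched under the assumption that the surrogate's output distributions are a workable proxy: train the generator with a KL term between its perturbed images' predictions and those of real target-class images.

```python
# Hedged sketch of a distribution-matching generator loss.
import torch
import torch.nn.functional as F

def distribution_matching_loss(surrogate, generator, x_source, x_target):
    # x_source: images to perturb; x_target: real images of the target class
    # (batches assumed to be the same size)
    adv = generator(x_source)                     # bounded perturbed images
    p_adv = F.log_softmax(surrogate(adv), dim=1)
    with torch.no_grad():
        p_tgt = F.softmax(surrogate(x_target), dim=1)
    # KL(target-class distribution || perturbed distribution), batch-averaged
    return F.kl_div(p_adv, p_tgt, reduction="batchmean")
```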
- AI-GAN: Attack-Inspired Generation of Adversarial Examples [14.709927651682783]
Deep neural networks (DNNs) are vulnerable to adversarial examples crafted by adding imperceptible perturbations to inputs.
Various attacks and strategies have recently been proposed, but how to generate perceptually realistic adversarial examples efficiently remains unsolved.
This paper proposes a novel framework called Attack-Inspired GAN (AI-GAN), where a generator, a discriminator, and an attacker are trained jointly.
arXiv Detail & Related papers (2020-02-06T10:57:41Z)
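A hedged sketch of one joint training step in an AI-GAN-style setup; the paper's separate attacker component is collapsed here into a targeted attack loss on a victim model, and all loss weights are illustrative assumptions.

```python
# Hedged sketch: joint step for a perturbation generator, a clean-vs-adversarial
# discriminator, and a targeted attack loss on a victim classifier.
import torch
import torch.nn.functional as F

def train_step(gen, disc, victim, opt_g, opt_d, x, target, eps=8/255):
    adv = (x + gen(x, target).clamp(-eps, eps)).clamp(0, 1)

    # Discriminator: clean images are "real", adversarial ones are "fake".
    opt_d.zero_grad()
    d_real, d_fake = disc(x), disc(adv.detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_loss.backward()
    opt_d.step()

    # Generator: look "real" to the discriminator and hit the target label.
    opt_g.zero_grad()
    d_adv = disc(adv)
    g_loss = (F.binary_cross_entropy_with_logits(d_adv, torch.ones_like(d_adv))
              + 10.0 * F.cross_entropy(victim(adv), target))  # weight assumed
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```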