Improving the Transferability of Adversarial Examples with Restructure Embedded Patches
- URL: http://arxiv.org/abs/2204.12680v1
- Date: Wed, 27 Apr 2022 03:22:55 GMT
- Title: Improving the Transferability of Adversarial Examples with Restructure Embedded Patches
- Authors: Huipeng Zhou, Yu-an Tan, Yajie Wang, Haoran Lyu, Shangbo Wu and Yuanzhang Li
- Abstract summary: We attack the unique self-attention mechanism in ViTs by restructuring the embedded patches of the input.
Our method generates adversarial examples on white-box ViTs with higher transferability and higher image quality.
- Score: 4.476012751070559
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision transformers (ViTs) have demonstrated impressive performance in various computer vision tasks. However, adversarial examples generated by ViTs are difficult to transfer to networks with different structures. Recent attack methods do not account for the specificity of the ViT architecture and its self-attention mechanism, which leads to poor transferability of the adversarial examples ViTs generate. We attack the unique self-attention mechanism in ViTs by restructuring the embedded patches of the input. The restructured embedded patches let the self-attention mechanism form more diverse patch connections and help ViTs keep their regions of interest on the object. We therefore propose an attack method against the unique self-attention mechanism in ViTs, called Self-Attention Patches Restructure (SAPR). Our method is simple to implement yet efficient, and it applies to any self-attention-based network and any gradient-based transfer attack. We evaluate attack transferability on black-box models with different structures. The results show that our method generates adversarial examples on white-box ViTs with higher transferability and higher image quality. Our research advances the development of black-box transfer attacks on ViTs and demonstrates the feasibility of using white-box ViTs to attack other black-box models.
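To make the mechanism concrete, the sketch below shows one way such an attack could be assembled in PyTorch: a forward pre-hook on a timm ViT randomly permutes the embedded patch tokens before the transformer blocks, and an iterative FGSM loop crafts the adversarial example through that randomized model. This is a minimal illustration, not the authors' released implementation; the permutation-based restructuring, the `vit_base_patch16_224` surrogate, and the epsilon/step-size values are all assumptions for the example.

```python
import torch
import timm  # assumption: a timm ViT stands in for the white-box surrogate

model = timm.create_model("vit_base_patch16_224", pretrained=True).eval()

def restructure_hook(module, inputs, shuffle_prob=0.5):
    """With probability shuffle_prob, randomly permute the embedded patch
    tokens (keeping the CLS token first) before self-attention sees them,
    so attention forms more diverse patch connections across iterations.
    The permutation-based restructuring is an illustrative guess at SAPR."""
    (tokens,) = inputs
    if torch.rand(()).item() > shuffle_prob:
        return inputs
    cls_tok, patches = tokens[:, :1], tokens[:, 1:]
    perm = torch.randperm(patches.size(1), device=tokens.device)
    return (torch.cat([cls_tok, patches[:, perm]], dim=1),)

handle = model.blocks.register_forward_pre_hook(restructure_hook)

def sapr_style_attack(x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Plain I-FGSM; the restructuring hook randomizes every forward pass.
    eps/alpha/steps are common defaults, not values from the paper."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x_adv.clamp(x - eps, x + eps).clamp(0, 1)
    return x_adv

# Toy call on random data, just to show the interface.
x, y = torch.rand(2, 3, 224, 224), torch.randint(0, 1000, (2,))
x_adv = sapr_style_attack(x, y)
handle.remove()  # restore the unmodified surrogate afterwards
```

Because the permutation is redrawn on every forward pass, the gradient at each step reflects a different set of patch connections, which is the intuition behind the claimed transferability gain.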
Related papers
- Attacking Transformers with Feature Diversity Adversarial Perturbation [19.597912600568026]
We present a label-free white-box attack approach for ViT-based models that exhibits strong transferability to various black box models.
Our inspiration comes from the feature collapse phenomenon in ViTs, where the critical attention mechanism overly depends on the low-frequency component of features.
arXiv Detail & Related papers (2024-03-10T00:55:58Z)
- Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification [4.843654097048771]
Vision Transformers (ViTs) are competing to replace Convolutional Neural Networks (CNNs) for various computer vision tasks in medical imaging.
Recent works have shown that ViTs are also susceptible to adversarial attacks and suffer significant performance degradation under attack.
We propose a novel self-ensembling method to enhance the robustness of ViT in the presence of adversarial attacks.
arXiv Detail & Related papers (2022-08-04T19:02:24Z)
- Self-Distilled Vision Transformer for Domain Generalization [58.76055100157651]
Vision transformers (ViTs) are challenging the supremacy of CNNs on standard benchmarks.
We propose a simple domain generalization (DG) approach for ViTs, coined self-distillation for ViTs.
We empirically demonstrate notable performance gains with different DG baselines and various ViT backbones on five challenging datasets.
arXiv Detail & Related papers (2022-07-25T17:57:05Z)
- Defending Backdoor Attacks on Vision Transformer via Patch Processing [18.50522247164383]
Vision Transformers (ViTs) have a radically different architecture with significantly less inductive bias than Convolutional Neural Networks.
This paper investigates backdoor attacks, a representative type of causative attack.
We propose an effective method for ViTs to defend both patch-based and blending-based trigger backdoor attacks via patch processing.
arXiv Detail & Related papers (2022-06-24T17:29:47Z)
- Deeper Insights into ViTs Robustness towards Common Corruptions [82.79764218627558]
We investigate how CNN-like architectural designs and CNN-based data augmentation strategies affect ViTs' robustness towards common corruptions.
We demonstrate that overlapping patch embedding and a convolutional Feed-Forward Network (FFN) improve robustness.
We also introduce a novel conditional method enabling input-varied augmentations from two angles.
arXiv Detail & Related papers (2022-04-26T08:22:34Z)
- Self-slimmed Vision Transformer [52.67243496139175]
Vision transformers (ViTs) have become popular architectures and have outperformed convolutional neural networks (CNNs) on various vision tasks.
We propose a generic self-slimmed learning approach for vanilla ViTs, namely SiT.
Specifically, we first design a novel Token Slimming Module (TSM), which can boost the inference efficiency of ViTs.
arXiv Detail & Related papers (2021-11-24T16:48:57Z)
- Towards Transferable Adversarial Attacks on Vision Transformers [110.55845478440807]
Vision transformers (ViTs) have demonstrated impressive performance on a series of computer vision tasks, yet they still suffer from adversarial examples.
We introduce a dual attack framework, consisting of a Pay No Attention (PNA) attack and a PatchOut attack, to improve the transferability of adversarial examples across different ViTs (a sketch of the PNA idea follows this list).
arXiv Detail & Related papers (2021-09-09T11:28:25Z)
- On Improving Adversarial Transferability of Vision Transformers [97.17154635766578]
Vision transformers (ViTs) process input images as sequences of patches via self-attention.
We study the adversarial feature space of ViT models and their transferability.
We introduce two novel strategies specific to the architecture of ViT models.
arXiv Detail & Related papers (2021-06-08T08:20:38Z)
- On the Adversarial Robustness of Visual Transformers [129.29523847765952]
This work provides the first and comprehensive study on the robustness of vision transformers (ViTs) against adversarial perturbations.
Tested in various white-box and transfer attack settings, we find that ViTs possess better adversarial robustness than convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-03-29T14:48:24Z)
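As referenced above, the Pay No Attention (PNA) idea can be illustrated with a small sketch: the attention map is detached during the forward pass, so backpropagation skips the gradient path through the attention weights, which the dual-framework paper reports improves transferability. (PatchOut, the second component, additionally perturbs only a random subset of patches per iteration and is omitted here.) The monkey-patched forward below assumes timm's classic Attention layout; the field names (`qkv`, `num_heads`, `scale`, `proj`) are assumptions to verify against your timm version, and this is not the authors' code.

```python
import torch
import timm

def pna_attention_forward(self, x):
    """Replacement for timm's ViT Attention.forward that detaches the
    attention map, so gradients 'pay no attention' to it (PNA sketch)."""
    B, N, C = x.shape
    qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
    q, k, v = qkv.permute(2, 0, 3, 1, 4).unbind(0)
    attn = ((q @ k.transpose(-2, -1)) * self.scale).softmax(dim=-1)
    # PNA: block the gradient path through the attention weights;
    # gradients flow back only through the value projection.
    out = (attn.detach() @ v).transpose(1, 2).reshape(B, N, C)
    return self.proj_drop(self.proj(out))

model = timm.create_model("vit_base_patch16_224", pretrained=True).eval()
for blk in model.blocks:
    blk.attn.forward = pna_attention_forward.__get__(blk.attn)
# Any gradient-based attack run on `model` now backpropagates PNA-style.
```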