ViTGuard: Attention-aware Detection against Adversarial Examples for Vision Transformer
- URL: http://arxiv.org/abs/2409.13828v1
- Date: Fri, 20 Sep 2024 18:11:56 GMT
- Title: ViTGuard: Attention-aware Detection against Adversarial Examples for Vision Transformer
- Authors: Shihua Sun, Kenechukwu Nwodo, Shridatt Sugrim, Angelos Stavrou, Haining Wang
- Abstract summary: We propose ViTGuard as a general detection method for defending Vision Transformer (ViT) models against adversarial attacks.
ViTGuard uses a Masked Autoencoder (MAE) model to recover randomly masked patches from the unmasked regions.
Threshold-based detectors leverage distinctive ViT features, including attention maps and classification (CLS) token representations, to distinguish between normal and adversarial samples.
- Score: 8.71614629110101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of transformers for vision tasks has challenged the traditional dominant role of convolutional neural networks (CNNs) in computer vision (CV). For image classification tasks, Vision Transformer (ViT) effectively establishes spatial relationships between patches within images, directing attention to important areas for accurate predictions. However, similar to CNNs, ViTs are vulnerable to adversarial attacks, which mislead the image classifier into making incorrect decisions on images with carefully designed perturbations. Moreover, adversarial patch attacks, which introduce arbitrary perturbations within a small area, pose a more serious threat to ViTs. Even worse, traditional detection methods, originally designed for CNN models, are impractical or suffer significant performance degradation when applied to ViTs, and they generally overlook patch attacks. In this paper, we propose ViTGuard as a general detection method for defending ViT models against adversarial attacks, including typical attacks where perturbations spread over the entire input and patch attacks. ViTGuard uses a Masked Autoencoder (MAE) model to recover randomly masked patches from the unmasked regions, providing a flexible image reconstruction strategy. Then, threshold-based detectors leverage distinctive ViT features, including attention maps and classification (CLS) token representations, to distinguish between normal and adversarial samples. The MAE model does not involve any adversarial samples during training, ensuring the effectiveness of our detectors against unseen attacks. ViTGuard is compared with seven existing detection methods under nine attacks across three datasets. The evaluation results show the superiority of ViTGuard over existing detectors. Finally, considering potential detection evasion, we further demonstrate ViTGuard's robustness against adaptive attacks designed to evade detection.
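To make the detection recipe concrete, below is a minimal PyTorch sketch of the idea described in the abstract: reconstruct the suspect image with an MAE that masks random patches, extract the target ViT's attention map and CLS token for both the input and its reconstruction, and flag the input when either distance exceeds a threshold calibrated on clean data. The wrappers `mae` and `vit_features`, the specific distance metrics, and the default thresholds are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def detect_adversarial(image, mae, vit_features, tau_attn=0.3, tau_cls=0.3):
    """ViTGuard-style detection sketch (hypothetical interfaces).

    image        : (1, 3, H, W) input tensor under inspection
    mae          : callable, image -> reconstruction of the same shape
                   (assumed to mask random patches internally and recover
                   them from the unmasked regions)
    vit_features : callable, image -> (cls_token, attn_map) taken from the
                   target ViT classifier
    tau_attn,
    tau_cls      : thresholds, assumed calibrated on clean validation data
    """
    recon = mae(image)                    # flexible masked reconstruction
    cls_x, attn_x = vit_features(image)   # features of the suspect input
    cls_r, attn_r = vit_features(recon)   # features of its reconstruction

    # Clean inputs reconstruct to similar ViT features; adversarial
    # perturbations are disrupted by the masking, so the distances grow.
    d_attn = (attn_x - attn_r).abs().mean().item()   # L1 on attention maps
    d_cls = 1.0 - F.cosine_similarity(cls_x, cls_r, dim=-1).mean().item()

    return (d_attn > tau_attn) or (d_cls > tau_cls), d_attn, d_cls


if __name__ == "__main__":
    torch.manual_seed(0)
    # Dummy stand-ins so the sketch runs end to end: a near-identity "MAE"
    # and a random projection posing as the ViT feature extractor.
    proj = torch.randn(3 * 224 * 224, 16)
    mae = lambda x: x + 0.01 * torch.randn_like(x)
    vit_features = lambda x: (x.flatten(1) @ proj,          # fake CLS token
                              x.mean(dim=1, keepdim=True))  # fake attention map
    x = torch.rand(1, 3, 224, 224)
    print(detect_adversarial(x, mae, vit_features))
```

Because the abstract states that the MAE sees no adversarial samples during training, the detector needs only clean data, which is what lets it generalize to unseen attacks; in practice the thresholds would be set from the distance distributions observed on a clean validation set (e.g., at a fixed false-positive rate).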
Related papers
- Backdoor Attack Against Vision Transformers via Attention Gradient-Based Image Erosion [4.036142985883415]
Vision Transformers (ViTs) have outperformed traditional Convolutional Neural Networks (CNNs) across various computer vision tasks.
ViTs are vulnerable to backdoor attacks, where an adversary embeds a backdoor into the victim model.
We propose an Attention Gradient-based Erosion Backdoor (AGEB) targeted at ViTs.
arXiv Detail & Related papers (2024-10-30T04:06:12Z) - Downstream Transfer Attack: Adversarial Attacks on Downstream Models with Pre-trained Vision Transformers [95.22517830759193]
This paper studies the transferability of such an adversarial vulnerability from a pre-trained ViT model to downstream tasks.
We show that the proposed Downstream Transfer Attack (DTA) achieves an average attack success rate (ASR) exceeding 90%, surpassing existing methods by a large margin.
arXiv Detail & Related papers (2024-08-03T08:07:03Z) - Query-Efficient Hard-Label Black-Box Attack against Vision Transformers [9.086983253339069]
Vision transformers (ViTs) face security risks from adversarial attacks similar to those faced by deep convolutional neural networks (CNNs).
This article explores the vulnerability of ViTs against adversarial attacks under a black-box scenario.
We propose a novel query-efficient hard-label adversarial attack method called AdvViT.
arXiv Detail & Related papers (2024-06-29T10:09:12Z) - Transferable Adversarial Attacks on Vision Transformers with Token Gradient Regularization [32.908816911260615]
Vision transformers (ViTs) have been successfully deployed in a variety of computer vision tasks, but they are still vulnerable to adversarial samples.
Transfer-based attacks use a local model to generate adversarial samples and directly transfer them to attack a target black-box model.
We propose the Token Gradient Regularization (TGR) method to overcome the shortcomings of existing approaches.
arXiv Detail & Related papers (2023-03-28T06:23:17Z) - Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification [4.843654097048771]
Vision Transformers (ViTs) are competing to replace Convolutional Neural Networks (CNNs) for various computer vision tasks in medical imaging.
Recent works have shown that ViTs are also susceptible to adversarial attacks and suffer significant performance degradation under attack.
We propose a novel self-ensembling method to enhance the robustness of ViT in the presence of adversarial attacks.
arXiv Detail & Related papers (2022-08-04T19:02:24Z) - Defending Backdoor Attacks on Vision Transformer via Patch Processing [18.50522247164383]
Vision Transformers (ViTs) have a radically different architecture with significantly less inductive bias than Convolutional Neural Networks.
This paper investigates a representative causative attack, namely the backdoor attack.
We propose an effective method for ViTs to defend both patch-based and blending-based trigger backdoor attacks via patch processing.
arXiv Detail & Related papers (2022-06-24T17:29:47Z) - Towards Transferable Adversarial Attacks on Vision Transformers [110.55845478440807]
Vision transformers (ViTs) have demonstrated impressive performance on a series of computer vision tasks, yet they still suffer from adversarial examples.
We introduce a dual attack framework, which contains a Pay No Attention (PNA) attack and a PatchOut attack, to improve the transferability of adversarial samples across different ViTs.
arXiv Detail & Related papers (2021-09-09T11:28:25Z) - On Improving Adversarial Transferability of Vision Transformers [97.17154635766578]
Vision transformers (ViTs) process input images as sequences of patches via self-attention.
We study the adversarial feature space of ViT models and their transferability.
We introduce two novel strategies specific to the architecture of ViT models.
arXiv Detail & Related papers (2021-06-08T08:20:38Z) - Temporally-Transferable Perturbations: Efficient, One-Shot Adversarial Attacks for Online Visual Object Trackers [81.90113217334424]
We propose a framework to generate a single temporally transferable adversarial perturbation from the object template image only.
This perturbation can then be added to every search image at virtually no cost and still successfully fools the tracker.
arXiv Detail & Related papers (2020-12-30T15:05:53Z) - Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification [78.51092318750102]
This work proposes to defend ASV systems against adversarial attacks with a separate detection network.
A VGG-like binary classification detector is introduced and demonstrated to be effective at detecting adversarial samples.
arXiv Detail & Related papers (2020-06-11T04:31:56Z) - Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z)