Inference Time Evidences of Adversarial Attacks for Forensic on
Transformers
- URL: http://arxiv.org/abs/2301.13356v1
- Date: Tue, 31 Jan 2023 01:17:03 GMT
- Title: Inference Time Evidences of Adversarial Attacks for Forensic on
Transformers
- Authors: Hugo Lemarchant, Liangzi Li, Yiming Qian, Yuta Nakashima, Hajime
Nagahara
- Abstract summary: Vision Transformers (ViTs) are becoming a popular paradigm for vision tasks as they achieve state-of-the-art performance on image classification.
This paper presents our first attempt toward detecting adversarial attacks during inference time using the network's input and outputs as well as latent features.
- Score: 27.88746727644074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision Transformers (ViTs) are becoming a very popular paradigm for vision
tasks as they achieve state-of-the-art performance on image classification.
However, although early works implied that this network structure had increased
robustness against adversarial attacks, some works argue ViTs are still
vulnerable. This paper presents our first attempt toward detecting adversarial
attacks during inference time using the network's input and outputs as well as
latent features. We design four quantifications (or derivatives) of input,
output, and latent vectors of ViT-based models that provide a signature of the
inference, which could be beneficial for attack detection, and empirically
study their behavior on clean and adversarial samples. The results
demonstrate that the quantifications from input (images) and output (posterior
probabilities) are promising for distinguishing clean and adversarial samples,
while latent vectors offer less discriminative power, though they give some
insights on how adversarial perturbations work.
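The four quantifications themselves are not spelled out in this summary, so the following is only a minimal sketch of the general idea: compute inexpensive per-sample statistics of the input image and the output posterior at inference time and use them as a detection signature. The two statistics shown (posterior entropy and a crude high-frequency proxy on the input) are illustrative stand-ins, not the paper's actual derivatives.

```python
# Hypothetical inference-time "signatures" computed from a model's input and
# output, in the spirit of the quantifications described above. The statistics
# below are illustrative stand-ins, not the paper's exact quantifications.
import torch
import torch.nn.functional as F

def inference_signature(model, x):
    """Return per-sample statistics of the input image and output posterior."""
    with torch.no_grad():
        logits = model(x)                      # (B, num_classes)
        probs = F.softmax(logits, dim=-1)

    # Output-side signature: entropy of the posterior probabilities.
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

    # Input-side signature: energy of pixel-wise differences, a crude proxy
    # for high-frequency content that adversarial noise tends to increase.
    dx = x[..., :, 1:] - x[..., :, :-1]
    dy = x[..., 1:, :] - x[..., :-1, :]
    hf_energy = dx.pow(2).flatten(1).mean(dim=1) + dy.pow(2).flatten(1).mean(dim=1)

    return {"posterior_entropy": entropy, "input_hf_energy": hf_energy}

# A simple detector could then threshold these statistics, with thresholds
# calibrated on a held-out set of clean samples.
```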
Related papers
- ViTGuard: Attention-aware Detection against Adversarial Examples for Vision Transformer [8.71614629110101]
We propose ViTGuard as a general detection method for defending Vision Transformer (ViT) models against adversarial attacks.
ViTGuard uses a Masked Autoencoder (MAE) model to recover randomly masked patches from the unmasked regions.
Threshold-based detectors leverage distinctive ViT features, including attention maps and classification token (CLS) representations, to distinguish between normal and adversarial samples.
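As a rough illustration of this pipeline (not ViTGuard's actual API), a threshold detector could compare the ViT's classification-token representation of an input with that of its MAE reconstruction; `mae_reconstruct` and `vit_cls_token` below are hypothetical helpers.

```python
# Hedged sketch of a ViTGuard-style threshold detector: reconstruct masked
# patches with an MAE, then flag inputs whose CLS-token representation drifts
# too far from that of the reconstruction. `mae_reconstruct` and
# `vit_cls_token` are hypothetical helpers, not ViTGuard's actual interface.
import torch

def is_adversarial(x, vit_cls_token, mae_reconstruct, threshold):
    with torch.no_grad():
        x_rec = mae_reconstruct(x, mask_ratio=0.5)   # recover randomly masked patches
        cls_orig = vit_cls_token(x)                  # (B, D) CLS token of original image
        cls_rec = vit_cls_token(x_rec)               # (B, D) CLS token of reconstruction
    # Adversarial perturbations tend not to survive reconstruction, so a large
    # distance between the two representations is treated as suspicious.
    dist = torch.norm(cls_orig - cls_rec, dim=-1)
    return dist > threshold
```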
arXiv Detail & Related papers (2024-09-20T18:11:56Z)
- How adversarial attacks can disrupt seemingly stable accurate classifiers [76.95145661711514]
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data.
Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data.
We introduce a simple, generic, and generalisable framework for which key behaviours observed in practical systems arise with high probability.
arXiv Detail & Related papers (2023-09-07T12:02:00Z)
- Uncertainty-based Detection of Adversarial Attacks in Semantic Segmentation [16.109860499330562]
We introduce an uncertainty-based approach for the detection of adversarial attacks in semantic segmentation.
We demonstrate the ability of our approach to detect perturbed images across multiple types of adversarial attacks.
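The summary does not state which uncertainty measure is used; a common stand-in, shown here only as an illustration, is the per-pixel entropy of the softmax output, aggregated into a single score per image.

```python
# Illustrative (not necessarily the paper's exact measure): per-pixel entropy
# of the softmax output as an uncertainty map for a segmentation network,
# aggregated into a single score that a threshold detector could use.
import torch
import torch.nn.functional as F

def segmentation_uncertainty_score(seg_model, x):
    with torch.no_grad():
        logits = seg_model(x)                    # (B, C, H, W)
        probs = F.softmax(logits, dim=1)
    pixel_entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)  # (B, H, W)
    return pixel_entropy.flatten(1).mean(dim=1)  # mean uncertainty per image
```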
arXiv Detail & Related papers (2023-05-22T08:36:35Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework for defending against adversarial samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariance perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Adversarial Vulnerability of Temporal Feature Networks for Object Detection [5.525433572437716]
We study whether temporal feature networks for object detection are vulnerable to universal adversarial attacks.
We evaluate attacks of two types: imperceptible noise for the whole image and locally-bound adversarial patch.
Our experiments on the KITTI and nuScenes datasets demonstrate that a model robustified via K-PGD is able to withstand the studied attacks.
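"K-PGD" here presumably refers to adversarial training with a K-step projected gradient descent attack; the loop below is a standard PGD sketch with illustrative hyperparameters, not the paper's exact configuration.

```python
# Standard K-step PGD attack (the building block behind K-PGD adversarial
# training); epsilon, step size, and step count are illustrative values only.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Random start inside the eps-ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()        # ascend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
            x_adv = x_adv.clamp(0, 1)                  # keep valid pixel range
    return x_adv.detach()
```

Adversarial training would then update the model on these perturbed samples with their original labels.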
arXiv Detail & Related papers (2022-08-23T07:08:54Z)
- Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification [4.843654097048771]
Vision Transformers (ViT) are competing to replace Convolutional Neural Networks (CNN) for various computer vision tasks in medical imaging.
Recent works have shown that ViTs are also susceptible to adversarial attacks and suffer significant performance degradation when attacked.
We propose a novel self-ensembling method to enhance the robustness of ViT in the presence of adversarial attacks.
arXiv Detail & Related papers (2022-08-04T19:02:24Z)
- Towards Transferable Adversarial Attacks on Vision Transformers [110.55845478440807]
Vision transformers (ViTs) have demonstrated impressive performance on a series of computer vision tasks, yet they still suffer from adversarial examples.
We introduce a dual attack framework, which contains a Pay No Attention (PNA) attack and a PatchOut attack, to improve the transferability of adversarial samples across different ViTs.
arXiv Detail & Related papers (2021-09-09T11:28:25Z)
- Exploring Robustness of Unsupervised Domain Adaptation in Semantic Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision tasks (e.g., rotation and jigsaw) benefit image tasks such as classification and recognition, they fail to provide the critical supervision signals needed to learn discriminative representations for segmentation.
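As a hedged sketch of "maximizing agreement in the output space with a contrastive loss", one NT-Xent-style formulation treats each clean image and its adversarial example as a positive pair and other images in the batch as negatives; the paper's exact loss may differ.

```python
# Hedged sketch of an output-space contrastive "agreement" loss between clean
# images and their adversarial counterparts (an NT-Xent-style formulation;
# not necessarily the loss used by ASSUDA).
import torch
import torch.nn.functional as F

def agreement_contrastive_loss(out_clean, out_adv, temperature=0.1):
    """out_clean, out_adv: output-space features of paired images, shape (B, ...)."""
    z1 = F.normalize(out_clean.flatten(1), dim=1)
    z2 = F.normalize(out_adv.flatten(1), dim=1)
    logits = z1 @ z2.t() / temperature          # (B, B) pairwise similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    # Each clean image should be most similar to its own adversarial example.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```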
arXiv Detail & Related papers (2021-05-23T01:50:44Z)
- Detecting Adversarial Examples by Input Transformations, Defense Perturbations, and Voting [71.57324258813674]
Convolutional neural networks (CNNs) have been shown to reach super-human performance in visual recognition tasks.
However, CNNs can easily be fooled by adversarial examples, i.e., maliciously crafted images that force the networks to predict an incorrect output.
This paper extensively explores the detection of adversarial examples via image transformations and proposes a novel methodology.
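A minimal sketch of the transformation-and-voting idea follows; the specific transformations, defense perturbations, and voting rule used in the paper are not given here, so the choices below are illustrative.

```python
# Illustrative detection by input transformations and voting: apply several
# mild transformations, compare the resulting predictions with the prediction
# on the original image, and flag the input when too many of them disagree.
import torch

def transformation_voting_detector(model, x, transforms, max_disagreements=1):
    """x: (B, C, H, W); transforms: list of callables mapping images to images."""
    with torch.no_grad():
        base_pred = model(x).argmax(dim=-1)
        disagreements = torch.zeros_like(base_pred)
        for t in transforms:
            pred_t = model(t(x)).argmax(dim=-1)
            disagreements += (pred_t != base_pred).long()
    # Adversarial perturbations are often brittle: small transformations change
    # the prediction, while predictions on clean images stay stable.
    return disagreements > max_disagreements
```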
arXiv Detail & Related papers (2021-01-27T14:50:41Z)
- Learning to Separate Clusters of Adversarial Representations for Robust Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by the recently introduced notion of non-robust features.
In this paper, we consider non-robust features as a common property of adversarial examples, and we deduce that it is possible to find a cluster in representation space corresponding to this property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster, and to leverage this distribution for a likelihood-based adversarial detector.
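A heavily simplified sketch of the likelihood idea, assuming a single Gaussian fitted to adversarial representations rather than the paper's exact cluster model:

```python
# Simplified sketch: fit a Gaussian to the latent representations of known
# adversarial examples and score new inputs by their log-likelihood under that
# cluster. A single Gaussian is an illustrative simplification.
import torch

def fit_gaussian(adv_features):
    """adv_features: (N, D) representations of adversarial examples."""
    mean = adv_features.mean(dim=0)
    cov = torch.cov(adv_features.t()) + 1e-4 * torch.eye(adv_features.size(1))
    return torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)

def adversarial_log_likelihood(gaussian, features):
    """Higher values mean the representation looks more 'adversarial'."""
    return gaussian.log_prob(features)   # (B,)
```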
arXiv Detail & Related papers (2020-12-07T07:21:18Z)