Nowhere to Hide: A Lightweight Unsupervised Detector against Adversarial
Examples
- URL: http://arxiv.org/abs/2210.08579v1
- Date: Sun, 16 Oct 2022 16:29:47 GMT
- Title: Nowhere to Hide: A Lightweight Unsupervised Detector against Adversarial
Examples
- Authors: Hui Liu, Bo Zhao, Kehuan Zhang, Peng Liu
- Abstract summary: Adversarial examples are generated by adding slight but maliciously crafted perturbations to benign images.
In this paper, we propose an AutoEncoder-based Adversarial Examples (AEAE) detector.
We show empirically that the AEAE is unsupervised, computationally inexpensive, and effective against state-of-the-art attacks.
- Score: 14.332434280103667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep neural networks (DNNs) have shown impressive performance on
many perceptual tasks, they are vulnerable to adversarial examples that are
generated by adding slight but maliciously crafted perturbations to benign
images. Adversarial detection is an important technique for identifying
adversarial examples before they are entered into target DNNs. Previous studies
to detect adversarial examples either targeted specific attacks or required
expensive computation. How to design a lightweight unsupervised detector is still
a challenging problem. In this paper, we propose an AutoEncoder-based
Adversarial Examples (AEAE) detector that can guard DNN models by detecting
adversarial examples with low computation in an unsupervised manner. The AEAE
includes only a shallow autoencoder but plays two roles. First, a well-trained
autoencoder has learned the manifold of benign examples. This autoencoder can
produce a large reconstruction error for adversarial images with large
perturbations, so we can detect significantly perturbed adversarial examples
based on the reconstruction error. Second, the autoencoder can filter out the
small noise and change the DNN's prediction on adversarial examples with small
perturbations. It helps to detect slightly perturbed adversarial examples based
on the prediction distance. To cover these two cases, we utilize the
reconstruction error and prediction distance from benign images to construct a
two-tuple feature set and train an adversarial detector using the isolation
forest algorithm. We show empirically that the AEAE is unsupervised,
computationally inexpensive, and effective against state-of-the-art attacks.
Through the detection in these two cases, adversarial examples have nowhere to hide.
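As a concrete illustration of the pipeline sketched in the abstract, the snippet below computes the two-tuple feature (reconstruction error, prediction distance) and fits an isolation forest on benign images only. It is a minimal sketch under stated assumptions, not the authors' implementation: the `autoencoder` and `target_model` objects, the mean-squared reconstruction error, the L1 prediction distance, and all hyperparameters are illustrative choices.

```python
# Minimal sketch of an AEAE-style two-tuple detector (illustrative, not the authors' code).
# Assumed: `autoencoder` is a shallow autoencoder already trained on benign images,
# `target_model` is the protected DNN classifier, inputs are (N, C, H, W) tensors.
import torch
import torch.nn.functional as F
from sklearn.ensemble import IsolationForest

@torch.no_grad()
def two_tuple_features(images, autoencoder, target_model):
    """Return the (reconstruction error, prediction distance) pair per image."""
    recon = autoencoder(images)

    # Feature 1: reconstruction error. Strongly perturbed adversarial examples
    # lie off the benign manifold the autoencoder has learned, so this is large.
    recon_err = F.mse_loss(recon, images, reduction="none").flatten(1).mean(dim=1)

    # Feature 2: prediction distance. For slightly perturbed adversarial examples
    # the autoencoder filters the small noise, so the target DNN's prediction
    # shifts between the input and its reconstruction.
    p_orig = F.softmax(target_model(images), dim=1)
    p_recon = F.softmax(target_model(recon), dim=1)
    pred_dist = (p_orig - p_recon).abs().sum(dim=1)  # L1 distance (an assumption)

    return torch.stack([recon_err, pred_dist], dim=1).cpu().numpy()

def fit_detector(benign_images, autoencoder, target_model):
    """Unsupervised training: the isolation forest sees only benign features."""
    feats = two_tuple_features(benign_images, autoencoder, target_model)
    return IsolationForest(n_estimators=100, contamination=0.01,
                           random_state=0).fit(feats)

def flag_adversarial(images, detector, autoencoder, target_model):
    """IsolationForest.predict returns -1 for outliers, i.e. suspected adversarial inputs."""
    feats = two_tuple_features(images, autoencoder, target_model)
    return detector.predict(feats) == -1
```

Because the forest is fitted only on benign features, no adversarial examples or attack labels are needed, which matches the unsupervised, low-cost setting the abstract describes.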
Related papers
- HOLMES: to Detect Adversarial Examples with Multiple Detectors [1.455585466338228]
HOLMES is able to distinguish unseen adversarial examples from multiple attacks with high accuracy and low false positive rates.
Our effective and inexpensive strategies neither modify the original DNN models nor require access to their internal parameters.
arXiv Detail & Related papers (2024-05-30T11:22:55Z) - New Adversarial Image Detection Based on Sentiment Analysis [37.139957973240264]
Adversarial attack models, e.g., DeepFool, are on the rise and outrunning adversarial example detection techniques.
This paper presents a new adversarial example detector that outperforms state-of-the-art detectors in identifying the latest adversarial attacks on image datasets.
arXiv Detail & Related papers (2023-05-03T14:32:21Z) - ADC: Adversarial attacks against object Detection that evade Context
consistency checks [55.8459119462263]
We show that even context consistency checks can be brittle to properly crafted adversarial examples.
We propose an adaptive framework to generate examples that subvert such defenses.
Our results suggest that how to robustly model context and check its consistency, is still an open problem.
arXiv Detail & Related papers (2021-10-24T00:25:09Z) - Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z) - Self-Supervised Adversarial Example Detection by Disentangled
Representation [16.98476232162835]
We train an autoencoder, assisted by a discriminator network, over both correctly paired class/semantic features and incorrectly paired class/semantic features to reconstruct benign and counterexamples.
This mimics the behavior of adversarial examples and can reduce the unnecessary generalization ability of the autoencoder.
Compared with the state-of-the-art self-supervised detection methods, our method exhibits better performance in various measurements.
arXiv Detail & Related papers (2021-05-08T12:48:18Z) - DAFAR: Detecting Adversaries by Feedback-Autoencoder Reconstruction [7.867922462470315]
DAFAR allows deep learning models to detect adversarial examples with high accuracy and universality.
It transforms imperceptible-perturbation attack on the target network directly into obvious reconstruction-error attack on the feedback autoencoder.
Experiments show that DAFAR is effective against popular and arguably most advanced attacks without losing performance on legitimate samples.
arXiv Detail & Related papers (2021-03-11T06:18:50Z) - Object Detection Made Simpler by Eliminating Heuristic NMS [70.93004137521946]
We show a simple NMS-free, end-to-end object detection framework.
We attain on par or even improved detection accuracy compared with the original one-stage detector.
arXiv Detail & Related papers (2021-01-28T02:38:29Z) - Detecting Adversarial Examples by Input Transformations, Defense
Perturbations, and Voting [71.57324258813674]
Convolutional neural networks (CNNs) have proved to reach super-human performance in visual recognition tasks.
CNNs can easily be fooled by adversarial examples, i.e., maliciously-crafted images that force the networks to predict an incorrect output.
This paper extensively explores the detection of adversarial examples via image transformations and proposes a novel methodology; a generic sketch of the transformation-and-voting idea appears after this list.
arXiv Detail & Related papers (2021-01-27T14:50:41Z) - Learning to Separate Clusters of Adversarial Representations for Robust
Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by a recently introduced non-robust feature.
In this paper, we consider the non-robust features as a common property of adversarial examples, and we deduce it is possible to find a cluster in representation space corresponding to the property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster, and to leverage the distribution for a likelihood-based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z) - Category-wise Attack: Transferable Adversarial Examples for Anchor Free
Object Detection [38.813947369401525]
We present an effective and efficient algorithm to generate adversarial examples to attack anchor-free object models.
Surprisingly, the generated adversarial examples are not only able to effectively attack the targeted anchor-free object detector but can also be transferred to attack other object detectors.
arXiv Detail & Related papers (2020-02-10T04:49:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
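For the entry "Detecting Adversarial Examples by Input Transformations, Defense Perturbations, and Voting" above, the following is a generic sketch of the transformation-and-voting idea only, not that paper's specific methodology: an input is flagged when the classifier's predictions across a few cheap input transformations disagree with its prediction on the original image. The transformations, the model interface, and the agreement threshold are all assumptions for illustration.

```python
# Generic transformation-and-voting detection sketch (illustrative only;
# not the cited paper's method). Inputs are (N, C, H, W) tensors in [0, 1].
import torch
import torch.nn.functional as F

def _transforms():
    """A few cheap, roughly label-preserving transformations (assumptions)."""
    return [
        lambda x: torch.flip(x, dims=[-1]),                      # horizontal flip
        lambda x: (x + 0.02 * torch.randn_like(x)).clamp(0, 1),  # mild Gaussian noise
        lambda x: F.interpolate(                                 # downscale, then restore
            F.interpolate(x, scale_factor=0.5, mode="bilinear", align_corners=False),
            size=x.shape[-2:], mode="bilinear", align_corners=False),
    ]

@torch.no_grad()
def flag_by_voting(images, model, min_agreement=0.5):
    """Flag inputs whose transformed predictions disagree too often with the original."""
    base = model(images).argmax(dim=1)                            # votes on originals
    votes = torch.stack([model(t(images)).argmax(dim=1) for t in _transforms()])
    agreement = (votes == base.unsqueeze(0)).float().mean(dim=0)  # fraction agreeing
    return agreement < min_agreement                              # True = suspected adversarial
```

Benign images tend to keep their label under such transformations, while adversarial perturbations are often brittle to them, so low agreement is treated as a detection signal.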