Detecting and Recovering Adversarial Examples from Extracting Non-robust
and Highly Predictive Adversarial Perturbations
- URL: http://arxiv.org/abs/2206.15128v1
- Date: Thu, 30 Jun 2022 08:48:28 GMT
- Title: Detecting and Recovering Adversarial Examples from Extracting Non-robust
and Highly Predictive Adversarial Perturbations
- Authors: Mingyu Dong and Jiahao Chen and Diqun Yan and Jingxing Gao and Li Dong
and Rangding Wang
- Abstract summary: Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples (AEs), which are maliciously designed to fool target models.
We propose a model-free AE detection method whose whole process is free from querying the victim model.
- Score: 15.669678743693947
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have been shown to be vulnerable to
adversarial examples (AEs), which are maliciously designed to fool target
models. Normal examples (NEs) to which an imperceptible adversarial
perturbation has been added can pose a security threat to DNNs. Although
existing AE detection methods achieve high accuracy, they fail to exploit the
information carried by the detected AEs. Thus, based on high-dimensional
perturbation extraction, we propose a model-free AE detection method whose
whole process is free from querying the victim model. Research shows that DNNs
are sensitive to high-dimensional features: the adversarial perturbation hidden
in an adversarial example is a high-dimensional feature that is highly
predictive yet non-robust, and DNNs learn more details from high-dimensional
data than from other data. In our method, a perturbation extractor extracts the
adversarial perturbation from an AE as a high-dimensional feature, and a
trained AE discriminator then determines whether the input is an AE.
Experimental results show that the proposed method not only detects adversarial
examples with high accuracy but also identifies the specific category of each
AE. Meanwhile, the extracted perturbation can be used to recover AEs to NEs.
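
As a rough, hedged illustration of the pipeline described in the abstract, the sketch below wires a hypothetical perturbation extractor to an AE discriminator and recovers a normal example by subtracting the extracted perturbation; no query to the victim model is involved. The architectures, class layout, and value ranges are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class PerturbationExtractor(nn.Module):
    """Hypothetical extractor: estimates the adversarial perturbation hidden
    in an input image (ideally near-zero for normal examples)."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1), nn.Tanh(),  # bounded estimate
        )

    def forward(self, x):
        return self.net(x)

class AEDiscriminator(nn.Module):
    """Hypothetical discriminator: classifies an extracted perturbation as
    'normal' (class 0) or as one of several attack categories (1..K)."""
    def __init__(self, channels=3, num_attack_types=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(64, 1 + num_attack_types)

    def forward(self, perturbation):
        return self.classifier(self.features(perturbation))

def detect_and_recover(x, extractor, discriminator):
    """Model-free inference: the victim model is never queried.
    Returns (predicted class per image, recovered images)."""
    with torch.no_grad():
        delta = extractor(x)                        # candidate perturbation
        pred = discriminator(delta).argmax(dim=1)   # 0 = normal, >0 = attack category
        recovered = (x - delta).clamp(0.0, 1.0)     # recover the NE by removing the perturbation
    return pred, recovered
```

Training would pair the extractor and discriminator on normal examples and AEs crafted by several attacks; only inference is sketched here.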
Related papers
- Effort: Efficient Orthogonal Modeling for Generalizable AI-Generated Image Detection [66.16595174895802]
Existing AI-generated image (AIGI) detection methods often suffer from limited generalization performance.
In this paper, we identify a crucial yet previously overlooked asymmetry phenomenon in AIGI detection.
arXiv Detail & Related papers (2024-11-23T19:10:32Z)
- Eliminating Catastrophic Overfitting Via Abnormal Adversarial Examples Regularization [50.43319961935526]
Single-step adversarial training (SSAT) has demonstrated the potential to achieve both efficiency and robustness.
SSAT suffers from catastrophic overfitting (CO), a phenomenon that leads to a severely distorted classifier.
In this work, we observe that some adversarial examples generated on the SSAT-trained network exhibit anomalous behaviour.
arXiv Detail & Related papers (2024-04-11T22:43:44Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Nowhere to Hide: A Lightweight Unsupervised Detector against Adversarial Examples [14.332434280103667]
Adversarial examples are generated by adding slight but maliciously crafted perturbations to benign images.
In this paper, we propose an AutoEncoder-based Adversarial Example detector (AEAE).
We show empirically that the AEAE is unsupervised and inexpensive, even against the strongest state-of-the-art attacks.
arXiv Detail & Related papers (2022-10-16T16:29:47Z)
- Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning [64.78972193105443]
This paper presents a novel AE detection framework for trustworthy predictions.
It performs the detection by distinguishing the AE's abnormal relation with its augmented versions (a minimal sketch of this idea appears after this list).
An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the representation and predict the label.
arXiv Detail & Related papers (2022-08-31T08:18:44Z)
- Do autoencoders need a bottleneck for anomaly detection? [78.24964622317634]
Learning the identity function renders autoencoders useless for anomaly detection.
In this work, we investigate the value of non-bottlenecked autoencoders.
We propose infinitely-wide autoencoders as an extreme example of non-bottlenecked autoencoders.
arXiv Detail & Related papers (2022-02-25T11:57:58Z)
- What You See is Not What the Network Infers: Detecting Adversarial Examples Based on Semantic Contradiction [14.313178290347293]
Adversarial examples (AEs) pose severe threats to the applications of deep neural networks (DNNs) to safety-critical domains.
We propose a novel AE detection framework, ContraNet, based on the very nature of AEs.
We show that ContraNet outperforms existing solutions by a large margin, especially under adaptive attacks.
arXiv Detail & Related papers (2022-01-24T13:15:31Z)
- Probabilistic Robust Autoencoders for Anomaly Detection [7.362415721170984]
We propose a new type of autoencoder (AE), which we term the Probabilistic Robust AutoEncoder (PRAE).
PRAE is designed to simultaneously remove outliers and identify a low-dimensional representation for the inlier samples.
We prove that the solution to PRAE is equivalent to the solution of the robust autoencoder (RAE) and demonstrate using extensive simulations that PRAE is on par with state-of-the-art methods for anomaly detection.
arXiv Detail & Related papers (2021-10-01T15:46:38Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- MixDefense: A Defense-in-Depth Framework for Adversarial Example Detection Based on Statistical and Semantic Analysis [14.313178290347293]
We propose a multilayer defense-in-depth framework for AE detection, namely MixDefense.
We leverage the 'noise' features extracted from the inputs to discover the statistical difference between natural images and tampered ones for AE detection.
We show that the proposed MixDefense solution outperforms the existing AE detection techniques by a considerable margin.
arXiv Detail & Related papers (2021-04-20T15:57:07Z)
- Selective and Features based Adversarial Example Detection [12.443388374869745]
Security-sensitive applications that rely on Deep Neural Networks (DNNs) are vulnerable to small perturbations crafted to generate Adversarial Examples (AEs).
We propose a novel unsupervised detection mechanism that uses the selective prediction, processing model layers outputs, and knowledge transfer concepts in a multi-task learning setting.
Experimental results show that the proposed approach achieves results comparable to state-of-the-art methods against the tested attacks in the white-box scenario, and better results in the black-box and gray-box scenarios.
arXiv Detail & Related papers (2021-03-09T11:06:15Z)
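
As a loose sketch of the neighborhood-relation detection idea summarized in the "Be Your Own Neighborhood" entry above, the snippet below compares an input's self-supervised embedding with the embeddings of its augmented neighbors and flags inputs whose average similarity is low. The encoder, augmentations, neighbor count, and threshold are placeholders for illustration, not that paper's actual implementation.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

def build_dummy_encoder(embed_dim=128):
    """Stand-in for a frozen self-supervised backbone mapping images to embeddings."""
    return torch.nn.Sequential(
        torch.nn.Conv2d(3, 16, 3, stride=2, padding=1), torch.nn.ReLU(),
        torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
        torch.nn.Linear(16, embed_dim),
    )

# Random augmentations used to build the input's "neighborhood".
augment = T.Compose([
    T.RandomResizedCrop(32, scale=(0.6, 1.0)),
    T.RandomHorizontalFlip(),
])

def neighborhood_score(x, encoder, num_neighbors=8):
    """Average cosine similarity between an image (C, H, W) and its augmented neighbors."""
    with torch.no_grad():
        anchor = F.normalize(encoder(x.unsqueeze(0)), dim=1)               # (1, D)
        neighbors = torch.stack([augment(x) for _ in range(num_neighbors)])
        neighbor_emb = F.normalize(encoder(neighbors), dim=1)              # (N, D)
        return (anchor @ neighbor_emb.t()).mean().item()

def looks_adversarial(x, encoder, threshold=0.9):
    # AEs tend to relate less consistently to their augmented neighbors than normal examples do.
    return neighborhood_score(x, encoder) < threshold
```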