Selective and Features based Adversarial Example Detection
- URL: http://arxiv.org/abs/2103.05354v1
- Date: Tue, 9 Mar 2021 11:06:15 GMT
- Title: Selective and Features based Adversarial Example Detection
- Authors: Ahmed Aldahdooh, Wassim Hamidouche, and Olivier D\'eforges
- Abstract summary: Security-sensitive applications that relay on Deep Neural Networks (DNNs) are vulnerable to small perturbations crafted to generate Adversarial Examples (AEs)
We propose a novel unsupervised detection mechanism that uses the selective prediction, processing model layers outputs, and knowledge transfer concepts in a multi-task learning setting.
Experimental results show that the proposed approach achieves comparable results to the state-of-the-art methods against tested attacks in white box scenario and better results in black and gray boxes scenarios.
- Score: 12.443388374869745
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Security-sensitive applications that relay on Deep Neural Networks (DNNs) are
vulnerable to small perturbations crafted to generate Adversarial Examples
(AEs) that are imperceptible to human and cause DNN to misclassify them. Many
defense and detection techniques have been proposed. The state-of-the-art
detection techniques have been designed for specific attacks or broken by
others, need knowledge about the attacks, are not consistent, increase model
parameters overhead, are time-consuming, or have latency in inference time. To
trade off these factors, we propose a novel unsupervised detection mechanism
that uses the selective prediction, processing model layers outputs, and
knowledge transfer concepts in a multi-task learning setting. It is called
Selective and Feature based Adversarial Detection (SFAD). Experimental results
show that the proposed approach achieves comparable results to the
state-of-the-art methods against tested attacks in white box scenario and
better results in black and gray boxes scenarios. Moreover, results show that
SFAD is fully robust against High Confidence Attacks (HCAs) for MNIST and
partially robust for CIFAR-10 datasets.
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with $>99%$ detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - PASA: Attack Agnostic Unsupervised Adversarial Detection using Prediction & Attribution Sensitivity Analysis [2.5347892611213614]
Deep neural networks for classification are vulnerable to adversarial attacks, where small perturbations to input samples lead to incorrect predictions.
We develop a practical method for this characteristic of model prediction and feature attribution to detect adversarial samples.
Our approach demonstrates competitive performance even when an adversary is aware of the defense mechanism.
arXiv Detail & Related papers (2024-04-12T21:22:21Z) - usfAD Based Effective Unknown Attack Detection Focused IDS Framework [3.560574387648533]
Internet of Things (IoT) and Industrial Internet of Things (IIoT) have led to an increasing range of cyber threats.
For more than a decade, researchers have delved into supervised machine learning techniques to develop Intrusion Detection System (IDS)
IDS trained and tested on known datasets fails in detecting zero-day or unknown attacks.
We propose two strategies for semi-supervised learning based IDS where training samples of attacks are not required.
arXiv Detail & Related papers (2024-03-17T11:49:57Z) - Adversarial Examples Detection with Enhanced Image Difference Features
based on Local Histogram Equalization [20.132066800052712]
We propose an adversarial example detection framework based on a high-frequency information enhancement strategy.
This framework can effectively extract and amplify the feature differences between adversarial examples and normal examples.
arXiv Detail & Related papers (2023-05-08T03:14:01Z) - Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Agenerative-based adversarial attacks can get rid of this limitation.
ASymmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make thewidely-used models collapse, but also achieves good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - MixDefense: A Defense-in-Depth Framework for Adversarial Example
Detection Based on Statistical and Semantic Analysis [14.313178290347293]
We propose a multilayer defense-in-depth framework for AE detection, namely MixDefense.
We leverage the noise' features extracted from the inputs to discover the statistical difference between natural images and tampered ones for AE detection.
We show that the proposed MixDefense solution outperforms the existing AE detection techniques by a considerable margin.
arXiv Detail & Related papers (2021-04-20T15:57:07Z) - ExAD: An Ensemble Approach for Explanation-based Adversarial Detection [17.455233006559734]
We propose ExAD, a framework to detect adversarial examples using an ensemble of explanation techniques.
We evaluate our approach using six state-of-the-art adversarial attacks on three image datasets.
arXiv Detail & Related papers (2021-03-22T00:53:07Z) - Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z) - Learning to Separate Clusters of Adversarial Representations for Robust
Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by a recently introduced non-robust feature.
In this paper, we consider the non-robust features as a common property of adversarial examples, and we deduce it is possible to find a cluster in representation space corresponding to the property.
This idea leads us to probability estimate distribution of adversarial representations in a separate cluster, and leverage the distribution for a likelihood based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z) - Bayesian Optimization with Machine Learning Algorithms Towards Anomaly
Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in term of accuracy rate, precision, low-false alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.