EAD: an ensemble approach to detect adversarial examples from the hidden
features of deep neural networks
- URL: http://arxiv.org/abs/2111.12631v2
- Date: Thu, 25 Nov 2021 11:24:28 GMT
- Authors: Francesco Craighero, Fabrizio Angaroni, Fabio Stella, Chiara Damiani,
Marco Antoniotti, Alex Graudenzi
- Abstract summary: We propose an Ensemble Adversarial Detector (EAD) for the identification of adversarial examples.
EAD combines multiple detectors that exploit distinct properties of the input instances in the internal representation of a pre-trained Deep Neural Network (DNN).
We show that EAD achieves the best AUROC and AUPR in the large majority of the settings and comparable performance in the others.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the key challenges in Deep Learning is the definition of effective
strategies for the detection of adversarial examples. To this end, we propose a
novel approach named Ensemble Adversarial Detector (EAD) for the identification
of adversarial examples, in a standard multiclass classification scenario. EAD
combines multiple detectors that exploit distinct properties of the input
instances in the internal representation of a pre-trained Deep Neural Network
(DNN). Specifically, EAD integrates the state-of-the-art detectors based on
Mahalanobis distance and on Local Intrinsic Dimensionality (LID) with a newly
introduced method based on One-class Support Vector Machines (OSVMs). Although
all constituent methods share the assumption that the greater the distance of a
test instance from the set of correctly classified training instances, the
higher the probability that it is an adversarial example, they differ in how
this distance is computed. To exploit the effectiveness of the different
methods in capturing distinct properties of the data distribution and, in turn,
to tackle the trade-off between generalization and overfitting, EAD employs the
detector-specific distance scores as features of a logistic regression
classifier, after independent hyperparameter optimization.
We evaluated the EAD approach on distinct datasets (CIFAR-10, CIFAR-100 and
SVHN) and models (ResNet and DenseNet) and with regard to four adversarial
attacks (FGSM, BIM, DeepFool and CW), also by comparing with competing
approaches. Overall, we show that EAD achieves the best AUROC and AUPR in the
large majority of the settings and comparable performance in the others. The
improvement over the state-of-the-art, and the possibility to easily extend EAD
to include any arbitrary set of detectors, pave the way to a widespread
adoption of ensemble approaches in the broad field of adversarial example
detection.
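The ensemble mechanism described in the abstract (per-detector distance scores stacked as features of a logistic regression classifier) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the "hidden features" are synthetic, the Mahalanobis and one-class SVM detectors are toy versions, and the LID detector is omitted for brevity.

```python
# Sketch of the EAD idea: each detector maps hidden features to a distance
# score, and a logistic regression combines the scores into one detector.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Toy "hidden features": clean activations vs. (shifted) adversarial ones.
clean = rng.normal(0.0, 1.0, size=(200, 8))
adv = rng.normal(1.5, 1.0, size=(200, 8))

# Detector 1: Mahalanobis distance to a Gaussian fit on clean features.
mu = clean.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(clean, rowvar=False) + 1e-6 * np.eye(8))

def mahalanobis(x):
    d = x - mu
    # Row-wise sqrt(d @ cov_inv @ d) via einsum.
    return np.sqrt(np.einsum("ij,jk,ik->i", d, cov_inv, d))

# Detector 2: one-class SVM fit on clean features only.
osvm = OneClassSVM(gamma="scale", nu=0.1).fit(clean)

def osvm_score(x):
    # Negate so that higher means "more anomalous".
    return -osvm.decision_function(x)

# Stack detector-specific scores as features; label 1 = adversarial.
X = np.vstack([clean, adv])
scores = np.column_stack([mahalanobis(X), osvm_score(X)])
y = np.concatenate([np.zeros(len(clean)), np.ones(len(adv))])

meta = LogisticRegression().fit(scores, y)
print(meta.score(scores, y))  # training accuracy of the ensemble
```

In the paper's setting, each detector would also be tuned independently before the meta-classifier is fit; here both detectors use defaults for brevity.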
Related papers
- Self-Supervised Representation Learning for Adversarial Attack Detection [6.528181610035978]
Supervised learning-based adversarial attack detection methods rely on large amounts of labeled data.
We propose a self-supervised representation learning framework for the adversarial attack detection task to address this drawback.
arXiv Detail & Related papers (2024-07-05T09:37:16Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods.
In these experiments, we find that existing loss functions are usually specialized for some metrics but yield inferior results on others.
We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
arXiv Detail & Related papers (2022-02-07T03:43:16Z)
- PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial Attacks via Pairwise Adversarially Robust Loss Function [13.417003144007156]
Adversarial attacks tend to rely on the principle of transferability.
Ensemble methods against adversarial attacks demonstrate that an adversarial example is less likely to mislead multiple classifiers.
Recent ensemble methods have either been shown to be vulnerable to stronger adversaries or shown to lack an end-to-end evaluation.
arXiv Detail & Related papers (2021-12-09T14:26:13Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot Classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- Learning to Detect Adversarial Examples Based on Class Scores [0.8411385346896413]
We take a closer look at adversarial attack detection based on the class scores of an already trained classification model.
We propose to train a support vector machine (SVM) on the class scores to detect adversarial examples.
We show that our approach yields an improved detection rate compared to an existing method, whilst being easy to implement.
arXiv Detail & Related papers (2021-07-09T13:29:54Z)
- Self-Supervised Adversarial Example Detection by Disentangled Representation [16.98476232162835]
We train an autoencoder, assisted by a discriminator network, on both correctly paired and incorrectly paired class/semantic features to reconstruct benign examples and counterexamples.
This mimics the behavior of adversarial examples and reduces the unnecessary generalization ability of the autoencoder.
Compared with the state-of-the-art self-supervised detection methods, our method exhibits better performance in various measurements.
arXiv Detail & Related papers (2021-05-08T12:48:18Z)
- Adversarial Unsupervised Domain Adaptation Guided with Deep Clustering for Face Presentation Attack Detection [0.8701566919381223]
Face Presentation Attack Detection (PAD) has drawn increasing attention as a means to secure face recognition systems.
We propose an end-to-end learning framework based on Domain Adaptation (DA) to improve PAD generalization capability.
arXiv Detail & Related papers (2021-02-13T05:34:40Z)
- Detection of Adversarial Supports in Few-shot Classifiers Using Feature Preserving Autoencoders and Self-Similarity [89.26308254637702]
We propose a detection strategy to highlight adversarial support sets.
We make use of feature preserving autoencoder filtering and also the concept of self-similarity of a support set to perform this detection.
Our method is attack-agnostic and, to the best of our knowledge, the first to explore detection for few-shot classifiers.
arXiv Detail & Related papers (2020-12-09T14:13:41Z)
- Cross-domain Face Presentation Attack Detection via Multi-domain Disentangled Representation Learning [109.42987031347582]
Face presentation attack detection (PAD) has been an urgent problem to solve in face recognition systems.
We propose an efficient disentangled representation learning method for cross-domain face PAD.
Our approach consists of disentangled representation learning (DR-Net) and multi-domain learning (MD-Net).
arXiv Detail & Related papers (2020-04-04T15:45:14Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other, unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.