FADER: Fast Adversarial Example Rejection
- URL: http://arxiv.org/abs/2010.09119v1
- Date: Sun, 18 Oct 2020 22:00:11 GMT
- Title: FADER: Fast Adversarial Example Rejection
- Authors: Francesco Crecchi, Marco Melis, Angelo Sotgiu, Davide Bacciu, Battista
Biggio
- Abstract summary: Recent defenses have been shown to improve adversarial robustness by detecting anomalous deviations from legitimate training samples at different layer representations.
We introduce FADER, a novel technique for speeding up detection-based methods.
Our experiments show up to a 73x reduction in prototypes compared to the analyzed detectors on the MNIST dataset, and up to 50x on CIFAR10, respectively.
- Score: 19.305796826768425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are vulnerable to adversarial examples, i.e.,
carefully-crafted inputs that mislead classification at test time. Recent
defenses have been shown to improve adversarial robustness by detecting
anomalous deviations from legitimate training samples at different layer
representations - a behavior normally exhibited by adversarial attacks. Despite
technical differences, all aforementioned methods share a common backbone
structure that we formalize and highlight in this contribution, as it can help
in identifying promising research directions and drawbacks of existing methods.
The first main contribution of this work is the review of these detection
methods in the form of a unifying framework designed to accommodate both
existing defenses and newer ones to come. In terms of drawbacks, the
aforementioned defenses require comparing input samples against an excessive
number of reference prototypes, possibly at different representation layers,
which dramatically worsens test-time efficiency. Besides, such defenses are
typically based on ensembling classifiers with heuristic methods, rather than
optimizing the whole architecture in an end-to-end manner to better perform
detection. As a second main contribution of this work, we introduce FADER, a
novel technique for speeding up detection-based methods. FADER overcomes these
issues by employing RBF networks as detectors: by fixing the number of required
prototypes, the runtime complexity of adversarial example detectors can be
controlled. Our experiments show up to a 73x reduction in prototypes compared
to the analyzed detectors on the MNIST dataset and up to 50x on the CIFAR10
dataset, respectively, without sacrificing classification accuracy on either
clean or adversarial data.
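To make the shared detection backbone and the FADER-style speed-up concrete, below is a minimal, illustrative Python sketch (not the authors' implementation): the detector scores a layer representation by its RBF similarity to a small, fixed set of prototypes and rejects the input when that similarity falls below a threshold calibrated on clean data. The class and function names, the k-means prototype selection, and the threshold choice are all assumptions made for illustration.

```python
# Illustrative sketch only: a fixed-size RBF rejector in the spirit of the
# abstract, not the authors' code. Prototype selection (k-means), gamma, and
# the rejection threshold are assumptions for demonstration purposes.

import numpy as np


def kmeans(X, k, iters=50, seed=0):
    """Plain k-means, used here only to pick a small, fixed set of prototypes."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers


class RBFRejector:
    """Fixed-size RBF detector: score(x) = max_j exp(-gamma * ||x - c_j||^2).

    Low similarity to every prototype means the input looks anomalous
    (a candidate adversarial example) and is rejected.
    """

    def __init__(self, n_prototypes=10, gamma=0.5):
        self.n_prototypes = n_prototypes
        self.gamma = gamma
        self.centers = None
        self.threshold = None

    def fit(self, feats_clean, false_reject_rate=0.05):
        # Compress the (possibly huge) clean training set into a fixed number
        # of prototypes; this is what bounds the test-time cost.
        self.centers = kmeans(feats_clean, self.n_prototypes)
        scores = self.score(feats_clean)
        # Threshold chosen so that roughly 5% of clean samples would be rejected.
        self.threshold = np.quantile(scores, false_reject_rate)
        return self

    def score(self, feats):
        d2 = ((feats[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-self.gamma * d2).max(axis=1)

    def reject(self, feats):
        # True  -> flagged as anomalous / adversarial
        # False -> accepted as legitimate
        return self.score(feats) < self.threshold


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Stand-ins for layer representations of clean and perturbed inputs.
    clean = rng.normal(loc=0.0, scale=1.0, size=(500, 32))
    shifted = rng.normal(loc=4.0, scale=1.0, size=(100, 32))

    det = RBFRejector(n_prototypes=10, gamma=0.1).fit(clean)
    print("clean rejection rate:  ", det.reject(clean).mean())
    print("shifted rejection rate:", det.reject(shifted).mean())
```

Because the number of prototypes is fixed up front, the per-sample detection cost is O(k*d) per layer regardless of the training-set size, which is the efficiency argument the abstract makes against detectors whose reference set grows with the training data.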
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - Advancing Adversarial Robustness Through Adversarial Logit Update [10.041289551532804]
Adversarial training and adversarial purification are among the most widely recognized defense strategies.
We propose a new principle, namely Adversarial Logit Update (ALU), to infer adversarial samples' labels.
Our solution achieves superior performance compared to state-of-the-art methods against a wide range of adversarial attacks.
arXiv Detail & Related papers (2023-08-29T07:13:31Z) - On the Universal Adversarial Perturbations for Efficient Data-free
Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework that exploits the different responses of normal and adversarial samples to universal adversarial perturbations (UAPs).
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z) - Adversarial Examples Detection with Enhanced Image Difference Features
based on Local Histogram Equalization [20.132066800052712]
We propose an adversarial example detection framework based on a high-frequency information enhancement strategy.
This framework can effectively extract and amplify the feature differences between adversarial examples and normal examples.
arXiv Detail & Related papers (2023-05-08T03:14:01Z) - Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods.
In our experiments, we find that existing loss functions are usually specialized for some metrics but report inferior results on the others.
We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
arXiv Detail & Related papers (2022-02-07T03:43:16Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z) - ADC: Adversarial attacks against object Detection that evade Context
consistency checks [55.8459119462263]
We show that even context consistency checks can be brittle to properly crafted adversarial examples.
We propose an adaptive framework to generate examples that subvert such defenses.
Our results suggest that how to robustly model context and check its consistency, is still an open problem.
arXiv Detail & Related papers (2021-10-24T00:25:09Z) - TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z) - Adversarially Robust One-class Novelty Detection [83.1570537254877]
We show that existing novelty detectors are susceptible to adversarial examples.
We propose a defense strategy that manipulates the latent space of novelty detectors to improve the robustness against adversarial examples.
arXiv Detail & Related papers (2021-08-25T10:41:29Z) - Selective and Features based Adversarial Example Detection [12.443388374869745]
Security-sensitive applications that rely on Deep Neural Networks (DNNs) are vulnerable to small perturbations crafted to generate Adversarial Examples (AEs).
We propose a novel unsupervised detection mechanism that uses the selective prediction, processing model layers outputs, and knowledge transfer concepts in a multi-task learning setting.
Experimental results show that the proposed approach achieves results comparable to state-of-the-art methods against the tested attacks in the white-box scenario and better results in the black-box and gray-box scenarios.
arXiv Detail & Related papers (2021-03-09T11:06:15Z) - A General Framework For Detecting Anomalous Inputs to DNN Classifiers [37.79389209020564]
We propose an unsupervised anomaly detection framework based on the internal deep neural network layer representations.
We evaluate the proposed methods on well-known image classification datasets with strong adversarial attacks and OOD inputs.
arXiv Detail & Related papers (2020-07-29T22:57:57Z)