A General Framework For Detecting Anomalous Inputs to DNN Classifiers
- URL: http://arxiv.org/abs/2007.15147v3
- Date: Thu, 17 Jun 2021 15:04:47 GMT
- Title: A General Framework For Detecting Anomalous Inputs to DNN Classifiers
- Authors: Jayaram Raghuram, Varun Chandrasekaran, Somesh Jha, Suman Banerjee
- Abstract summary: We propose an unsupervised anomaly detection framework based on the internal deep neural network layer representations.
We evaluate the proposed methods on well-known image classification datasets with strong adversarial attacks and OOD inputs.
- Score: 37.79389209020564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting anomalous inputs, such as adversarial and out-of-distribution (OOD)
inputs, is critical for classifiers (including deep neural networks or DNNs)
deployed in real-world applications. While prior works have proposed various
methods to detect such anomalous samples using information from the internal
layer representations of a DNN, there is a lack of consensus on a principled
approach for the different components of such a detection method. As a result,
often heuristic and one-off methods are applied for different aspects of this
problem. We propose an unsupervised anomaly detection framework based on the
internal DNN layer representations in the form of a meta-algorithm with
configurable components. We proceed to propose specific instantiations for each
component of the meta-algorithm based on ideas grounded in statistical testing
and anomaly detection. We evaluate the proposed methods on well-known image
classification datasets with strong adversarial attacks and OOD inputs,
including an adaptive attack that uses the internal layer representations of
the DNN (often not considered in prior work). Comparisons with five
recently-proposed competing detection methods demonstrate the effectiveness of
our method in detecting adversarial and OOD inputs.
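To make the meta-algorithm concrete, the sketch below illustrates one plausible instantiation of its components: a kNN-based test statistic computed on each layer's representation, empirical p-values against held-out in-distribution data, and Fisher's method to combine the per-layer p-values into a single detection score. The paper proposes its own specific instantiations grounded in statistical testing, which may differ from these choices; all names and defaults below are illustrative.
```python
import numpy as np
from scipy.stats import chi2

def knn_statistic(x, train_feats, k=10):
    """Per-layer test statistic: average distance from the test representation
    x to its k nearest in-distribution training representations."""
    d = np.linalg.norm(train_feats - x, axis=1)
    return np.sort(d)[:k].mean()

def empirical_pvalue(stat, null_stats):
    """Fraction of held-out in-distribution statistics at least as extreme
    as the observed one (with +1 smoothing)."""
    return (np.sum(null_stats >= stat) + 1.0) / (len(null_stats) + 1.0)

def fisher_combine(pvals):
    """Fisher's method: combine per-layer p-values into one p-value."""
    stat = -2.0 * np.sum(np.log(pvals))
    return chi2.sf(stat, df=2 * len(pvals))

def detect(test_layers, train_layers, null_stats_per_layer, threshold=0.01):
    """Flag an input as anomalous if the combined p-value is small.

    test_layers:          per-layer feature vectors for the test input
    train_layers:         per-layer feature matrices for in-distribution data
    null_stats_per_layer: kNN statistics precomputed on held-out ID data
    """
    pvals = []
    for x_l, feats_l, null_l in zip(test_layers, train_layers, null_stats_per_layer):
        pvals.append(empirical_pvalue(knn_statistic(x_l, feats_l), null_l))
    combined = fisher_combine(np.array(pvals))
    return combined < threshold, combined
```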
Related papers
- On the Universal Adversarial Perturbations for Efficient Data-free
Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses to UAPs from normal and adversarial samples.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
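A hypothetical illustration of the idea, assuming a generic classifier and a precomputed UAP: compare the model's output before and after adding the perturbation and use the size of the response as a detection score. The paper targets text classification and may measure the response differently; the function and calibration below are assumptions.
```python
import torch
import torch.nn.functional as F

def uap_response_score(model, x, uap):
    """Shift in the model's output distribution when a universal adversarial
    perturbation is added to the input; normal and adversarial inputs are
    assumed to respond differently, so the shift serves as a detection score."""
    model.eval()
    with torch.no_grad():
        p_clean = F.softmax(model(x), dim=-1)
        p_pert = F.softmax(model(x + uap), dim=-1)
    return (p_clean - p_pert).abs().sum(dim=-1)

# A detection threshold would be calibrated on the scores of held-out clean
# inputs; whether attacks score higher or lower depends on the task and attack.
```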
arXiv Detail & Related papers (2023-06-27T02:54:07Z) - TracInAD: Measuring Influence for Anomaly Detection [0.0]
This paper proposes a novel methodology to flag anomalies based on TracIn.
We test our approach using Variational Autoencoders and show that the average influence of a subsample of training points on a test point can serve as a proxy for abnormality.
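A rough sketch of a TracIn-style influence score used as an anomaly proxy, assuming access to training checkpoints, their learning rates, and a user-supplied loss (the paper uses Variational Autoencoders); all names are illustrative and the paper's exact formulation may differ.
```python
import torch

def tracin_influence(model, loss_fn, z_train, z_test, checkpoints, lrs):
    """TracIn-style influence: sum over training checkpoints of the
    learning-rate-weighted dot product between the loss gradients of a
    training point and the test point."""
    total = 0.0
    for state, lr in zip(checkpoints, lrs):
        model.load_state_dict(state)
        g_train = torch.autograd.grad(loss_fn(model, z_train), list(model.parameters()))
        g_test = torch.autograd.grad(loss_fn(model, z_test), list(model.parameters()))
        total += lr * sum((a * b).sum() for a, b in zip(g_train, g_test)).item()
    return total

def anomaly_score(model, loss_fn, train_subsample, z_test, checkpoints, lrs):
    """Average influence of a subsample of training points on the test point,
    used as a proxy for abnormality."""
    scores = [tracin_influence(model, loss_fn, z, z_test, checkpoints, lrs)
              for z in train_subsample]
    return sum(scores) / len(scores)
```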
arXiv Detail & Related papers (2022-05-03T08:20:15Z) - iDECODe: In-distribution Equivariance for Conformal Out-of-distribution
Detection [24.518698391381204]
Machine learning methods such as deep neural networks (DNNs) often generate incorrect predictions with high confidence.
We propose the new method iDECODe, leveraging in-distribution equivariance for conformal OOD detection.
We demonstrate the efficacy of iDECODe by experiments on image and audio datasets, obtaining state-of-the-art results.
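The generic conformal step behind such a detector can be sketched as follows, assuming a user-supplied non-conformity score; iDECODe builds its score from in-distribution equivariance (e.g., by aggregating a model quantity over input transformations), which is not shown here.
```python
import numpy as np

def conformal_pvalue(test_score, calib_scores):
    """Conformal p-value: rank of the test non-conformity score among the
    non-conformity scores of held-out in-distribution calibration data.
    Under exchangeability this p-value is (super-)uniform for ID inputs,
    so thresholding it at alpha bounds the ID false-alarm rate by ~alpha."""
    calib_scores = np.asarray(calib_scores)
    return (np.sum(calib_scores >= test_score) + 1) / (len(calib_scores) + 1)

def is_ood(test_score, calib_scores, alpha=0.05):
    """Declare the input OOD if its conformal p-value falls below alpha."""
    return conformal_pvalue(test_score, calib_scores) <= alpha
```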
arXiv Detail & Related papers (2022-01-07T05:21:40Z) - A Uniform Framework for Anomaly Detection in Deep Neural Networks [0.5099811144731619]
We consider three classes of anomalous inputs:
(1) natural inputs from a distribution different from the one the DNN was trained on, known as Out-of-Distribution (OOD) samples,
(2) inputs crafted from in-distribution (ID) data by attackers, often known as adversarial (AD) samples, and (3) noise (NS) samples generated from meaningless data.
We propose a framework that aims to detect all these anomalies for a pre-trained DNN.
arXiv Detail & Related papers (2021-10-06T22:42:30Z) - Triggering Failures: Out-Of-Distribution detection by learning from
local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet, associated with a dedicated training scheme based on Local Adversarial Attacks (LAA).
We show that it obtains top performance in both speed and accuracy when compared to ten recent methods from the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z) - Anomaly Detection of Test-Time Evasion Attacks using Class-conditional
Generative Adversarial Networks [21.023722317810805]
We propose an attack detector based on class-conditional Generative Adversarial Networks (GANs).
We model the distribution of clean data conditioned on the predicted class label with an Auxiliary Classifier GAN (AC-GAN).
Experiments on image classification datasets under different TTE attack methods show that our method outperforms state-of-the-art detection methods.
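A loosely hedged sketch of how such a detector might score an input, assuming a trained class-conditional discriminator with a hypothetical (input, label) interface; the paper's actual detection statistic may differ.
```python
import torch

def acgan_detection_score(discriminator, x, predicted_label):
    """Illustrative score: how plausible the input looks to a class-conditional
    discriminator when conditioned on the classifier's predicted label. A
    successful evasion attack flips the predicted label, so the input may look
    implausible for that label and receive a low score.
    The (x, label) discriminator interface is a hypothetical placeholder."""
    discriminator.eval()
    with torch.no_grad():
        return discriminator(x, predicted_label)

# An input would be flagged as attacked if its score falls below a threshold
# calibrated on clean, correctly classified data.
```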
arXiv Detail & Related papers (2021-05-21T02:51:58Z) - Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model's robustness against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
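As one concrete example of coverage-based runtime monitoring (the paper's coverage paradigms may differ), a minimal monitor can record per-neuron activation ranges on training data and flag inputs whose activations fall outside them.
```python
import numpy as np

class ActivationRangeMonitor:
    """Minimal coverage-style monitor: record per-neuron activation ranges seen
    on training data, then flag test inputs whose hidden activations fall
    outside those ranges for too many neurons."""

    def fit(self, train_activations):
        # train_activations: (num_samples, num_neurons) hidden-layer outputs
        self.low = train_activations.min(axis=0)
        self.high = train_activations.max(axis=0)
        return self

    def score(self, activation):
        # Fraction of neurons outside the covered range; higher -> more unsafe.
        outside = (activation < self.low) | (activation > self.high)
        return outside.mean()

    def is_unsafe(self, activation, max_fraction=0.05):
        return self.score(activation) > max_fraction
```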
arXiv Detail & Related papers (2021-01-28T16:38:26Z) - FADER: Fast Adversarial Example Rejection [19.305796826768425]
Recent defenses have been shown to improve adversarial robustness by detecting anomalous deviations from legitimate training samples at different layer representations.
We introduce FADER, a novel technique for speeding up detection-based methods.
Our experiments show up to a 73x reduction in prototypes compared to the analyzed detectors on the MNIST dataset, and up to a 50x reduction on CIFAR10.
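A hypothetical illustration of the prototype-reduction idea, using per-class k-means centroids as a stand-in for FADER's actual construction: the reduced set replaces the full training set that a layer-wise detector must compare against.
```python
import numpy as np
from sklearn.cluster import KMeans

def reduce_prototypes(train_feats, train_labels, per_class=10):
    """Replace the full set of layer representations with a few per-class
    centroids, shrinking the reference set a detector must compare against."""
    prototypes, labels = [], []
    for c in np.unique(train_labels):
        km = KMeans(n_clusters=per_class, n_init=10).fit(train_feats[train_labels == c])
        prototypes.append(km.cluster_centers_)
        labels.extend([c] * per_class)
    return np.vstack(prototypes), np.array(labels)

def rejection_score(x_feat, prototypes):
    """Distance to the nearest prototype; a large distance suggests the input
    deviates from legitimate training samples and should be rejected."""
    return np.min(np.linalg.norm(prototypes - x_feat, axis=1))
```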
arXiv Detail & Related papers (2020-10-18T22:00:11Z) - NADS: Neural Architecture Distribution Search for Uncertainty Awareness [79.18710225716791]
Machine learning (ML) systems often encounter Out-of-Distribution (OoD) errors when dealing with testing data coming from a distribution different from training data.
Existing OoD detection approaches are prone to errors and even sometimes assign higher likelihoods to OoD samples.
We propose Neural Architecture Distribution Search (NADS) to identify common building blocks among all uncertainty-aware architectures.
arXiv Detail & Related papers (2020-06-11T17:39:07Z) - GraN: An Efficient Gradient-Norm Based Detector for Adversarial and
Misclassified Examples [77.99182201815763]
Deep neural networks (DNNs) are vulnerable to adversarial examples and other data perturbations.
GraN is a time- and parameter-efficient method that is easily adaptable to any DNN.
GraN achieves state-of-the-art performance on numerous problem set-ups.
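A minimal sketch of a gradient-norm feature extractor in the spirit of GraN, assuming the loss is taken against the model's own predicted label; GraN's exact gradient target, normalization, and downstream classifier may differ.
```python
import torch
import torch.nn.functional as F

def gradient_norm_features(model, x):
    """Per-parameter-tensor norms of the gradient of the cross-entropy loss
    taken against the model's own predicted label. Adversarial and
    misclassified inputs tend to yield unusual gradient norms, which a
    lightweight downstream classifier can threshold."""
    model.zero_grad()
    logits = model(x)
    pred = logits.argmax(dim=-1)
    loss = F.cross_entropy(logits, pred)
    loss.backward()
    return torch.stack([p.grad.norm() for p in model.parameters() if p.grad is not None])
```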
arXiv Detail & Related papers (2020-04-20T10:09:27Z)