Adversarial Examples Detection and Analysis with Layer-wise Autoencoders
- URL: http://arxiv.org/abs/2006.10013v1
- Date: Wed, 17 Jun 2020 17:17:54 GMT
- Title: Adversarial Examples Detection and Analysis with Layer-wise Autoencoders
- Authors: Bartosz Wójcik, Paweł Morawiecki, Marek Śmieja, Tomasz Krzyżek, Przemysław Spurek, Jacek Tabor
- Abstract summary: We present a mechanism for detecting adversarial examples based on data representations taken from the hidden layers of the target network.
This allows us to describe the manifold of true data and decide whether a given example has the same characteristics as true data.
It also gives us insight into the behavior of adversarial examples and their flow through the layers of a deep neural network.
- Score: 11.048707408233724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a mechanism for detecting adversarial examples based on data
representations taken from the hidden layers of the target network. For this
purpose, we train individual autoencoders at intermediate layers of the target
network. This allows us to describe the manifold of true data and, in
consequence, decide whether a given example has the same characteristics as
true data. It also gives us insight into the behavior of adversarial examples
and their flow through the layers of a deep neural network. Experimental
results show that our method outperforms the state of the art in supervised and
unsupervised settings.
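The mechanism described above trains an autoencoder on the activations of each hidden layer and treats large reconstruction error as evidence that an input lies off the manifold of true data. The paper's own autoencoders and thresholding scheme are not reproduced here; the following is a minimal numpy sketch under simplifying assumptions (a single layer's activations, a linear autoencoder — whose optimal solution coincides with PCA — and a percentile threshold calibrated on clean data):

```python
import numpy as np

def fit_linear_autoencoder(acts, k):
    """Fit a linear autoencoder on hidden-layer activations of shape
    (n_samples, dim). The optimal linear autoencoder spans the top-k
    principal directions, recovered here via SVD."""
    mean = acts.mean(axis=0)
    _, _, vt = np.linalg.svd(acts - mean, full_matrices=False)
    return mean, vt[:k]  # (mean, components) describe the learned manifold

def reconstruction_error(acts, mean, comps):
    """Per-example squared error after projecting onto the learned subspace."""
    centered = acts - mean
    recon = centered @ comps.T @ comps
    return np.sum((centered - recon) ** 2, axis=1)

rng = np.random.default_rng(0)

# Stand-in for clean activations: points near a 5-dim subspace of a 50-dim layer.
basis = rng.normal(size=(5, 50))
clean = rng.normal(size=(1000, 5)) @ basis + 0.01 * rng.normal(size=(1000, 50))
mean, comps = fit_linear_autoencoder(clean, k=5)

# Calibrate a detection threshold on clean data only (unsupervised setting).
threshold = np.percentile(reconstruction_error(clean, mean, comps), 99)

# Off-manifold inputs (stand-ins for adversarial activations) score far higher.
anomalous = rng.normal(size=(100, 50))
flags = reconstruction_error(anomalous, mean, comps) > threshold
print(f"flagged fraction: {flags.mean():.2f}")
```

In the paper's setting one such detector is fit per intermediate layer of the target network, and the layer-wise errors can be combined or inspected individually to trace how an adversarial example drifts away from the true-data manifold as it flows through the network.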
Related papers
- Hypergraph Topological Features for Autoencoder-Based Intrusion Detection for Cybersecurity Data [0.8046432252929225]
We argue that when hypergraphs are used to capture multi-way local relations of data, their resulting topological features describe global behaviour.
We propose two such potential pipelines for cybersecurity data, one that uses an autoencoder directly to determine network intrusions, and one that de-noises input data for a persistent homology system, PHANTOM.
arXiv Detail & Related papers (2023-11-09T20:05:10Z)
- A Novel Explainable Out-of-Distribution Detection Approach for Spiking Neural Networks [6.100274095771616]
This work presents a novel OoD detector that can identify whether test examples input to a Spiking Neural Network belong to the distribution of the data over which it was trained.
We characterize the internal activations of the hidden layers of the network in the form of spike count patterns.
A local explanation method is devised to produce attribution maps revealing which parts of the input instance push most towards the detection of an example as an OoD sample.
arXiv Detail & Related papers (2022-09-30T11:16:35Z)
- Representation Learning for Content-Sensitive Anomaly Detection in Industrial Networks [0.0]
This thesis proposes a framework to learn spatial-temporal aspects of raw network traffic in an unsupervised and protocol-agnostic manner.
The learned representations are used to measure the effect on the results of a subsequent anomaly detection.
arXiv Detail & Related papers (2022-04-20T09:22:41Z)
- Adversarial Examples Detection with Bayesian Neural Network [57.185482121807716]
We propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors.
We propose a novel Bayesian adversarial example detector, BATer for short, to improve the performance of adversarial example detection.
arXiv Detail & Related papers (2021-05-18T15:51:24Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Toward Scalable and Unified Example-based Explanation and Outlier Detection [128.23117182137418]
We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that our prototype-based networks, which go beyond similarity kernels, deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
arXiv Detail & Related papers (2020-11-11T05:58:17Z)
- Transferable Perturbations of Deep Feature Distributions [102.94094966908916]
This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions.
We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models.
arXiv Detail & Related papers (2020-04-27T00:32:25Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrastive examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z) - Analyzing the Noise Robustness of Deep Neural Networks [43.63911131982369]
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions.
We present a visual analysis method to explain why adversarial examples are misclassified.
arXiv Detail & Related papers (2020-01-26T03:39:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.