Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles
- URL: http://arxiv.org/abs/2103.10229v1
- Date: Thu, 18 Mar 2021 13:04:21 GMT
- Title: Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles
- Authors: Gabriel D. Cantareira, Rodrigo F. Mello, Fernando V. Paulovich
- Abstract summary: This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
- Score: 69.9674326582747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As neural networks become the tool of choice to solve an increasing variety
of problems in our society, adversarial attacks become critical. The
possibility of generating data instances deliberately designed to fool a
network's analysis can have disastrous consequences. Recent work has shown that
commonly used methods for model training often result in fragile abstract
representations that are particularly vulnerable to such attacks. This paper
presents a visual framework to investigate neural network models subjected to
adversarial examples, revealing how models' perception of the adversarial data
differs from regular data instances and their relationships with class
perception. Through different use cases, we show how observing these elements
can quickly pinpoint exploited areas in a model, allowing further study of
vulnerable features in input data and serving as a guide to improving model
training and architecture.
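To make the idea concrete, here is a minimal, hypothetical sketch of comparing per-layer activation profiles between a clean input and an adversarial counterpart. It is not the framework proposed in the paper: the model choice, the FGSM attack, the random stand-in input, and the mean-absolute-activation summary are all assumptions made only for illustration (PyTorch, torchvision >= 0.13).

```python
# Hypothetical sketch: compare per-layer activation profiles for a clean input
# and an adversarial counterpart. Not the paper's framework; illustration only.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # assumed model, untrained weights

# Record one scalar per convolutional layer: the mean absolute activation.
profiles = {}

def make_hook(name):
    def hook(module, inputs, output):
        profiles[name] = output.detach().abs().mean().item()
    return hook

for name, module in model.named_modules():
    if isinstance(module, torch.nn.Conv2d):
        module.register_forward_hook(make_hook(name))

def activation_profile(x):
    profiles.clear()
    with torch.no_grad():
        model(x)
    return dict(profiles)

def fgsm(x, label, eps=0.03):
    # Simple FGSM attack, used here only to produce an adversarial input.
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), label).backward()
    return (x + eps * x.grad.sign()).detach()

x = torch.rand(1, 3, 224, 224)  # stand-in for a preprocessed image
y = torch.tensor([0])           # stand-in label
clean = activation_profile(x)
adv = activation_profile(fgsm(x, y))

# Layers whose summary statistic shifts the most are candidates for inspection.
shift = {k: abs(adv[k] - clean[k]) for k in clean}
for name, delta in sorted(shift.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{name}: activation shift {delta:.4f}")
```

Layers whose activation summary shifts the most under the attack are natural starting points for the kind of closer inspection of exploited areas that the paper advocates.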
Related papers
- Transpose Attack: Stealing Datasets with Bidirectional Training [4.166238443183223]
We show that adversaries can exfiltrate datasets from protected learning environments under the guise of legitimate models.
We propose a novel approach for detecting infected models.
arXiv Detail & Related papers (2023-11-13T15:14:50Z)
- A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples (an illustrative transfer-attack sketch, not taken from the survey, appears after this list).
arXiv Detail & Related papers (2023-10-26T17:45:26Z)
- Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z)
- Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain.
In this work, we build models that rely on the message-passing rule of predictive coding.
We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
- Interactive Analysis of CNN Robustness [11.136837582678869]
Perturber is a web-based application that allows users to explore how CNN activations and predictions evolve when a 3D input scene is interactively perturbed.
Perturber offers a large variety of scene modifications, such as camera controls, lighting and shading effects, background modifications, object morphing, and adversarial attacks.
Case studies with machine learning experts have shown that Perturber helps users to quickly generate hypotheses about model vulnerabilities and to qualitatively compare model behavior.
arXiv Detail & Related papers (2021-10-14T18:52:39Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Detecting weapons and aggressive behavior in live video can enable rapid response to, and prevention of, potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting firearms via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Detection Defense Against Adversarial Attacks with Saliency Map [7.736844355705379]
It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible to human vision.
Existing defenses tend to harden the robustness of models against adversarial attacks.
We propose a novel method that combines additional noise with an inconsistency strategy to detect adversarial examples (a hedged sketch of this general inconsistency idea appears after this list).
arXiv Detail & Related papers (2020-09-06T13:57:17Z)
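On the black-box setting mentioned in the transferability survey entry above: the following is a small, hypothetical sketch of a transfer attack, where adversarial examples are crafted against a locally available surrogate model and then simply fed to a different target model whose gradients the attacker never sees. The surrogate and target architectures, the FGSM attack, and the random stand-in data are assumptions chosen for illustration and are not taken from the survey.

```python
# Hedged sketch of a black-box transfer attack (illustration only).
import torch
import torch.nn.functional as F
import torchvision.models as models

surrogate = models.resnet18(weights=None).eval()  # attacker's local model
target = models.vgg11(weights=None).eval()        # victim model, gradients unavailable to the attacker

def fgsm_on_surrogate(x, y, eps=0.03):
    # Craft adversarial examples using only the surrogate's gradients.
    x = x.clone().requires_grad_(True)
    F.cross_entropy(surrogate(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

x = torch.rand(8, 3, 224, 224)    # stand-in batch of preprocessed images
y = torch.randint(0, 1000, (8,))  # stand-in labels
x_adv = fgsm_on_surrogate(x, y)

with torch.no_grad():
    clean_acc = (target(x).argmax(dim=1) == y).float().mean().item()
    adv_acc = (target(x_adv).argmax(dim=1) == y).float().mean().item()
print(f"target accuracy: clean {clean_acc:.2f}, transferred adversarial {adv_acc:.2f}")
```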
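For the saliency-map detection entry above, here is a hedged sketch of the general noise-inconsistency idea: an input whose prediction changes under small random perturbations is flagged as suspicious. It is not the paper's exact method; the model choice, noise scale, and agreement threshold are assumed values.

```python
# Hedged sketch of adversarial-example detection via prediction inconsistency
# under added random noise (illustration only, not the paper's exact method).
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # assumed model

def is_suspicious(x, n_trials=8, sigma=0.02, agree_threshold=0.75):
    """Flag an input as possibly adversarial when noisy copies of it disagree
    too often with the prediction on the original input."""
    with torch.no_grad():
        base = model(x).argmax(dim=1)
        agreements = 0
        for _ in range(n_trials):
            noisy = x + sigma * torch.randn_like(x)
            agreements += int((model(noisy).argmax(dim=1) == base).all())
    return agreements / n_trials < agree_threshold

x = torch.rand(1, 3, 224, 224)  # stand-in input
print("flagged as possibly adversarial:", is_suspicious(x))
```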
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.