Exploring the role of Input and Output Layers of a Deep Neural Network
in Adversarial Defense
- URL: http://arxiv.org/abs/2006.01408v1
- Date: Tue, 2 Jun 2020 06:15:46 GMT
- Title: Exploring the role of Input and Output Layers of a Deep Neural Network
in Adversarial Defense
- Authors: Jay N. Paranjape, Rahul Kumar Dubey, Vijendran V Gopalan
- Abstract summary: It has been shown that certain inputs exist which would not normally trick a human but can mislead the model completely.
Such adversarial inputs pose a high security threat when these models are used in real-world applications.
We have analyzed the resistance of three different classes of fully connected dense networks against rarely tested non-gradient-based adversarial attacks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks are learning models that have achieved
state-of-the-art performance in many fields, such as prediction, computer
vision, and language processing. However, it has been shown that certain
inputs exist which would not normally trick a human but can mislead the
model completely. These inputs are known as adversarial inputs, and they
pose a high security threat when such models are used in real-world
applications. In this work, we analyze the resistance of three different
classes of fully connected dense networks against rarely tested
non-gradient-based adversarial attacks. These classes are created by
manipulating the input and output layers. We show empirically that, owing
to certain characteristics of the networks, they provide high robustness
against these attacks and can be used in fine-tuning other models to
increase their defense against adversarial attacks.
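To make the setting concrete, here is a minimal sketch of evaluating a plain fully connected dense network against one simple non-gradient-based attack: a random search for a misclassifying perturbation inside a small L-infinity ball. PyTorch, the architecture, the attack budget, and the stand-in data below are illustrative assumptions, not the authors' three network classes or their experimental setup.

```python
# Minimal sketch (not the authors' code): a fully connected dense classifier
# probed with a non-gradient-based attack (random search in an eps-ball).
import torch
import torch.nn as nn

class DenseNet(nn.Module):
    """Plain fully connected classifier, e.g. for flattened 28x28 images."""
    def __init__(self, in_dim=784, hidden=256, n_classes=10):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.layers(x)

@torch.no_grad()
def random_search_attack(model, x, y, eps=0.1, n_trials=200):
    """Non-gradient-based attack: sample random perturbations within the
    eps-ball and return the first one that changes the prediction."""
    for _ in range(n_trials):
        delta = torch.empty_like(x).uniform_(-eps, eps)
        x_adv = torch.clamp(x + delta, 0.0, 1.0)   # assumes inputs in [0, 1]
        if model(x_adv).argmax(dim=-1) != y:
            return x_adv          # attack succeeded
    return None                   # model resisted all trials

# Usage: count how often the attack succeeds on a batch of inputs.
model = DenseNet().eval()
x_batch = torch.rand(32, 784)                      # stand-in for real data
y_batch = model(x_batch).argmax(dim=-1)            # model's own predictions
fooled = sum(random_search_attack(model, x[None], y) is not None
             for x, y in zip(x_batch, y_batch))
print(f"attack success rate: {fooled / len(x_batch):.2%}")
```

A lower success rate under such attacks is what the abstract refers to as higher robustness; the paper's comparison is across network classes obtained by manipulating the input and output layers.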
Related papers
- Adversarial Attacks and Dimensionality in Text Classifiers [3.4179091429029382]
Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases.
We study adversarial examples in the field of natural language processing, specifically text classification tasks.
arXiv Detail & Related papers (2024-04-03T11:49:43Z) - Understanding Deep Learning defenses Against Adversarial Examples
Through Visualizations for Dynamic Risk Assessment [0.0]
Adversarial training, dimensionality reduction and prediction similarity were selected as defenses against adversarial example attacks.
For each defense, the behavior of the original model is compared with the behavior of the defended model, representing the target model as a graph in a visualization.
arXiv Detail & Related papers (2024-02-12T09:05:01Z) - Investigating Human-Identifiable Features Hidden in Adversarial
Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z) - Can Adversarial Examples Be Parsed to Reveal Victim Model Information? [62.814751479749695]
In this work, we ask whether it is possible to infer data-agnostic victim model (VM) information from data-specific adversarial instances.
We collect a dataset of adversarial attacks across 7 attack types generated from 135 victim models.
We show that a simple, supervised model parsing network (MPN) is able to infer VM attributes from unseen adversarial attacks.
arXiv Detail & Related papers (2023-03-13T21:21:49Z) - Check Your Other Door! Establishing Backdoor Attacks in the Frequency
Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z) - Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost in clean accuracy (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-03-26T09:36:59Z) - BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by
Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z) - Adversarial Attacks on Deep Learning Based Power Allocation in a Massive
MIMO Network [62.77129284830945]
We show that adversarial attacks can break DL-based power allocation in the downlink of a massive multiple-input-multiple-output (maMIMO) network.
We benchmark the performance of these attacks and show that with a small perturbation in the input of the neural network (NN), the white-box attacks can result in infeasible solutions in up to 86% of cases.
arXiv Detail & Related papers (2021-01-28T16:18:19Z) - An Empirical Review of Adversarial Defenses [0.913755431537592]
Deep neural networks, which form the basis of such systems, are highly susceptible to a specific type of attack, called adversarial attacks.
A hacker can, even with bare minimum computation, generate adversarial examples (images or data points that belong to another class, but consistently fool the model to get misclassified as genuine) and crumble the basis of such algorithms.
We examine two effective techniques, namely Dropout and Denoising Autoencoders, and demonstrate their success in preventing such attacks from fooling the model.
arXiv Detail & Related papers (2020-12-10T09:34:41Z) - Bluff: Interactively Deciphering Adversarial Attacks on Deep Neural
Networks [21.074988013822566]
Bluff is an interactive system for visualizing, characterizing, and deciphering adversarial attacks on vision-based neural networks.
It reveals mechanisms that adversarial attacks employ to inflict harm on a model.
arXiv Detail & Related papers (2020-09-05T22:08:35Z) - Adversarial Feature Desensitization [12.401175943131268]
We propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field.
Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs.
arXiv Detail & Related papers (2020-06-08T14:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.