Exploring Adversarial Attacks on Neural Networks: An Explainable
Approach
- URL: http://arxiv.org/abs/2303.06032v1
- Date: Wed, 8 Mar 2023 07:59:44 GMT
- Title: Exploring Adversarial Attacks on Neural Networks: An Explainable
Approach
- Authors: Justus Renkhoff, Wenkai Tan, Alvaro Velasquez, William Yichen Wang,
Yongxin Liu, Jian Wang, Shuteng Niu, Lejla Begic Fazlic, Guido Dartmann,
Houbing Song
- Abstract summary: We analyze the response characteristics of the VGG-16 model when the input images are mixed with adversarial noise and statistically similar Gaussian random noise.
Our work could provide valuable insights into developing more reliable Deep Neural Network (DNN) models.
- Score: 18.063187159491182
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Deep Learning (DL) is being applied in various domains, especially in
safety-critical applications such as autonomous driving. Consequently, it is of
great significance to ensure the robustness of these methods and thus
counteract uncertain behaviors caused by adversarial attacks. In this paper, we
use gradient heatmaps to analyze the response characteristics of the VGG-16
model when the input images are mixed with adversarial noise and statistically
similar Gaussian random noise. In particular, we compare the network response
layer by layer to determine where errors occurred. Several interesting findings
are derived. First, compared to Gaussian random noise, intentionally generated
adversarial noise causes severe behavior deviation by distracting the area of
concentration in the networks. Second, in many cases, adversarial examples only
need to compromise a few intermediate blocks to mislead the final decision.
Third, our experiments revealed that specific blocks are more vulnerable and
more easily exploited by adversarial examples. Finally, we demonstrate that the
layers $Block4\_conv1$ and $Block5\_conv1$ of the VGG-16 model are more
susceptible to adversarial attacks. Our work could provide valuable insights
into developing more reliable Deep Neural Network (DNN) models.
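Below is a minimal sketch (not the authors' implementation) of the kind of layer-wise comparison the abstract describes. It assumes the standard Keras VGG16 layer names (block4_conv1, block5_conv1), uses FGSM as a stand-in adversarial attack, and compares Grad-CAM-style gradient heatmaps for an input mixed with adversarial noise against the same input mixed with statistically matched Gaussian noise.

```python
# Sketch only: layer-wise gradient-heatmap comparison on VGG-16, assuming
# Keras VGG16 layer naming and FGSM as a stand-in adversarial attack.
import numpy as np
import tensorflow as tf

model = tf.keras.applications.VGG16(weights="imagenet")

def fgsm_noise(image, label, eps=0.01):
    """Adversarial noise via the fast gradient sign method (one step)."""
    x = tf.convert_to_tensor(image[None], dtype=tf.float32)
    y = tf.one_hot([label], 1000)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.categorical_crossentropy(y, model(x))
    return eps * tf.sign(tape.gradient(loss, x))[0]

def gradient_heatmap(image, layer_name):
    """Grad-CAM-style heatmap of the top predicted class at a given layer."""
    layer_out = model.get_layer(layer_name).output
    sub_model = tf.keras.Model(model.input, [layer_out, model.output])
    x = tf.convert_to_tensor(image[None], dtype=tf.float32)
    with tf.GradientTape() as tape:
        feats, preds = sub_model(x)
        top_score = tf.reduce_max(preds, axis=1)   # probability of the top class
    grads = tape.gradient(top_score, feats)
    weights = tf.reduce_mean(grads, axis=(1, 2))   # per-channel importance
    cam = tf.nn.relu(tf.einsum("bhwc,bc->bhw", feats, weights))[0]
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

# Usage (image: a 224x224x3 array already run through
# tf.keras.applications.vgg16.preprocess_input; label: its ImageNet class index):
#   adv_noise = fgsm_noise(image, label).numpy()
#   gauss = np.random.normal(adv_noise.mean(), adv_noise.std(), adv_noise.shape)
#   for name in ["block4_conv1", "block5_conv1"]:
#       shift = np.abs(gradient_heatmap(image + adv_noise, name)
#                      - gradient_heatmap(image + gauss, name)).mean()
#       print(name, shift)  # a larger shift = attention distracted more by the attack
```

A larger heatmap shift at a given block is one way to quantify the "distraction of the area of concentration" that the abstract refers to.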
Related papers
- Detecting Adversarial Attacks in Semantic Segmentation via Uncertainty Estimation: A Deep Analysis [12.133306321357999]
We propose an uncertainty-based method for detecting adversarial attacks on neural networks for semantic segmentation.
We conduct a detailed analysis of uncertainty-based detection of adversarial attacks and various state-of-the-art neural networks.
Our numerical experiments show the effectiveness of the proposed uncertainty-based detection method.
arXiv Detail & Related papers (2024-08-19T14:13:30Z)
- Wasserstein distributional robustness of neural networks [9.79503506460041]
Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
arXiv Detail & Related papers (2023-06-16T13:41:24Z)
- Spatial-Frequency Discriminability for Revealing Adversarial Perturbations [53.279716307171604]
Vulnerability of deep neural networks to adversarial perturbations has been widely perceived in the computer vision community.
Current algorithms typically detect adversarial patterns through discriminative decomposition of natural and adversarial data.
We propose a discriminative detector relying on a spatial-frequency Krawtchouk decomposition.
arXiv Detail & Related papers (2023-05-18T10:18:59Z)
- NoiseCAM: Explainable AI for the Boundary Between Noise and Adversarial Attacks [21.86821880164293]
Adversarial attacks can easily mislead a neural network and lead to wrong decisions.
In this paper, we use the gradient class activation map (GradCAM) to analyze the behavior deviation of the VGG-16 network.
We also propose a novel NoiseCAM algorithm that integrates information from globally and pixel-level weighted class activation maps.
arXiv Detail & Related papers (2023-03-09T22:07:41Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, is proven to be the most effective strategy.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER).
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- A Frequency Perspective of Adversarial Robustness [72.48178241090149]
We present a frequency-based understanding of adversarial examples, supported by theoretical and empirical findings.
Our analysis shows that adversarial examples are neither in high-frequency nor in low-frequency components, but are simply dataset dependent.
We propose a frequency-based explanation for the commonly observed accuracy vs. robustness trade-off (a toy frequency-split sketch appears after this list).
arXiv Detail & Related papers (2021-10-26T19:12:34Z)
- Pruning in the Face of Adversaries [0.0]
We evaluate the impact of neural network pruning on the adversarial robustness against L-0, L-2 and L-infinity attacks.
Our results confirm that neural network pruning and adversarial robustness are not mutually exclusive.
We extend our analysis to situations that incorporate additional assumptions on the adversarial scenario and show that depending on the situation, different strategies are optimal.
arXiv Detail & Related papers (2021-08-19T09:06:16Z)
- On Procedural Adversarial Noise Attack And Defense [2.5388455804357952]
Adversarial examples can inveigle neural networks into making prediction errors with small perturbations on the input images.
In this paper, we propose two universal adversarial perturbation (UAP) generation methods based on procedural noise functions.
Without changing the semantic representations, the adversarial examples generated via our methods show superior attack performance.
arXiv Detail & Related papers (2021-08-10T02:47:01Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the robustness of the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z)
- How benign is benign overfitting? [96.07549886487526]
We investigate two causes for adversarial vulnerability in deep neural networks: bad data and (poorly) trained models.
Deep neural networks essentially achieve zero training error, even in the presence of label noise.
We identify label noise as one of the causes for adversarial vulnerability.
arXiv Detail & Related papers (2020-07-08T11:07:10Z)
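As a companion to the "A Frequency Perspective of Adversarial Robustness" entry above, here is a toy frequency-split sketch; it is not taken from that paper, and the radial cutoff (radius_frac) and the synthetic noise are illustrative assumptions.

```python
# Toy sketch: split a 2-D perturbation into low/high-frequency components with a
# radial FFT mask and compare the energy in each band.
import numpy as np

def frequency_split(perturbation, radius_frac=0.25):
    """Return (low_freq, high_freq) components of a 2-D perturbation."""
    h, w = perturbation.shape
    spectrum = np.fft.fftshift(np.fft.fft2(perturbation))
    yy, xx = np.ogrid[:h, :w]
    low_mask = np.hypot(yy - h / 2, xx - w / 2) <= radius_frac * min(h, w)
    low = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)))
    high = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)))
    return low, high

# Synthetic Gaussian noise stands in for an adversarial perturbation here.
delta = np.random.normal(0.0, 0.01, (224, 224))
low, high = frequency_split(delta)
print("low-band energy:", (low**2).sum(), "high-band energy:", (high**2).sum())
```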
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.