Visual Analytics of Neuron Vulnerability to Adversarial Attacks on
Convolutional Neural Networks
- URL: http://arxiv.org/abs/2303.02814v1
- Date: Mon, 6 Mar 2023 01:01:56 GMT
- Title: Visual Analytics of Neuron Vulnerability to Adversarial Attacks on
Convolutional Neural Networks
- Authors: Yiran Li, Junpeng Wang, Takanori Fujiwara, Kwan-Liu Ma
- Abstract summary: Adversarial attacks on a convolutional neural network (CNN) could fool a high-performance CNN into making incorrect predictions.
Our work introduces a visual analytics approach to understanding adversarial attacks.
A visual analytics system is designed to incorporate visual reasoning for interpreting adversarial attacks.
- Score: 28.081328051535618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks on a convolutional neural network (CNN) -- injecting
human-imperceptible perturbations into an input image -- could fool a
high-performance CNN into making incorrect predictions. The success of
adversarial attacks raises serious concerns about the robustness of CNNs, and
prevents them from being used in safety-critical applications, such as medical
diagnosis and autonomous driving. Our work introduces a visual analytics
approach to understanding adversarial attacks by answering two questions: (1)
which neurons are more vulnerable to attacks and (2) which image features do
these vulnerable neurons capture during the prediction? For the first question,
we introduce multiple perturbation-based measures to break down the attacking
magnitude into individual CNN neurons and rank the neurons by their
vulnerability levels. For the second, we identify image features (e.g., cat
ears) that highly stimulate a user-selected neuron to augment and validate the
neuron's responsibility. Furthermore, we support interactive exploration of a
large number of neurons, aided by hierarchical clustering based on the neurons'
roles in the prediction. To this end, a visual analytics system is
designed to incorporate visual reasoning for interpreting adversarial attacks.
We validate the effectiveness of our system through multiple case studies as
well as feedback from domain experts.
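The abstract above does not give its perturbation-based measures in closed form. Purely as an illustration of the idea of ranking neurons by how strongly an attack perturbs their activations, the sketch below (assumptions: a PyTorch ResNet-18, the layer3 feature maps treated as "neurons" at channel granularity, and a precomputed adversarial image) compares clean and adversarial activations per channel:
```python
# Illustrative sketch only: rank the channels ("neurons") of one conv layer by
# how much an adversarial perturbation changes their activations. This is an
# assumed measure, not the paper's exact perturbation-based measures.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # in practice, a trained CNN

activations = {}

def save_activation(module, inputs, output):
    activations["layer3"] = output.detach()

model.layer3.register_forward_hook(save_activation)

@torch.no_grad()
def channel_vulnerability(x_clean, x_adv):
    """Mean absolute activation change per channel between clean and adversarial inputs."""
    model(x_clean)
    a_clean = activations["layer3"]
    model(x_adv)
    a_adv = activations["layer3"]
    # Average |delta activation| over batch and spatial dims -> one score per channel.
    return (a_adv - a_clean).abs().mean(dim=(0, 2, 3))

# Random tensors stand in for a clean image and its adversarial counterpart.
x_clean = torch.rand(1, 3, 224, 224)
x_adv = (x_clean + 0.03 * torch.randn_like(x_clean)).clamp(0, 1)
scores = channel_vulnerability(x_clean, x_adv)
print(torch.topk(scores, k=10).indices.tolist())  # most "vulnerable" channels
```
Channels with the largest activation change would be candidates for the "vulnerable neurons" the paper visualizes; the paper's actual measures, layers, and aggregation may differ.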
Related papers
- Interpreting the Second-Order Effects of Neurons in CLIP [73.54377859089801]
We interpret the function of individual neurons in CLIP by automatically describing them using text.
We present the "second-order lens", analyzing the effect flowing from a neuron through the later attention heads, directly to the output.
Our results indicate that a scalable understanding of neurons can be used for model deception and for introducing new model capabilities.
arXiv Detail & Related papers (2024-06-06T17:59:52Z)
- Identifying Interpretable Visual Features in Artificial and Biological Neural Systems [3.604033202771937]
Single neurons in neural networks are often interpretable in that they represent individual, intuitively meaningful features.
Many neurons exhibit mixed selectivity, i.e., they represent multiple unrelated features.
We propose an automated method for quantifying visual interpretability and an approach for finding meaningful directions in network activation space.
arXiv Detail & Related papers (2023-10-17T17:41:28Z)
- Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z)
- Searching for the Essence of Adversarial Perturbations [73.96215665913797]
We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction.
This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
arXiv Detail & Related papers (2022-05-30T18:04:57Z)
- Improving Adversarial Transferability via Neuron Attribution-Based Attacks [35.02147088207232]
We propose the Neuron-based Attack (NAA), which conducts feature-level attacks with more accurate neuron importance estimations.
We derive an approximation scheme for neuron attribution that greatly reduces the computational overhead.
Experiments confirm the superiority of our approach to the state-of-the-art benchmarks.
arXiv Detail & Related papers (2022-03-31T13:47:30Z)
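The NAA summary above mentions feature-level attacks guided by neuron importance, but not the exact estimator. A minimal, assumed sketch of the general activation-times-gradient style of neuron attribution (not NAA's approximation scheme; the layer and target-class choices here are arbitrary) could look like this:
```python
# Assumed illustration of channel-level neuron importance as activation x gradient,
# in the spirit of attribution-guided attacks; NOT the NAA paper's exact scheme.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # in practice, a trained CNN

feats = {}

def keep_features(module, inputs, output):
    output.retain_grad()        # keep the gradient on this intermediate feature map
    feats["layer3"] = output

model.layer3.register_forward_hook(keep_features)

def neuron_importance(x, target_class):
    logits = model(x)
    logits[0, target_class].backward()
    a = feats["layer3"]         # activations, shape (1, C, H, W)
    g = a.grad                  # gradient of the target logit w.r.t. those activations
    # Per-channel importance: activation * gradient summed over spatial positions.
    return (a * g).sum(dim=(0, 2, 3)).detach()

x = torch.rand(1, 3, 224, 224)
scores = neuron_importance(x, target_class=0)
print(torch.topk(scores, k=5).indices.tolist())
```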
- Adversarial Robustness in Deep Learning: Attacks on Fragile Neurons [0.6899744489931016]
We identify fragile and robust neurons of deep learning architectures using nodal dropouts of the first convolutional layer.
We correlate these neurons with the distribution of adversarial attacks on the network.
arXiv Detail & Related papers (2022-01-31T14:34:07Z)
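The entry above describes nodal dropouts of the first convolutional layer. A minimal sketch of that ablation idea, under the assumption that "dropping a node" means zeroing one output channel of conv1 and recording the accuracy drop on a validation batch, might be:
```python
# Assumed sketch of "nodal dropout" on the first conv layer: zero one output
# channel at a time and measure how much the prediction accuracy drops.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # in practice, a trained CNN

def accuracy(logits, labels):
    return (logits.argmax(dim=1) == labels).float().mean().item()

@torch.no_grad()
def channel_ablation_scores(images, labels):
    base = accuracy(model(images), labels)
    weight_backup = model.conv1.weight.clone()
    drops = []
    for c in range(model.conv1.out_channels):
        model.conv1.weight[c] = 0.0              # "drop" channel c of the first conv layer
        drops.append(base - accuracy(model(images), labels))
        model.conv1.weight.copy_(weight_backup)  # restore the original weights
    return drops  # larger accuracy drop -> more "fragile" under this criterion

images = torch.rand(8, 3, 224, 224)              # stand-ins for a validation batch
labels = torch.randint(0, 1000, (8,))
drops = channel_ablation_scores(images, labels)
print(sorted(range(len(drops)), key=lambda c: drops[c], reverse=True)[:5])
```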
- Adversarial Attacks on Spiking Convolutional Networks for Event-based Vision [0.6999740786886537]
We show how white-box adversarial attack algorithms can be adapted to the discrete and sparse nature of event-based visual data.
We also verify, for the first time, the effectiveness of these perturbations directly on neuromorphic hardware.
arXiv Detail & Related papers (2021-10-06T17:20:05Z)
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
Under a minimal computational overhead, a dilation architecture is expected to be friendly with the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z)