Visual Analytics of Neuron Vulnerability to Adversarial Attacks on
Convolutional Neural Networks
- URL: http://arxiv.org/abs/2303.02814v1
- Date: Mon, 6 Mar 2023 01:01:56 GMT
- Title: Visual Analytics of Neuron Vulnerability to Adversarial Attacks on
Convolutional Neural Networks
- Authors: Yiran Li, Junpeng Wang, Takanori Fujiwara, Kwan-Liu Ma
- Abstract summary: Adversarial attacks on a convolutional neural network (CNN) could fool a high-performance CNN into making incorrect predictions.
Our work introduces a visual analytics approach to understanding adversarial attacks.
A visual analytics system is designed to incorporate visual reasoning for interpreting adversarial attacks.
- Score: 28.081328051535618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks on a convolutional neural network (CNN) -- injecting
human-imperceptible perturbations into an input image -- could fool a
high-performance CNN into making incorrect predictions. The success of
adversarial attacks raises serious concerns about the robustness of CNNs, and
prevents them from being used in safety-critical applications, such as medical
diagnosis and autonomous driving. Our work introduces a visual analytics
approach to understanding adversarial attacks by answering two questions: (1)
which neurons are more vulnerable to attacks and (2) which image features do
these vulnerable neurons capture during the prediction? For the first question,
we introduce multiple perturbation-based measures to break down the attacking
magnitude into individual CNN neurons and rank the neurons by their
vulnerability levels. For the second, we identify image features (e.g., cat
ears) that highly stimulate a user-selected neuron to augment and validate the
neuron's responsibility. Furthermore, we support interactive exploration of a
large number of neurons, aided by hierarchical clustering based on the neurons'
roles in the prediction. To this end, a visual analytics system is
designed to incorporate visual reasoning for interpreting adversarial attacks.
We validate the effectiveness of our system through multiple case studies as
well as feedback from domain experts.
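The abstract above does not give its perturbation-based measures in closed form. Purely as an illustration of the idea of ranking neurons by how strongly an attack perturbs their activations, the sketch below (assumptions: a PyTorch ResNet-18, the layer3 feature maps treated as "neurons" at channel granularity, and a precomputed adversarial image) compares clean and adversarial activations per channel:
```python
# Illustrative sketch only: rank the channels ("neurons") of one conv layer by
# how much an adversarial perturbation changes their activations. This is an
# assumed measure, not the paper's exact perturbation-based measures.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # in practice, a trained CNN

activations = {}

def save_activation(module, inputs, output):
    activations["layer3"] = output.detach()

model.layer3.register_forward_hook(save_activation)

@torch.no_grad()
def channel_vulnerability(x_clean, x_adv):
    """Mean absolute activation change per channel between clean and adversarial inputs."""
    model(x_clean)
    a_clean = activations["layer3"]
    model(x_adv)
    a_adv = activations["layer3"]
    # Average |delta activation| over batch and spatial dims -> one score per channel.
    return (a_adv - a_clean).abs().mean(dim=(0, 2, 3))

# Random tensors stand in for a clean image and its adversarial counterpart.
x_clean = torch.rand(1, 3, 224, 224)
x_adv = (x_clean + 0.03 * torch.randn_like(x_clean)).clamp(0, 1)
scores = channel_vulnerability(x_clean, x_adv)
print(torch.topk(scores, k=10).indices.tolist())  # most "vulnerable" channels
```
Channels with the largest activation change would be candidates for the "vulnerable neurons" the paper visualizes; the paper's actual measures, layers, and aggregation may differ.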
Related papers
- Interpreting the Second-Order Effects of Neurons in CLIP [73.54377859089801]
We interpret the function of individual neurons in CLIP by automatically describing them using text.
We present the "second-order lens", analyzing the effect flowing from a neuron through the later attention heads, directly to the output.
Our results indicate that a scalable understanding of neurons can be used for model deception and for introducing new model capabilities.
arXiv Detail & Related papers (2024-06-06T17:59:52Z)
- Identifying Interpretable Visual Features in Artificial and Biological Neural Systems [3.604033202771937]
Single neurons in neural networks are often interpretable in that they represent individual, intuitively meaningful features.
Many neurons exhibit mixed selectivity, i.e., they represent multiple unrelated features.
We propose an automated method for quantifying visual interpretability and an approach for finding meaningful directions in network activation space.
arXiv Detail & Related papers (2023-10-17T17:41:28Z)
- Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z)
- Searching for the Essence of Adversarial Perturbations [73.96215665913797]
We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction.
This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
arXiv Detail & Related papers (2022-05-30T18:04:57Z)
- Improving Adversarial Transferability via Neuron Attribution-Based Attacks [35.02147088207232]
We propose the Neuron-based Attack (NAA), which conducts feature-level attacks with more accurate neuron importance estimations.
We derive an approximation scheme for neuron attribution that greatly reduces the computational overhead.
Experiments confirm the superiority of our approach to the state-of-the-art benchmarks.
arXiv Detail & Related papers (2022-03-31T13:47:30Z)
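The NAA summary above mentions feature-level attacks guided by neuron importance, but not the exact estimator. A minimal, assumed sketch of the general activation-times-gradient style of neuron attribution (not NAA's approximation scheme; the layer and target-class choices here are arbitrary) could look like this:
```python
# Assumed illustration of channel-level neuron importance as activation x gradient,
# in the spirit of attribution-guided attacks; NOT the NAA paper's exact scheme.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # in practice, a trained CNN

feats = {}

def keep_features(module, inputs, output):
    output.retain_grad()        # keep the gradient on this intermediate feature map
    feats["layer3"] = output

model.layer3.register_forward_hook(keep_features)

def neuron_importance(x, target_class):
    logits = model(x)
    logits[0, target_class].backward()
    a = feats["layer3"]         # activations, shape (1, C, H, W)
    g = a.grad                  # gradient of the target logit w.r.t. those activations
    # Per-channel importance: activation * gradient summed over spatial positions.
    return (a * g).sum(dim=(0, 2, 3)).detach()

x = torch.rand(1, 3, 224, 224)
scores = neuron_importance(x, target_class=0)
print(torch.topk(scores, k=5).indices.tolist())
```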
- Adversarial Robustness in Deep Learning: Attacks on Fragile Neurons [0.6899744489931016]
We identify fragile and robust neurons of deep learning architectures using nodal dropouts of the first convolutional layer.
We correlate these neurons with the distribution of adversarial attacks on the network.
arXiv Detail & Related papers (2022-01-31T14:34:07Z)
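The entry above describes nodal dropouts of the first convolutional layer. A minimal sketch of that ablation idea, under the assumption that "dropping a node" means zeroing one output channel of conv1 and recording the accuracy drop on a validation batch, might be:
```python
# Assumed sketch of "nodal dropout" on the first conv layer: zero one output
# channel at a time and measure how much the prediction accuracy drops.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # in practice, a trained CNN

def accuracy(logits, labels):
    return (logits.argmax(dim=1) == labels).float().mean().item()

@torch.no_grad()
def channel_ablation_scores(images, labels):
    base = accuracy(model(images), labels)
    weight_backup = model.conv1.weight.clone()
    drops = []
    for c in range(model.conv1.out_channels):
        model.conv1.weight[c] = 0.0              # "drop" channel c of the first conv layer
        drops.append(base - accuracy(model(images), labels))
        model.conv1.weight.copy_(weight_backup)  # restore the original weights
    return drops  # larger accuracy drop -> more "fragile" under this criterion

images = torch.rand(8, 3, 224, 224)              # stand-ins for a validation batch
labels = torch.randint(0, 1000, (8,))
drops = channel_ablation_scores(images, labels)
print(sorted(range(len(drops)), key=lambda c: drops[c], reverse=True)[:5])
```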
- Adversarial Attacks on Spiking Convolutional Networks for Event-based Vision [0.6999740786886537]
We show how white-box adversarial attack algorithms can be adapted to the discrete and sparse nature of event-based visual data.
We also verify, for the first time, the effectiveness of these perturbations directly on neuromorphic hardware.
arXiv Detail & Related papers (2021-10-06T17:20:05Z)
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
Under a minimal computational overhead, a dilation architecture is expected to be friendly with the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z)