Exploring Adversarial Examples and Adversarial Robustness of
Convolutional Neural Networks by Mutual Information
- URL: http://arxiv.org/abs/2207.05756v1
- Date: Tue, 12 Jul 2022 13:25:42 GMT
- Title: Exploring Adversarial Examples and Adversarial Robustness of
Convolutional Neural Networks by Mutual Information
- Authors: Jiebao Zhang, Wenhua Qian, Rencan Nie, Jinde Cao, Dan Xu
- Abstract summary: This work investigates similarities and differences between two types of convolutional neural networks (CNNs) in information extraction.
The reason why adversarial examples mislead CNNs may be that they contain more texture-based information about other categories.
Normally trained CNNs tend to extract texture-based information from the inputs, while adversarially trained models prefer to extract shape-based information.
- Score: 44.841339443764696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A counter-intuitive property of convolutional neural networks (CNNs) is their
inherent susceptibility to adversarial examples, which severely hinders the
application of CNNs in security-critical fields. Adversarial examples are
similar to original examples but contain malicious perturbations. Adversarial
training is a simple and effective training method to improve the robustness of
CNNs to adversarial examples. The mechanisms behind adversarial examples and
adversarial training are worth exploring. Therefore, this work investigates
similarities and differences between two types of CNNs (normal and robust ones)
in information extraction by observing trends in the mutual information. We show
that 1) the amount of mutual information that CNNs extract from original and
adversarial examples is nearly the same, whether the CNNs are normally or
adversarially trained; the reason why adversarial examples
mislead CNNs may be that they contain more texture-based information about
other categories; 2) compared with normal training, adversarial training is more
difficult, and robust CNNs extract less information; 3) CNNs trained with
different methods have different preferences for
certain types of information; normally trained CNNs tend to extract
texture-based information from the inputs, while adversarially trained models
prefer to extract shape-based information. Furthermore, we also analyze the mutual
information estimators used in this work, kernel-density-estimation and binning
methods, and find that these estimators outline the geometric properties of the
middle layer's output to a certain extent.
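The abstract contrasts normal and adversarial training, but its exact attack and training settings are not given here. The sketch below is a minimal illustration, assuming PyTorch and the standard FGSM perturbation (x_adv = x + eps * sign(grad_x loss)) followed by a single adversarial-training step; the model, epsilon, and data are placeholders, not the paper's configuration.

```python
# Minimal sketch of FGSM adversarial examples and one adversarial-training step.
# Assumptions: PyTorch, a toy CNN classifier, and the standard FGSM update
# x_adv = x + eps * sign(grad_x loss); the paper's exact attack/settings may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=8 / 255):
    """Craft adversarial examples with the fast gradient sign method."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Move each pixel in the direction that increases the loss, then clip to [0, 1].
    x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255):
    """One adversarial-training step: train on adversarial examples instead of clean ones."""
    model.train()
    x_adv = fgsm_example(model, x, y, eps)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Toy CNN and random data, only to show the call pattern.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
    )
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    x = torch.rand(8, 3, 32, 32)
    y = torch.randint(0, 10, (8,))
    print("adv-training loss:", adversarial_training_step(model, optimizer, x, y))
```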
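The paper analyzes kernel-density-estimation and binning mutual-information estimators, but does not spell them out in this listing. Below is a minimal binning-style sketch, assuming NumPy, a fixed bin count, and toy data, of estimating I(T; Y) between a hidden layer's activations T and the labels Y; the estimators and settings actually used in the paper may differ.

```python
# Minimal binning-style sketch of estimating I(T; Y) between a hidden layer's
# activations T and the labels Y via I(T; Y) = H(T) + H(Y) - H(T, Y).
# Assumptions: NumPy, a fixed number of bins, toy data; the paper's KDE/binning
# estimators and their settings may differ.
import numpy as np

def binning_mutual_information(t, y, n_bins=10):
    """Discretize activations into bins, then compute I(T; Y) in bits."""
    # Map each activation vector to a discrete "bin signature".
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    digitized = np.digitize(t, edges[1:-1])          # shape: (n_samples, n_units)
    t_ids = np.array([hash(row.tobytes()) for row in digitized])

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    joint = np.array([hash((a, b)) for a, b in zip(t_ids, y)])
    return entropy(t_ids) + entropy(y) - entropy(joint)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 10, size=1000)                 # toy labels
    t = rng.normal(size=(1000, 8)) + y[:, None] * 0.5  # toy "layer activations"
    print("I(T; Y) ≈", binning_mutual_information(t, y), "bits")
```

The same estimator can be applied layer by layer during normal or adversarial training to trace how much information about the labels (or inputs) the intermediate representations retain.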
Related papers
- Unfolding Local Growth Rate Estimates for (Almost) Perfect Adversarial
Detection [22.99930028876662]
Convolutional neural networks (CNNs) define the state of the art on many perceptual tasks.
Current CNN approaches largely remain vulnerable to adversarial perturbations of the input that have been crafted specifically to fool the system.
We propose a simple and light-weight detector, which leverages recent findings on the relation between networks' local intrinsic dimensionality (LID) and adversarial attacks.
arXiv Detail & Related papers (2022-12-13T17:51:32Z)
- A novel feature-scrambling approach reveals the capacity of
convolutional neural networks to learn spatial relations [0.0]
Convolutional neural networks (CNNs) are one of the most successful computer vision systems to solve object recognition.
Yet it remains poorly understood how CNNs actually make their decisions, what the nature of their internal representations is, and how their recognition strategies differ from humans.
arXiv Detail & Related papers (2022-12-12T16:40:29Z)
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
With minimal computational overhead, the dilated architecture is expected to preserve the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by
Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- Informative Dropout for Robust Representation Learning: A Shape-bias
Perspective [84.30946377024297]
We propose a light-weight model-agnostic method, namely Informative Dropout (InfoDrop), to improve interpretability and reduce texture bias.
Specifically, we discriminate texture from shape based on local self-information in an image, and adopt a Dropout-like algorithm to decorrelate the model output from the local texture.
arXiv Detail & Related papers (2020-08-10T16:52:24Z)
- The shape and simplicity biases of adversarially robust ImageNet-trained
CNNs [9.707679445925516]
We study the shape bias and internal mechanisms that enable the generalizability of AlexNet, GoogLeNet, and ResNet-50 models trained via adversarial training.
Remarkably, adversarial training induces three simplicity biases into hidden neurons in the process of "robustifying" CNNs.
arXiv Detail & Related papers (2020-06-16T16:38:16Z)
- An Information-theoretic Visual Analysis Framework for Convolutional
Neural Networks [11.15523311079383]
We introduce a data model to organize the data that can be extracted from CNN models.
We then propose two ways to calculate entropy under different circumstances.
We develop a visual analysis system, CNNSlicer, to interactively explore how the amount of information changes inside the model.
arXiv Detail & Related papers (2020-05-02T21:36:50Z)
- Transferable Perturbations of Deep Feature Distributions [102.94094966908916]
This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions.
We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models.
arXiv Detail & Related papers (2020-04-27T00:32:25Z)
- Hold me tight! Influence of discriminative features on deep network
boundaries [63.627760598441796]
We propose a new perspective that relates dataset features to the distance of samples to the decision boundary.
This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets.
arXiv Detail & Related papers (2020-02-15T09:29:36Z)