Exploring Adversarial Examples and Adversarial Robustness of
Convolutional Neural Networks by Mutual Information
- URL: http://arxiv.org/abs/2207.05756v1
- Date: Tue, 12 Jul 2022 13:25:42 GMT
- Title: Exploring Adversarial Examples and Adversarial Robustness of
Convolutional Neural Networks by Mutual Information
- Authors: Jiebao Zhang, Wenhua Qian, Rencan Nie, Jinde Cao, Dan Xu
- Abstract summary: This work investigates similarities and differences between two types of convolutional neural networks (CNNs) in information extraction.
The reason why adversarial examples mislead CNNs may be that they contain more texture-based information about other categories.
Normally trained CNNs tend to extract texture-based information from the inputs, while adversarially trained models prefer to extract shape-based information.
- Score: 44.841339443764696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A counter-intuitive property of convolutional neural networks (CNNs) is their
inherent susceptibility to adversarial examples, which severely hinders the
application of CNNs in security-critical fields. Adversarial examples are
similar to original examples but contain malicious perturbations. Adversarial
training is a simple and effective training method to improve the robustness of
CNNs to adversarial examples. The mechanisms behind adversarial examples and
adversarial training are worth exploring. Therefore, this work investigates
similarities and differences between two types of CNNs (normal and robust ones)
in information extraction by observing trends in the mutual information. We show
that 1) the amount of mutual information that CNNs extract from original and
adversarial examples is nearly the same, whether the CNNs are normally or
adversarially trained; the reason why adversarial examples
mislead CNNs may be that they contain more texture-based information about
other categories; 2) compared with normal training, adversarial training is more
difficult, and robust CNNs extract less information; 3) CNNs trained with
different methods have different preferences for
certain types of information; normally trained CNNs tend to extract
texture-based information from the inputs, while adversarially trained models
prefer to extract shape-based information. Furthermore, we also analyze the mutual
information estimators used in this work, kernel-density-estimation and binning
methods, and find that these estimators outline the geometric properties of the
middle layer's output to a certain extent.
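The abstract contrasts normal and adversarial training, but its exact attack and training settings are not given here. The sketch below is a minimal illustration, assuming PyTorch and the standard FGSM perturbation (x_adv = x + eps * sign(grad_x loss)) followed by a single adversarial-training step; the model, epsilon, and data are placeholders, not the paper's configuration.

```python
# Minimal sketch of FGSM adversarial examples and one adversarial-training step.
# Assumptions: PyTorch, a toy CNN classifier, and the standard FGSM update
# x_adv = x + eps * sign(grad_x loss); the paper's exact attack/settings may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=8 / 255):
    """Craft adversarial examples with the fast gradient sign method."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Move each pixel in the direction that increases the loss, then clip to [0, 1].
    x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255):
    """One adversarial-training step: train on adversarial examples instead of clean ones."""
    model.train()
    x_adv = fgsm_example(model, x, y, eps)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Toy CNN and random data, only to show the call pattern.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
    )
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    x = torch.rand(8, 3, 32, 32)
    y = torch.randint(0, 10, (8,))
    print("adv-training loss:", adversarial_training_step(model, optimizer, x, y))
```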
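The paper analyzes kernel-density-estimation and binning mutual-information estimators, but does not spell them out in this listing. Below is a minimal binning-style sketch, assuming NumPy, a fixed bin count, and toy data, of estimating I(T; Y) between a hidden layer's activations T and the labels Y; the estimators and settings actually used in the paper may differ.

```python
# Minimal binning-style sketch of estimating I(T; Y) between a hidden layer's
# activations T and the labels Y via I(T; Y) = H(T) + H(Y) - H(T, Y).
# Assumptions: NumPy, a fixed number of bins, toy data; the paper's KDE/binning
# estimators and their settings may differ.
import numpy as np

def binning_mutual_information(t, y, n_bins=10):
    """Discretize activations into bins, then compute I(T; Y) in bits."""
    # Map each activation vector to a discrete "bin signature".
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    digitized = np.digitize(t, edges[1:-1])          # shape: (n_samples, n_units)
    t_ids = np.array([hash(row.tobytes()) for row in digitized])

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    joint = np.array([hash((a, b)) for a, b in zip(t_ids, y)])
    return entropy(t_ids) + entropy(y) - entropy(joint)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 10, size=1000)                 # toy labels
    t = rng.normal(size=(1000, 8)) + y[:, None] * 0.5  # toy "layer activations"
    print("I(T; Y) ≈", binning_mutual_information(t, y), "bits")
```

The same estimator can be applied layer by layer during normal or adversarial training to trace how much information about the labels (or inputs) the intermediate representations retain.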
Related papers
- Unfolding Local Growth Rate Estimates for (Almost) Perfect Adversarial
Detection [22.99930028876662]
Convolutional neural networks (CNNs) define the state of the art on many perceptual tasks.
Current CNN approaches largely remain vulnerable to adversarial perturbations of the input that have been crafted specifically to fool the system.
We propose a simple and light-weight detector, which leverages recent findings on the relation between networks' local intrinsic dimensionality (LID) and adversarial attacks.
arXiv Detail & Related papers (2022-12-13T17:51:32Z)
- A novel feature-scrambling approach reveals the capacity of
convolutional neural networks to learn spatial relations [0.0]
Convolutional neural networks (CNNs) are one of the most successful computer vision systems to solve object recognition.
Yet it remains poorly understood how CNNs actually make their decisions, what the nature of their internal representations is, and how their recognition strategies differ from humans.
arXiv Detail & Related papers (2022-12-12T16:40:29Z)
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
With minimal computational overhead, the dilated architecture is expected to preserve the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by
Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- Informative Dropout for Robust Representation Learning: A Shape-bias
Perspective [84.30946377024297]
We propose a light-weight model-agnostic method, namely Informative Dropout (InfoDrop), to improve interpretability and reduce texture bias.
Specifically, we discriminate texture from shape based on local self-information in an image, and adopt a Dropout-like algorithm to decorrelate the model output from the local texture.
arXiv Detail & Related papers (2020-08-10T16:52:24Z)
- The shape and simplicity biases of adversarially robust ImageNet-trained
CNNs [9.707679445925516]
We study the shape bias and internal mechanisms that enable the generalizability of AlexNet, GoogLeNet, and ResNet-50 models trained via adversarial training.
Remarkably, adversarial training induces three simplicity biases into hidden neurons in the process of "robustifying" CNNs.
arXiv Detail & Related papers (2020-06-16T16:38:16Z)
- An Information-theoretic Visual Analysis Framework for Convolutional
Neural Networks [11.15523311079383]
We introduce a data model to organize the data that can be extracted from CNN models.
We then propose two ways to calculate entropy under different circumstances.
We develop a visual analysis system, CNNSlicer, to interactively explore how the amount of information changes inside the model.
arXiv Detail & Related papers (2020-05-02T21:36:50Z)
- Transferable Perturbations of Deep Feature Distributions [102.94094966908916]
This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions.
We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models.
arXiv Detail & Related papers (2020-04-27T00:32:25Z)
- Hold me tight! Influence of discriminative features on deep network
boundaries [63.627760598441796]
We propose a new perspective that relates dataset features to the distance of samples to the decision boundary.
This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets.
arXiv Detail & Related papers (2020-02-15T09:29:36Z)