Searching for the Essence of Adversarial Perturbations
- URL: http://arxiv.org/abs/2205.15357v1
- Date: Mon, 30 May 2022 18:04:57 GMT
- Title: Searching for the Essence of Adversarial Perturbations
- Authors: Dennis Y. Menn and Hung-yi Lee
- Abstract summary: We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction.
This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
- Score: 73.96215665913797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks have achieved state-of-the-art performance in various
machine learning fields, yet the incorporation of malicious perturbations into
input data (adversarial examples) can fool neural networks into making erroneous
predictions. This poses potential risks in real-world applications such as
autopilot systems and facial recognition. However, the reason for the existence
of adversarial examples remains controversial. Here we demonstrate that
adversarial perturbations contain human-recognizable information, which is the
key conspirator responsible for a neural network's erroneous prediction. This
concept of human-recognizable information allows us to explain key features
related to adversarial perturbations, including the existence of adversarial
examples, their transferability among different neural networks, and the
increased interpretability of adversarially trained networks. Two unique
properties of adversarial perturbations that fool neural networks are uncovered:
masking and generation. A special class, the complementary class, is identified
when neural networks classify input images. The human-recognizable information
contained in adversarial perturbations allows researchers to gain insight into
the working principles of neural networks and may lead to techniques that
detect and defend against adversarial attacks.
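The abstract does not specify how the perturbations were produced; as a minimal, hedged sketch of the general setup, the code below generates an adversarial perturbation with the standard fast gradient sign method (FGSM) and isolates it for inspection. The pretrained ResNet-18, the random image tensor, and the label are placeholders rather than the paper's actual models or data.

```python
# Hedged FGSM sketch (not necessarily the paper's procedure): generate an
# adversarial perturbation for a pretrained classifier and isolate it.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # placeholder for a real, preprocessed image
label = torch.tensor([207])                              # placeholder ground-truth class index

loss = torch.nn.functional.cross_entropy(model(image), label)
loss.backward()

epsilon = 8 / 255
perturbation = epsilon * image.grad.sign()               # FGSM: step along the sign of the input gradient
adversarial = (image + perturbation).detach().clamp(0, 1)

# The perturbation itself can be rescaled to [0, 1] and displayed as an image,
# which is how structure in perturbations is typically inspected.
visualized = (perturbation - perturbation.min()) / (perturbation.max() - perturbation.min())
print(model(adversarial).argmax(dim=1))                  # prediction on the perturbed input
```

Inspecting such rescaled perturbations (or averages of them over many inputs) is one way the kind of human-recognizable structure claimed in the abstract could be examined, though the paper's own procedure may differ.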
Related papers
- Towards unlocking the mystery of adversarial fragility of neural networks [6.589200529058999]
We look at the smallest magnitude of additive perturbation that can change the output of a classification algorithm.
We provide a matrix-theoretic explanation of the adversarial fragility of deep neural networks for classification.
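As a hedged illustration of the quantity described above (not the paper's matrix-theoretic derivation), for a binary linear classifier the smallest additive perturbation that changes the prediction has a closed form: the distance from the input to the decision boundary.

```latex
% Smallest additive perturbation that flips a binary linear classifier
% f(x) = sign(w^T x + b); its size equals the distance from x to the boundary.
\[
  \min_{\delta}\; \lVert \delta \rVert_2
  \quad\text{s.t.}\quad
  \operatorname{sign}\bigl(w^{\top}(x+\delta)+b\bigr) \neq \operatorname{sign}\bigl(w^{\top}x+b\bigr)
  \;\;\Longrightarrow\;\;
  \lVert \delta^{\ast} \rVert_2 = \frac{\lvert\, w^{\top}x+b \,\rvert}{\lVert w \rVert_2}.
\]
```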
arXiv Detail & Related papers (2024-06-23T19:37:13Z)
- A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples.
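Black-box transfer is typically measured by crafting adversarial examples on a surrogate model and checking how often they also fool a separately trained target model. The sketch below is a minimal, hedged illustration of that protocol; the ResNet-18 surrogate, VGG-16 target, one-step FGSM attack, and random tensors are placeholders, not choices taken from the survey.

```python
# Hedged transfer-attack sketch: craft adversarial examples on a surrogate
# model, then evaluate them on a different target model (black-box transfer).
import torch
import torchvision.models as models

surrogate = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
target = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()

images = torch.rand(4, 3, 224, 224, requires_grad=True)  # placeholder batch
labels = torch.randint(0, 1000, (4,))                    # placeholder labels

# Craft adversarial examples using only the surrogate's gradients; the
# target's gradients are never touched.
loss = torch.nn.functional.cross_entropy(surrogate(images), labels)
loss.backward()
adversarial = (images + (8 / 255) * images.grad.sign()).detach().clamp(0, 1)

with torch.no_grad():
    fooled = (target(adversarial).argmax(dim=1) != labels).float().mean().item()
print(f"fraction of adversarial examples that also fool the unseen target: {fooled:.2f}")
```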
arXiv Detail & Related papers (2023-10-26T17:45:26Z)
- Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z)
- A reading survey on adversarial machine learning: Adversarial attacks and their understanding [6.1678491628787455]
Adversarial machine learning exploits and seeks to understand the vulnerabilities that cause neural networks to misclassify inputs that are close to the original.
A class of algorithms called adversarial attacks has been proposed to make neural networks misclassify inputs across various tasks and domains.
This article provides a survey of existing adversarial attacks and their understanding based on different perspectives.
arXiv Detail & Related papers (2023-08-07T07:37:26Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why deep neural networks have poor performance under adversarial perturbation.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
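As a hedged reminder of the quantities investigated above (the paper's exact decomposition is not reproduced here), the classical bias-variance decomposition of the expected squared error, evaluated at an adversarially perturbed input and averaged over training sets D, reads:

```latex
% Classical bias-variance decomposition of expected squared error, evaluated at
% a (possibly adversarially) perturbed input x + \delta; f_D is the model
% trained on a random dataset D and y(x) is the true target.
\[
  \mathbb{E}_{D}\!\left[\bigl(f_D(x+\delta) - y(x)\bigr)^{2}\right]
  = \underbrace{\bigl(\mathbb{E}_{D}[f_D(x+\delta)] - y(x)\bigr)^{2}}_{\text{bias}^{2}}
  + \underbrace{\mathbb{E}_{D}\!\left[\bigl(f_D(x+\delta) - \mathbb{E}_{D}[f_D(x+\delta)]\bigr)^{2}\right]}_{\text{variance}}.
\]
```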
arXiv Detail & Related papers (2020-08-01T00:58:54Z)
- Relationship between manifold smoothness and adversarial vulnerability in deep learning with local errors [2.7834038784275403]
We study the origin of the adversarial vulnerability in artificial neural networks.
Our study reveals that a high generalization accuracy requires a relatively fast power-law decay of the eigen-spectrum of hidden representations.
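As a hedged sketch of the quantity referred to above, the eigen-spectrum of hidden representations is usually taken to be the eigenvalues of the covariance of a layer's activations, with a log-log slope estimating the power-law decay rate; the random activations below are only a stand-in for real hidden-layer outputs.

```python
# Hedged sketch: eigen-spectrum of hidden representations and the log-log slope
# used to judge how fast it decays as a power law. A random matrix stands in
# for real hidden-layer activations of shape (n_samples, n_units).
import numpy as np

rng = np.random.default_rng(0)
hidden = rng.standard_normal((2000, 512))      # stand-in hidden representations

cov = np.cov(hidden, rowvar=False)             # covariance across hidden units
eigvals = np.linalg.eigvalsh(cov)[::-1]        # eigen-spectrum, descending

ranks = np.arange(1, len(eigvals) + 1)
slope, _ = np.polyfit(np.log(ranks), np.log(eigvals), 1)
print(f"estimated power-law exponent of the eigen-spectrum: {slope:.2f}")
```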
arXiv Detail & Related papers (2020-07-04T08:47:51Z)
- Bayesian Neural Networks [0.0]
We show how errors in prediction by neural networks can be obtained in principle, and provide the two favoured methods for characterising these errors.
We will also describe how both of these methods have substantial pitfalls when put into practice.
arXiv Detail & Related papers (2020-06-02T09:43:00Z)
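The Bayesian Neural Networks summary above does not name the two favoured methods; as one common, hedged illustration of how prediction errors can be characterised in practice, the sketch below uses Monte Carlo dropout: dropout is kept active at test time and the spread over repeated stochastic forward passes serves as an error estimate. The toy network and inputs are placeholders.

```python
# Hedged sketch of one common way to characterise prediction uncertainty:
# Monte Carlo dropout, i.e. keeping dropout active at test time and using the
# spread of repeated stochastic forward passes as an error bar. Not necessarily
# one of the two methods discussed in the paper above.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 1),
)

x = torch.randn(8, 16)          # placeholder inputs
model.train()                   # keep dropout stochastic at inference time

with torch.no_grad():
    samples = torch.stack([model(x) for _ in range(100)])  # (100, 8, 1)

mean = samples.mean(dim=0)      # predictive mean
std = samples.std(dim=0)        # per-input spread, used as an error estimate
print(mean.squeeze(), std.squeeze())
```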
This list is automatically generated from the titles and abstracts of the papers on this site.