Verifying the Causes of Adversarial Examples
- URL: http://arxiv.org/abs/2010.09633v1
- Date: Mon, 19 Oct 2020 16:17:20 GMT
- Title: Verifying the Causes of Adversarial Examples
- Authors: Honglin Li, Yifei Fan, Frieder Ganz, Anthony Yezzi, Payam Barnaghi
- Abstract summary: The robustness of neural networks is challenged by adversarial examples that contain almost imperceptible perturbations to inputs.
We present a collection of potential causes of adversarial examples and verify (or partially verify) them through carefully-designed controlled experiments.
Our experimental results show that geometric factors tend to be the more direct causes, while statistical factors magnify the phenomenon.
- Score: 5.381050729919025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The robustness of neural networks is challenged by adversarial examples that
contain almost imperceptible perturbations to inputs, which mislead a
classifier into producing incorrect outputs with high confidence. Limited by the extreme
difficulty in examining a high-dimensional image space thoroughly, research on
explaining and justifying the causes of adversarial examples falls behind
studies on attacks and defenses. In this paper, we present a collection of
potential causes of adversarial examples and verify (or partially verify) them
through carefully-designed controlled experiments. The major causes of
adversarial examples include model linearity, one-sum constraint, and geometry
of the categories. To control the effect of those causes, multiple techniques
are applied such as $L_2$ normalization, replacement of loss functions,
construction of reference datasets, and novel models using multi-layer
perceptron probabilistic neural networks (MLP-PNN) and density estimation (DE).
Our experimental results show that geometric factors tend to be the more direct
causes, while statistical factors magnify the phenomenon, especially in assigning
high prediction confidence. We believe this paper will inspire more studies to
rigorously investigate the root causes of adversarial examples, which in turn
provide useful guidance on designing more robust models.
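To make one of the listed techniques concrete, here is a minimal sketch of $L_2$ normalization applied to a classifier's penultimate features before the final linear layer, which bounds the logit magnitudes and hence the confidence the softmax can assign. The layer sizes, the fixed scale factor, and the choice to normalize features rather than logits are illustrative assumptions, not the authors' exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class L2NormClassifier(nn.Module):
    """MLP whose penultimate features are L2-normalized before the
    final linear layer, so logit magnitude cannot grow without bound."""

    def __init__(self, in_dim=784, hidden=256, num_classes=10, scale=10.0):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, num_classes, bias=False)
        self.scale = scale  # fixed temperature; a hyperparameter, not from the paper

    def forward(self, x):
        feats = self.backbone(x.flatten(1))
        feats = F.normalize(feats, p=2, dim=1)  # project features onto the unit sphere
        return self.scale * self.head(feats)    # bounded logits -> bounded confidence

# Usage: logits = L2NormClassifier()(torch.randn(8, 1, 28, 28))
```

The point of the sketch is only that bounded features give bounded logits, so predicted confidence cannot grow without limit as an input moves far along a linear direction.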
Related papers
- Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks [14.407025310553225]
Interpretability research takes counterfactual theories of causality for granted.
Counterfactual theories have problems that bias our findings in specific and predictable ways.
We discuss the implications of these challenges for interpretability researchers.
arXiv Detail & Related papers (2024-07-05T17:53:03Z)
- A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples across deep neural networks.
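As a concrete (and heavily simplified) illustration of the transfer setting surveyed here, the sketch below crafts a single-step FGSM perturbation on a surrogate model and checks whether it also fools a separate target model; the surrogate/target models, the $8/255$ budget, and the assumption of inputs in $[0, 1]$ are placeholders, not anything prescribed by the survey.

```python
import torch
import torch.nn.functional as F

def fgsm_transfer(surrogate, target, x, y, epsilon=8 / 255):
    """Craft an FGSM example on `surrogate` and test whether it also fools `target`."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x_adv), y)
    loss.backward()
    # Single-step L-infinity perturbation computed purely from the surrogate's gradients.
    x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()
    transfers = target(x_adv).argmax(dim=1) != y  # True where the attack transfers
    return x_adv, transfers
```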
arXiv Detail & Related papers (2023-10-26T17:45:26Z)
- Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression [32.727673706238086]
We propose a way of delving into the unexpected vulnerability in adversarially trained networks from a causal perspective.
By deploying it, we estimate the causal relation of adversarial prediction under an unbiased environment.
We demonstrate that the estimated causal features are highly related to the correct prediction for adversarial robustness.
arXiv Detail & Related papers (2023-03-02T08:18:22Z)
- Causal Triplet: An Open Challenge for Intervention-centric Causal Representation Learning [98.78136504619539]
Causal Triplet is a causal representation learning benchmark featuring visually more complex scenes.
We show that models built with the knowledge of disentangled or object-centric representations significantly outperform their distributed counterparts.
arXiv Detail & Related papers (2023-01-12T17:43:38Z)
- Quantify the Causes of Causal Emergence: Critical Conditions of Uncertainty and Asymmetry in Causal Structure [0.5372002358734439]
Investigation of causal relationships based on statistical and informational theories has posed an interesting and valuable challenge to large-scale models.
This paper introduces a framework for assessing numerical conditions of Causal Emergence as theoretical constraints of its occurrence.
arXiv Detail & Related papers (2022-12-03T06:35:54Z)
- Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, based on which we deduce a theoretical guarantee that the causality-inspired learning achieves reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z)
- A Frequency Perspective of Adversarial Robustness [72.48178241090149]
We present a frequency-based understanding of adversarial examples, supported by theoretical and empirical findings.
Our analysis shows that adversarial examples are confined to neither high-frequency nor low-frequency components, but are simply dataset dependent.
We propose a frequency-based explanation for the commonly observed accuracy vs. robustness trade-off.
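One simple way to probe such frequency claims is to split a perturbation into low- and high-frequency components with a 2-D FFT and compare the energy in each band; the circular mask radius below is an arbitrary assumption rather than a value taken from the paper.

```python
import numpy as np

def frequency_split(perturbation, radius=8):
    """Split a (H, W) perturbation into low- and high-frequency components
    using a circular mask of `radius` around the DC term in the 2-D FFT."""
    h, w = perturbation.shape
    spectrum = np.fft.fftshift(np.fft.fft2(perturbation))
    yy, xx = np.ogrid[:h, :w]
    low_mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)))
    high = perturbation - low
    return low, high

# Example: compare the energy of each band for a random perturbation.
delta = np.random.randn(32, 32) * 0.01
low, high = frequency_split(delta)
print(np.linalg.norm(low), np.linalg.norm(high))
```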
arXiv Detail & Related papers (2021-10-26T19:12:34Z)
- Pruning in the Face of Adversaries [0.0]
We evaluate the impact of neural network pruning on the adversarial robustness against $L_0$, $L_2$ and $L_\infty$ attacks.
Our results confirm that neural network pruning and adversarial robustness are not mutually exclusive.
We extend our analysis to situations that incorporate additional assumptions on the adversarial scenario and show that depending on the situation, different strategies are optimal.
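For reference, the kind of global magnitude pruning typically evaluated in such studies can be expressed with PyTorch's built-in pruning utilities, as in the sketch below; the 50% sparsity level and the restriction to Linear and Conv2d layers are assumptions for illustration, and the robustness evaluation under the $L_0$, $L_2$ and $L_\infty$ attacks would be run separately on the pruned model.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def magnitude_prune(model, amount=0.5):
    """Globally prune the smallest-magnitude weights across all
    Linear and Conv2d layers, then make the pruning permanent."""
    targets = [(m, "weight") for m in model.modules()
               if isinstance(m, (nn.Linear, nn.Conv2d))]
    prune.global_unstructured(targets, pruning_method=prune.L1Unstructured, amount=amount)
    for module, name in targets:
        prune.remove(module, name)  # bake the pruning mask into the weights
    return model
```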
arXiv Detail & Related papers (2021-08-19T09:06:16Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep learning has been its fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- ACRE: Abstract Causal REasoning Beyond Covariation [90.99059920286484]
We introduce the Abstract Causal REasoning dataset for systematic evaluation of current vision systems in causal induction.
Motivated by the stream of research on causal discovery in Blicket experiments, we query a visual reasoning system with the following four types of questions in either an independent scenario or an interventional scenario.
We notice that pure neural models tend towards an associative strategy and perform only at chance level, whereas neuro-symbolic combinations struggle with backward-blocking reasoning.
arXiv Detail & Related papers (2021-03-26T02:42:38Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
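The ensemble search described above can be caricatured as a simple genetic algorithm over binary masks that select which models join the ensemble; the population size, mutation rate, and the caller-supplied fitness function are placeholder assumptions rather than the authors' actual configuration.

```python
import random

def genetic_ensemble_search(num_models, fitness, pop_size=20, generations=50, mutate_p=0.1):
    """Evolve binary masks over `num_models` candidate models; `fitness(mask)`
    should score how well adversarial examples crafted on the masked-in
    ensemble fool the held-out models (supplied by the caller)."""
    population = [[random.randint(0, 1) for _ in range(num_models)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]              # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, num_models)      # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([1 - g if random.random() < mutate_p else g for g in child])
        population = parents + children
    return max(population, key=fitness)

# Usage sketch: best_mask = genetic_ensemble_search(10, fitness=my_transfer_score)
```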
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and accepts no responsibility for any consequences of its use.