On the (Un-)Avoidability of Adversarial Examples
- URL: http://arxiv.org/abs/2106.13326v1
- Date: Thu, 24 Jun 2021 21:35:25 GMT
- Title: On the (Un-)Avoidability of Adversarial Examples
- Authors: Sadia Chowdhury and Ruth Urner
- Abstract summary: Adversarial examples in deep learning models have caused substantial concern over the models' reliability.
We provide a framework for determining whether a model's label change under small perturbation is justified.
We prove that our adaptive data-augmentation maintains consistency of 1-nearest neighbor classification under deterministic labels.
- Score: 4.822598110892847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The phenomenon of adversarial examples in deep learning models has caused
substantial concern over their reliability. While many deep neural networks
have shown impressive performance in terms of predictive accuracy, it has been
shown that in many instances an imperceptible perturbation can falsely flip the
network's prediction. Most research has since focused on developing defenses
against adversarial attacks or learning under a worst-case adversarial loss. In
this work, we take a step back and aim to provide a framework for determining
whether a model's label change under small perturbation is justified (and when
it is not). We carefully argue that adversarial robustness should be defined as
a locally adaptive measure complying with the underlying distribution. We then
suggest a definition for an adaptive robust loss, derive an empirical version
of it, and develop a resulting data-augmentation framework. We prove that our
adaptive data-augmentation maintains consistency of 1-nearest neighbor
classification under deterministic labels and provide illustrative empirical
evaluations.
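To make the data-augmentation idea concrete, here is a minimal sketch of a locally adaptive augmentation for 1-nearest-neighbor classification. The radius rule (a fraction of the distance to the nearest differently labeled point) and all names are illustrative assumptions, not the authors' exact construction:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def adaptive_augment(X, y, per_point=5, scale=0.5, seed=0):
    """Add perturbed copies of each training point, with a perturbation
    radius that adapts to the local label margin: here, `scale` times the
    distance to the nearest differently labeled point. Illustrative only;
    the paper's adaptive robustness radius may be defined differently."""
    rng = np.random.default_rng(seed)
    X_aug, y_aug = [X], [y]
    for x, label in zip(X, y):
        margin = np.linalg.norm(X[y != label] - x, axis=1).min()
        # sample points uniformly inside the adaptive ball around x
        d = rng.normal(size=(per_point, X.shape[1]))
        d /= np.linalg.norm(d, axis=1, keepdims=True)
        r = scale * margin * rng.uniform(size=(per_point, 1)) ** (1 / X.shape[1])
        X_aug.append(x + r * d)
        y_aug.append(np.full(per_point, label))
    return np.vstack(X_aug), np.concatenate(y_aug)

# 1-NN on the augmented sample: perturbed copies inherit their source label
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
X_a, y_a = adaptive_augment(X, y)
clf = KNeighborsClassifier(n_neighbors=1).fit(X_a, y_a)
print(clf.predict([[0.05, 0.1], [0.95, 1.05]]))  # -> [0 1]
```

Because the radius shrinks where classes are close and grows where they are far apart, the augmentation enforces local stability without ever crossing the label boundary, which is the intuition behind the consistency result.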
Related papers
- Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression [32.727673706238086]
We propose a method for examining the unexpected vulnerability of adversarially trained networks from a causal perspective.
Deploying it, we estimate the causal relation behind adversarial predictions in an unbiased environment.
We demonstrate that the estimated causal features are highly related to the correct prediction for adversarial robustness.
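For readers unfamiliar with the instrumental-variable machinery the title refers to, a plain two-stage least squares (2SLS) estimator looks like the following; this is the textbook IV method, not the paper's adversarial variant, and all names are illustrative:

```python
import numpy as np

def two_stage_least_squares(Z, X, Y):
    """Textbook 2SLS: regress treatment X on instrument Z, then regress
    outcome Y on the fitted X. The fitted values isolate the variation
    in X driven by the instrument, removing confounding bias."""
    beta1, *_ = np.linalg.lstsq(Z, X, rcond=None)   # stage 1: X ~ Z
    X_hat = Z @ beta1
    beta2, *_ = np.linalg.lstsq(X_hat, Y, rcond=None)  # stage 2: Y ~ X_hat
    return beta2

rng = np.random.default_rng(0)
n = 1000
Z = rng.normal(size=(n, 1))               # instrument
U = rng.normal(size=(n, 1))               # unobserved confounder
X = Z + U + 0.1 * rng.normal(size=(n, 1))
Y = 2.0 * X + U + 0.1 * rng.normal(size=(n, 1))
print(two_stage_least_squares(Z, X, Y))   # ~2.0, despite confounding
```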
arXiv Detail & Related papers (2023-03-02T08:18:22Z)
- Addressing Mistake Severity in Neural Networks with Semantic Knowledge [0.0]
Most robust training techniques aim to improve model accuracy on perturbed inputs.
As an alternate form of robustness, we aim to reduce the severity of mistakes made by neural networks in challenging conditions.
We leverage current adversarial training methods to generate targeted adversarial attacks during the training process.
Results demonstrate that our approach performs better with respect to mistake severity compared to standard and adversarially trained models.
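As an illustration of the kind of targeted attack such training could use, here is a one-step targeted FGSM in PyTorch; the choice of target class (e.g., a semantically similar one) and all names are assumptions, not the paper's exact method:

```python
import torch
import torch.nn.functional as F

def targeted_fgsm(model, x, target, eps=0.03):
    """One-step targeted FGSM: take a gradient step that *decreases* the
    loss toward a chosen target class, so induced mistakes land on that
    class rather than an arbitrary one. Generic sketch only."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), target)
    loss.backward()
    x_adv = x - eps * x.grad.sign()   # descend toward the target class
    return x_adv.clamp(0.0, 1.0).detach()
```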
arXiv Detail & Related papers (2022-11-21T22:01:36Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness defends against samples crafted by minimally perturbing a clean input.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance-based and sensitivity-based attacks.
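As a rough stand-in for the metric-learning component (the paper's optimal-transport formulation is not reproduced here), a triplet-style adversarial regularizer could look like this; all names are illustrative:

```python
import torch
import torch.nn.functional as F

def metric_adversarial_regularizer(embed, x_clean, x_adv, x_other, margin=1.0):
    """Triplet-style regularizer: keep an input's adversarial embedding
    closer to its clean embedding than to a different-class example.
    A generic metric-learning sketch, not the paper's exact objective."""
    anchor = embed(x_adv)
    positive = embed(x_clean)
    negative = embed(x_other)
    return F.triplet_margin_loss(anchor, positive, negative, margin=margin)
```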
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge limiting the widespread adoption of deep neural networks has been their fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results on image classification demonstrate the effectiveness of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Adversarial Robustness through the Lens of Causality [105.51753064807014]
The adversarial vulnerability of deep neural networks has attracted significant attention in machine learning.
We propose to incorporate causality into mitigating adversarial vulnerability.
Our method can be seen as the first attempt to leverage causality for mitigating adversarial vulnerability.
arXiv Detail & Related papers (2021-06-11T06:55:02Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
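A minimal sketch of that idea, assuming a per-example flag for unjustified overconfidence (how such regions are found is the paper's contribution and is not reproduced here; all names are assumptions):

```python
import torch
import torch.nn.functional as F

def entropy_raising_loss(logits, prior, overconfident_mask, weight=1.0):
    """Where the model is flagged as unjustifiably overconfident, pull its
    predictive distribution toward the label prior via a KL term.
    `overconfident_mask` is a hypothetical per-example boolean flag."""
    log_probs = F.log_softmax(logits, dim=-1)
    # per-example KL(prior || model); `prior` is a fixed class distribution
    kl = F.kl_div(log_probs, prior.expand_as(log_probs), reduction="none").sum(-1)
    return weight * (kl * overconfident_mask.float()).mean()
```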
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Improving Calibration through the Relationship with Adversarial Robustness [19.384119330332446]
We study the connection between adversarial robustness and calibration.
We propose Adversarial Robustness based Adaptive Labeling (AR-AdaLS).
We find that our method, which takes the adversarial robustness of the in-distribution data into consideration, leads to better calibration of the model, even under distributional shift.
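A rough sketch of robustness-adaptive label smoothing in that spirit; the mapping from robustness scores to smoothing strength is an assumption, not AR-AdaLS's actual schedule:

```python
import torch

def adaptive_label_smoothing(labels, robustness, n_classes, max_eps=0.2):
    """Smooth labels more aggressively for examples with low adversarial
    robustness scores (assumed to lie in [0, 1]): fragile examples get
    softer targets, robust ones stay close to one-hot."""
    eps = max_eps * (1.0 - robustness).unsqueeze(1)              # (N, 1)
    one_hot = torch.nn.functional.one_hot(labels, n_classes).float()
    return (1.0 - eps) * one_hot + eps / n_classes
```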
arXiv Detail & Related papers (2020-06-29T20:56:33Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
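A minimal sketch of the instance-level objective: an InfoNCE loss whose positive pair is two views of the same instance, so maximizing it with respect to an input perturbation yields a label-free, identity-confusing attack. The simplified loss and all names are assumptions:

```python
import torch
import torch.nn.functional as F

def instance_nce_loss(encoder, view1, view2, temperature=0.5):
    """InfoNCE over a batch: two augmented views of the same instance form
    the positive pair; every other sample is a negative. Simplified sketch,
    not the paper's exact objective."""
    z1 = F.normalize(encoder(view1), dim=1)
    z2 = F.normalize(encoder(view2), dim=1)
    logits = z1 @ z2.t() / temperature                 # (N, N) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)            # positives on diagonal
```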
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
- Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations [65.05561023880351]
Adversarial examples are malicious inputs crafted to induce misclassification.
This paper studies a complementary failure mode, invariance-based adversarial examples.
We show that defenses against sensitivity-based attacks actively harm a model's accuracy on invariance-based attacks.
arXiv Detail & Related papers (2020-02-11T18:50:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.