How many perturbations break this model? Evaluating robustness beyond
adversarial accuracy
- URL: http://arxiv.org/abs/2207.04129v3
- Date: Thu, 10 Aug 2023 18:28:08 GMT
- Title: How many perturbations break this model? Evaluating robustness beyond
adversarial accuracy
- Authors: Raphael Olivier, Bhiksha Raj
- Abstract summary: We introduce adversarial sparsity, which quantifies how difficult it is to find a successful perturbation given both an input point and a constraint on the direction of the perturbation.
We show that sparsity provides valuable insight into neural networks in multiple ways.
- Score: 28.934863462633636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness to adversarial attacks is typically evaluated with adversarial
accuracy. While essential, this metric does not capture all aspects of
robustness and in particular leaves out the question of how many perturbations
can be found for each point. In this work, we introduce an alternative
approach, adversarial sparsity, which quantifies how difficult it is to find a
successful perturbation given both an input point and a constraint on the
direction of the perturbation. We show that sparsity provides valuable insight
into neural networks in multiple ways: for instance, it illustrates important
differences between current state-of-the-art robust models them that accuracy
analysis does not, and suggests approaches for improving their robustness. When
applying broken defenses effective against weak attacks but not strong ones,
sparsity can discriminate between the totally ineffective and the partially
effective defenses. Finally, with sparsity we can measure increases in
robustness that do not affect accuracy: we show for example that data
augmentation can by itself increase adversarial robustness, without using
adversarial training.
Related papers
- Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume [17.198794644483026]
We propose a new metric termed adversarial hypervolume, assessing the robustness of deep learning models comprehensively over a range of perturbation intensities.
We adopt a novel training algorithm that enhances adversarial robustness uniformly across various perturbation intensities.
This research contributes a new measure of robustness and establishes a standard for assessing benchmarking and the resilience of current and future defensive models against adversarial threats.
arXiv Detail & Related papers (2024-03-08T07:03:18Z) - Extreme Miscalibration and the Illusion of Adversarial Robustness [66.29268991629085]
Adversarial Training is often used to increase model robustness.
We show that this observed gain in robustness is an illusion of robustness (IOR)
We urge the NLP community to incorporate test-time temperature scaling into their robustness evaluations.
arXiv Detail & Related papers (2024-02-27T13:49:12Z) - Exploring Robust Features for Improving Adversarial Robustness [11.935612873688122]
We explore the robust features which are not affected by the adversarial perturbations to improve the model's adversarial robustness.
Specifically, we propose a feature disentanglement model to segregate the robust features from non-robust features and domain specific features.
The trained domain discriminator is able to identify the domain specific features from the clean images and adversarial examples almost perfectly.
arXiv Detail & Related papers (2023-09-09T00:30:04Z) - Revisiting DeepFool: generalization and improvement [17.714671419826715]
We introduce a new family of adversarial attacks that strike a balance between effectiveness and computational efficiency.
Our proposed attacks are also suitable for evaluating the robustness of large models.
arXiv Detail & Related papers (2023-03-22T11:49:35Z) - Improving Adversarial Robustness to Sensitivity and Invariance Attacks
with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against samples crafted by minimally perturbing a sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z) - Masking Adversarial Damage: Finding Adversarial Saliency for Robust and
Sparse Network [33.18197518590706]
Adversarial examples provoke weak reliability and potential security issues in deep neural networks.
We propose a novel adversarial pruning method, Masking Adversarial Damage (MAD) that employs second-order information of adversarial loss.
We show that MAD effectively prunes adversarially trained networks without loosing adversarial robustness and shows better performance than previous adversarial pruning methods.
arXiv Detail & Related papers (2022-04-06T11:28:06Z) - Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the wide-spread adoption of deep learning has been their fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - Adversarial Robustness under Long-Tailed Distribution [93.50792075460336]
Adversarial robustness has attracted extensive studies recently by revealing the vulnerability and intrinsic characteristics of deep networks.
In this work we investigate the adversarial vulnerability as well as defense under long-tailed distributions.
We propose a clean yet effective framework, RoBal, which consists of two dedicated modules, a scale-invariant and data re-balancing.
arXiv Detail & Related papers (2021-04-06T17:53:08Z) - Geometry-aware Instance-reweighted Adversarial Training [78.70024866515756]
In adversarial machine learning, there was a common belief that robustness and accuracy hurt each other.
We propose geometry-aware instance-reweighted adversarial training, where the weights are based on how difficult it is to attack a natural data point.
Experiments show that our proposal boosts the robustness of standard adversarial training.
arXiv Detail & Related papers (2020-10-05T01:33:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.