Rethinking Empirical Evaluation of Adversarial Robustness Using
First-Order Attack Methods
- URL: http://arxiv.org/abs/2006.01304v1
- Date: Mon, 1 Jun 2020 22:55:09 GMT
- Title: Rethinking Empirical Evaluation of Adversarial Robustness Using
First-Order Attack Methods
- Authors: Kyungmi Lee, Anantha P. Chandrakasan
- Abstract summary: We identify three common cases that lead to overestimation of adversarial accuracy against bounded first-order attack methods.
We propose compensation methods that address sources of inaccurate gradient computation.
Overall, our work shows that overestimated adversarial accuracy that is not indicative of robustness is prevalent even for conventionally trained deep neural networks.
- Score: 6.531546527140473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We identify three common cases that lead to overestimation of
adversarial accuracy against bounded first-order attack methods, a quantity
popularly used as a proxy for adversarial robustness in empirical studies. For
each case, we propose compensation methods that either address sources of
inaccurate gradient computation, such as numerical instability near zero and
non-differentiability, or reduce the total number of back-propagations for
iterative attacks by approximating second-order information. These
compensation methods can be combined with existing attack methods for a more
precise empirical evaluation metric. We illustrate the impact of these three
cases with examples of practical interest, such as benchmarking model capacity
and regularization techniques for robustness. Overall, our work shows that
overestimated adversarial accuracy that is not indicative of robustness is
prevalent even for conventionally trained deep neural networks, and it
highlights the need for caution when using empirical evaluation without
guaranteed bounds.
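To make the gradient issue concrete, here is a minimal, hypothetical PGD-style evaluation loop in PyTorch. It swaps cross-entropy for a logit-margin (CW-style) loss, one common way to avoid the numerical instability near zero mentioned above; it does not reproduce the paper's actual compensation methods, and the model and data are toy placeholders.

```python
# Hypothetical sketch, not the paper's method: PGD under an L-infinity
# bound with a logit-margin loss. Cross-entropy saturates (gradients
# numerically zero) on confident inputs; the margin loss keeps a usable
# gradient signal, removing one source of overestimated accuracy.
import torch

def margin_loss(logits, labels):
    # Best wrong-class logit minus true-class logit; maximizing it
    # pushes the input across the decision boundary.
    true = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
    mask = torch.nn.functional.one_hot(labels, logits.size(1)).bool()
    best_other = logits.masked_fill(mask, float("-inf")).max(dim=1).values
    return (best_other - true).mean()

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=40):
    # Random start in the eps-ball, then signed-gradient ascent with
    # projection back onto the ball and the valid [0, 1] pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(margin_loss(model(x_adv), y), x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    return x_adv.detach()

# Toy usage with a placeholder linear model on random "images".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
adv_acc = (model(pgd_attack(model, x, y)).argmax(1) == y).float().mean().item()
print(f"adversarial accuracy on the toy batch: {adv_acc:.2f}")
```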
Related papers
- A practical approach to evaluating the adversarial distance for machine learning classifiers [2.2120851074630177]
This paper investigates the estimation of the more informative adversarial distance using iterative adversarial attacks and a certification approach.
We find that our adversarial attack approach is effective compared to related implementations, while the certification method falls short of expectations.
arXiv Detail & Related papers (2024-09-05T14:57:01Z)
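The minimal-distance view this paper takes can be approximated with a simple bisection over the attack budget. The sketch below reuses pgd_attack from the earlier PGD block and is our own illustration; it is neither the paper's estimator nor its certification method.

```python
# Hypothetical bisection over the L-infinity budget (reuses pgd_attack
# from the PGD sketch above); estimates, for one example, the smallest
# eps at which the first-order attack flips the prediction.
import torch

def adversarial_distance(model, x, y, eps_hi=0.5, rounds=8):
    # x, y: a single example (batch of size 1). Assumes the attack
    # succeeds at eps_hi; returns an upper estimate of the distance.
    lo, hi = 0.0, eps_hi
    for _ in range(rounds):
        mid = (lo + hi) / 2
        x_adv = pgd_attack(model, x, y, eps=mid, alpha=mid / 4, steps=20)
        with torch.no_grad():
            fooled = bool((model(x_adv).argmax(1) != y).item())
        lo, hi = (lo, mid) if fooled else (mid, hi)
    return hi
```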
- Group-based Robustness: A General Framework for Customized Robustness in the Real World [16.376584375681812]
We find that conventional metrics measuring targeted and untargeted robustness do not appropriately reflect a model's ability to withstand attacks from one set of source classes to another set of target classes.
We propose a new metric, termed group-based robustness, that complements existing metrics and is better-suited for evaluating model performance in certain attack scenarios.
We show that, with comparable success rates, finding evasive samples using our new loss functions saves computation by a factor as large as the number of targeted classes.
arXiv Detail & Related papers (2023-06-29T01:07:12Z)
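The claimed saving is easiest to picture with a loss that covers a whole target group in a single attack run. The sketch below is our own guess at the shape of such a loss; the paper's exact loss functions may differ.

```python
# Hypothetical group-targeted loss (our illustration): push the
# prediction toward *any* class in a target set, so one attack run
# replaces one run per target class.
import torch

def group_target_loss(logits, target_classes):
    # Best logit inside the target group vs. best logit outside it;
    # maximizing the gap steers the input toward some group member.
    group = logits[:, target_classes].max(dim=1).values
    mask = torch.ones(logits.size(1), dtype=torch.bool)
    mask[target_classes] = False
    outside = logits[:, mask].max(dim=1).values
    return (group - outside).mean()

print(group_target_loss(torch.randn(4, 10), torch.tensor([2, 5, 7])))
```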
- On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks [58.718697580177356]
Attacks on deep learning models with malicious training samples are known as data poisoning.
Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving certified poisoning robustness.
Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness.
arXiv Detail & Related papers (2023-06-28T17:59:35Z)
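For intuition, Deep Partition Aggregation trains base classifiers on disjoint partitions of the training set and predicts by majority vote, so a bounded number of poisoned samples can corrupt only a bounded number of votes. A minimal sketch, with the per-partition training routine left abstract as a user-supplied train_fn:

```python
# Minimal partition-and-vote sketch; train_fn(subset) -> model is a
# placeholder, and the index-based partitioning follows the general
# recipe rather than the paper's exact implementation.
import torch

def train_partitioned(dataset, train_fn, k=16):
    # Deterministic partition by sample index: each poisoned sample
    # lands in, and can corrupt, at most one partition.
    parts = [[s for i, s in enumerate(dataset) if i % k == j] for j in range(k)]
    return [train_fn(p) for p in parts]

@torch.no_grad()
def vote(models, x, num_classes):
    counts = torch.zeros(x.size(0), num_classes)
    for m in models:
        counts[torch.arange(x.size(0)), m(x).argmax(dim=1)] += 1
    return counts.argmax(dim=1)  # vote margins yield poisoning certificates
```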
- Adversarial Training Should Be Cast as a Non-Zero-Sum Game [121.95628660889628]
The two-player zero-sum paradigm of adversarial training has not engendered sufficient levels of robustness.
We show that the surrogate-based relaxation commonly used in adversarial training algorithms voids all guarantees on robustness.
A novel non-zero-sum bilevel formulation of adversarial training yields a framework that matches and in some cases outperforms state-of-the-art attacks.
arXiv Detail & Related papers (2023-06-19T16:00:48Z)
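One way to read the bilevel idea: attacker and defender optimize different objectives, rather than a single objective with opposite signs. Below is a heavily simplified training step under that reading, reusing pgd_attack (whose inner margin loss differs from the defender's cross-entropy) from the first sketch; it is not the paper's algorithm.

```python
# Simplified non-zero-sum step (our reading, not the paper's exact
# algorithm): the inner attacker maximizes a misclassification margin
# via pgd_attack, while the outer defender minimizes cross-entropy on
# the attacker's output, so the game is no longer zero-sum.
import torch

def adversarial_training_step(model, opt, x, y, eps=8 / 255):
    x_adv = pgd_attack(model, x, y, eps=eps)  # attacker's own objective
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    opt.step()
    return loss.item()
```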
- Boosting Adversarial Robustness using Feature Level Stochastic Smoothing [46.86097477465267]
Adversarial defenses have led to a significant improvement in the robustness of Deep Neural Networks.
In this work, we propose a generic method for introducing stochasticity in the network predictions.
We also utilize this for smoothing decisions and rejecting low-confidence predictions.
arXiv Detail & Related papers (2023-06-10T15:11:24Z)
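As a rough picture of the idea, the sketch below injects fixed Gaussian noise at the feature level, averages the resulting predictions, and abstains on low smoothed confidence. The paper learns its stochasticity rather than fixing a noise scale, and the feature/head split here is an assumption, so treat this purely as an illustration.

```python
# Illustrative only: fixed feature-level Gaussian noise, averaged
# predictions, rejection of low-confidence inputs. feature_fn/head_fn
# and all hyperparameters are placeholder assumptions.
import torch

@torch.no_grad()
def smoothed_predict(feature_fn, head_fn, x, sigma=0.1, samples=32,
                     reject_below=0.5):
    feats = feature_fn(x)
    probs = torch.stack([
        torch.softmax(head_fn(feats + sigma * torch.randn_like(feats)), dim=1)
        for _ in range(samples)
    ]).mean(dim=0)
    conf, pred = probs.max(dim=1)
    pred[conf < reject_below] = -1  # -1 encodes "rejected / abstain"
    return pred
```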
- Revisiting DeepFool: generalization and improvement [17.714671419826715]
We introduce a new family of adversarial attacks that strike a balance between effectiveness and computational efficiency.
Our proposed attacks are also suitable for evaluating the robustness of large models.
arXiv Detail & Related papers (2023-03-22T11:49:35Z)
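For context, vanilla DeepFool repeatedly projects the input onto a linearization of the nearest decision boundary. A bare-bones binary version (scalar logit f, decision at f(x) = 0) looks roughly like this; the paper's generalized attacks are not reproduced here.

```python
# Bare-bones binary DeepFool for a scalar-output classifier f with its
# decision boundary at f(x) = 0; context only, not the paper's attacks.
import torch

def deepfool_binary(f, x, max_steps=50, overshoot=0.02):
    x_adv = x.clone().detach()
    orig_side = torch.sign(f(x_adv)).item()
    for _ in range(max_steps):
        x_adv = x_adv.detach().requires_grad_(True)
        out = f(x_adv).squeeze()
        if torch.sign(out).item() != orig_side:
            break  # crossed the boundary
        grad = torch.autograd.grad(out, x_adv)[0]
        # Closed-form step onto the linearized boundary
        # f(x) + <grad, r> = 0, slightly overshot to force the flip.
        x_adv = x_adv - (1 + overshoot) * out.detach() * grad / grad.norm().pow(2)
    return x_adv.detach()
```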
- ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation [125.52743832477404]
Adversarial Examples Detection (AED) is a crucial defense technique against adversarial attacks.
We propose a new technique, ADDMU, which combines two types of uncertainty estimation for both regular and far-boundary (FB) adversarial example detection.
Our new method outperforms previous methods by 3.6 and 6.0 AUC points under each scenario.
arXiv Detail & Related papers (2022-10-22T09:11:12Z)
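The combination can be pictured with two crude estimators: model uncertainty from Monte Carlo dropout variance and data uncertainty from prediction variance under small input noise, summed into a detection score. ADDMU's actual estimators are not specified here, so this is only a stand-in illustration.

```python
# Stand-in illustration of combining two uncertainty types into one
# detection score; these crude estimators are ours, not ADDMU's.
import torch

@torch.no_grad()
def uncertainty_score(model, x, n=16, noise=0.01):
    model.train()  # keep dropout active: MC-dropout model uncertainty
    model_probs = torch.stack(
        [torch.softmax(model(x), dim=1) for _ in range(n)])
    model.eval()
    data_probs = torch.stack(  # data uncertainty: jitter the input
        [torch.softmax(model(x + noise * torch.randn_like(x)), dim=1)
         for _ in range(n)])
    model_u = model_probs.var(dim=0).sum(dim=1)
    data_u = data_probs.var(dim=0).sum(dim=1)
    return model_u + data_u  # higher score -> more likely adversarial
```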
- TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep neural networks has been their fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)