RobustBench: a standardized adversarial robustness benchmark
- URL: http://arxiv.org/abs/2010.09670v3
- Date: Sun, 31 Oct 2021 20:03:39 GMT
- Title: RobustBench: a standardized adversarial robustness benchmark
- Authors: Francesco Croce, Maksym Andriushchenko, Vikash Sehwag, Edoardo
Debenedetti, Nicolas Flammarion, Mung Chiang, Prateek Mittal, Matthias Hein
- Abstract summary: A key challenge in benchmarking robustness is that its evaluation is often error-prone, leading to robustness overestimation.
We evaluate adversarial robustness with AutoAttack, an ensemble of white- and black-box attacks.
We analyze the impact of robustness on the performance on distribution shifts, calibration, out-of-distribution detection, fairness, privacy leakage, smoothness, and transferability.
- Score: 84.50044645539305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a research community, we are still lacking a systematic understanding of
the progress on adversarial robustness which often makes it hard to identify
the most promising ideas in training robust models. A key challenge in
benchmarking robustness is that its evaluation is often error-prone leading to
robustness overestimation. Our goal is to establish a standardized benchmark of
adversarial robustness, which as accurately as possible reflects the robustness
of the considered models within a reasonable computational budget. To this end,
we start by considering the image classification task and introduce
restrictions (possibly loosened in the future) on the allowed models. We
evaluate adversarial robustness with AutoAttack, an ensemble of white- and
black-box attacks, which was recently shown in a large-scale study to improve
almost all robustness evaluations compared to the original publications. To
prevent overadaptation of new defenses to AutoAttack, we welcome external
evaluations based on adaptive attacks, especially where AutoAttack flags a
potential overestimation of robustness. Our leaderboard, hosted at
https://robustbench.github.io/, contains evaluations of 120+ models and aims at
reflecting the current state of the art in image classification on a set of
well-defined tasks in $\ell_\infty$- and $\ell_2$-threat models and on common
corruptions, with possible extensions in the future. Additionally, we
open-source the library https://github.com/RobustBench/robustbench that
provides unified access to 80+ robust models to facilitate their downstream
applications. Finally, based on the collected models, we analyze the impact of
robustness on the performance on distribution shifts, calibration,
out-of-distribution detection, fairness, privacy leakage, smoothness, and
transferability.
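
To make the leaderboard entries easy to reuse, the open-source library exposes a single model-loading entry point. The snippet below is a minimal sketch of pulling a model from the zoo and evaluating it with AutoAttack on a small batch; the model name 'Standard' and the 64-example subset are illustrative choices only, and the official leaderboard numbers are computed on the full test set.

```python
# Minimal sketch: load a RobustBench model and evaluate it with AutoAttack.
# Assumes the robustbench and autoattack packages are installed; 'Standard' is
# one illustrative model-zoo entry, and the 64-example batch is a quick check.
import torch
from robustbench.data import load_cifar10
from robustbench.utils import load_model
from autoattack import AutoAttack

# Download a model from the zoo for the CIFAR-10 L_inf threat model.
model = load_model(model_name='Standard', dataset='cifar10', threat_model='Linf')
model.eval()

# A small subset of the test set (the leaderboard uses all 10,000 images).
x_test, y_test = load_cifar10(n_examples=64)

# AutoAttack: the ensemble of white- and black-box attacks used by the benchmark.
adversary = AutoAttack(model, norm='Linf', eps=8 / 255, version='standard')
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=64)

# Robust accuracy = fraction of points still classified correctly after the attack.
with torch.no_grad():
    robust_acc = (model(x_adv).argmax(dim=1) == y_test).float().mean().item()
print(f'Robust accuracy on this batch: {robust_acc:.3f}')
```
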
Related papers
- Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete.
We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z) - Characterizing Data Point Vulnerability via Average-Case Robustness [29.881355412540557]
Adversarial robustness is the standard framework; it views the robustness of predictions through a binary lens.
We consider a complementary framework, average-case robustness, which measures the fraction of points in a local region around an input that are classified consistently with it (a naive sampling sketch of this quantity appears after this list).
We show empirically that our estimators are accurate and efficient for standard deep learning models.
arXiv Detail & Related papers (2023-07-26T01:10:29Z) - From Adversarial Arms Race to Model-centric Evaluation: Motivating a
Unified Automatic Robustness Evaluation Framework [91.94389491920309]
Textual adversarial attacks can discover models' weaknesses by adding semantics-preserving but misleading perturbations to the inputs.
The existing practice of robustness evaluation may suffer from incomplete evaluation, impractical evaluation protocols, and invalid adversarial samples.
We set up a unified automatic robustness evaluation framework, shifting towards model-centric evaluation to exploit the advantages of adversarial attacks.
arXiv Detail & Related papers (2023-05-29T14:55:20Z) - A Comprehensive Study on Robustness of Image Classification Models:
Benchmarking and Rethinking [54.89987482509155]
The robustness of deep neural networks is usually lacking under adversarial examples, common corruptions, and distribution shifts.
We establish a comprehensive robustness benchmark called ARES-Bench on the image classification task.
By designing the training settings accordingly, we achieve new state-of-the-art adversarial robustness.
arXiv Detail & Related papers (2023-02-28T04:26:20Z) - Is RobustBench/AutoAttack a suitable Benchmark for Adversarial
Robustness? [20.660465258314314]
We argue that the alteration of data by AutoAttack with $\ell_\infty$, $\epsilon = 8/255$ is unrealistically strong, resulting in close to perfect detection rates of adversarial samples.
We also show that other attack methods are much harder to detect while achieving similar success rates.
arXiv Detail & Related papers (2021-12-02T20:44:16Z) - Clustering Effect of (Linearized) Adversarial Robust Models [60.25668525218051]
We propose a novel understanding of adversarial robustness and apply it to more tasks, including domain adaptation and robustness boosting.
Experimental evaluations demonstrate the rationality and superiority of our proposed clustering strategy.
arXiv Detail & Related papers (2021-11-25T05:51:03Z) - A Comprehensive Evaluation Framework for Deep Model Robustness [44.20580847861682]
Deep neural networks (DNNs) have achieved remarkable performance across a wide range of applications.
They are vulnerable to adversarial examples, which motivates the adversarial defense.
This paper presents a model evaluation framework containing a comprehensive, rigorous, and coherent set of evaluation metrics.
arXiv Detail & Related papers (2021-01-24T01:04:25Z) - A general framework for defining and optimizing robustness [74.67016173858497]
We propose a rigorous and flexible framework for defining different types of robustness properties for classifiers.
Our concept is based on postulates that robustness of a classifier should be considered as a property that is independent of accuracy.
We develop a very general robustness framework that is applicable to any type of classification model.
arXiv Detail & Related papers (2020-06-19T13:24:20Z) - Benchmarking Robustness of Machine Reading Comprehension Models [29.659586787812106]
We construct AdvRACE, a new model-agnostic benchmark for evaluating the robustness of MRC models under four different types of adversarial attacks.
We show that state-of-the-art (SOTA) models are vulnerable to all of these attacks.
We conclude that there is substantial room for building more robust MRC models and our benchmark can help motivate and measure progress in this area.
arXiv Detail & Related papers (2020-04-29T08:05:32Z)