OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift
- URL: http://arxiv.org/abs/2310.12793v2
- Date: Tue, 4 Jun 2024 03:01:05 GMT
- Title: OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift
- Authors: Lin Li, Yifei Wang, Chawin Sitawarin, Michael Spratling
- Abstract summary: OODRobustBench is used to assess 706 robust models using 60.7K adversarial evaluations.
This large-scale analysis shows that adversarial robustness suffers from a severe OOD generalization issue.
We then predict and verify that existing methods are unlikely to achieve high OOD robustness.
- Score: 20.14559162084261
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Existing works have made great progress in improving adversarial robustness, but typically test their method only on data from the same distribution as the training data, i.e. in-distribution (ID) testing. As a result, it is unclear how such robustness generalizes under input distribution shifts, i.e. out-of-distribution (OOD) testing. This omission is concerning as such distribution shifts are unavoidable when methods are deployed in the wild. To address this issue we propose a benchmark named OODRobustBench to comprehensively assess OOD adversarial robustness using 23 dataset-wise shifts (i.e. naturalistic shifts in input distribution) and 6 threat-wise shifts (i.e., unforeseen adversarial threat models). OODRobustBench is used to assess 706 robust models using 60.7K adversarial evaluations. This large-scale analysis shows that: 1) adversarial robustness suffers from a severe OOD generalization issue; 2) ID robustness correlates strongly with OOD robustness in a positive linear way. The latter enables the prediction of OOD robustness from ID robustness. We then predict and verify that existing methods are unlikely to achieve high OOD robustness. Novel methods are therefore required to achieve OOD robustness beyond our prediction. To facilitate the development of these methods, we investigate a wide range of techniques and identify several promising directions. Code and models are available at: https://github.com/OODRobustBench/OODRobustBench.
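Since the analysis finds a strong positive linear relationship between ID and OOD robustness, OOD robustness can be projected from ID robustness with a simple linear fit. Below is a minimal sketch of such a fit; the accuracy arrays are illustrative placeholders, not numbers from the benchmark:

```python
import numpy as np

# Illustrative placeholder data: ID and OOD robust accuracy (%) for a few models.
# In the benchmark these values would come from the 706 evaluated robust models.
id_robust = np.array([45.0, 50.2, 53.1, 57.8, 61.0])
ood_robust = np.array([30.1, 33.5, 35.6, 39.0, 41.2])

# Fit OOD robustness as a linear function of ID robustness.
slope, intercept = np.polyfit(id_robust, ood_robust, deg=1)

# Correlation strength and the extrapolated OOD robustness of a hypothetical
# future model that reaches 70% ID robust accuracy.
r = np.corrcoef(id_robust, ood_robust)[0, 1]
print(f"Pearson r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"Predicted OOD robustness at 70% ID robustness: {slope * 70 + intercept:.1f}%")
```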
Related papers
- Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings [1.0260351016050424]
AROS is a novel approach that leverages neural ordinary differential equations (NODEs) together with the Lyapunov stability theorem.
Using a tailored loss function, Lyapunov stability theory is applied to ensure that both in-distribution (ID) and OOD data converge to stable equilibrium points.
This approach encourages any perturbed input to return to its stable equilibrium, thereby enhancing the model's robustness against adversarial perturbations.
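As a rough illustration of the Lyapunov-style idea (not the exact AROS loss; the dynamics network, quadratic Lyapunov candidate, and equilibrium points below are placeholder assumptions), one can penalize any increase of a Lyapunov function along the learned ODE dynamics:

```python
import torch
import torch.nn as nn

class Dynamics(nn.Module):
    """Learned vector field f(z) of a neural ODE (placeholder architecture)."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

def lyapunov_penalty(f: Dynamics, z: torch.Tensor, z_star: torch.Tensor) -> torch.Tensor:
    """Penalize dV/dt > 0 for V(z) = ||z - z*||^2, i.e. states that drift away
    from the target equilibrium z* under the dynamics."""
    dz_dt = f(z)
    dV_dt = 2.0 * ((z - z_star) * dz_dt).sum(dim=1)  # chain rule for the quadratic V
    return torch.relu(dV_dt).mean()

# Usage: add the penalty to the task loss so (possibly perturbed) embeddings are
# driven back toward their assigned equilibrium points.
f = Dynamics(dim=16)
z = torch.randn(8, 16)        # embeddings, possibly adversarially perturbed
z_star = torch.zeros(8, 16)   # placeholder equilibrium points
loss = lyapunov_penalty(f, z, z_star)
loss.backward()
```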
arXiv Detail & Related papers (2024-10-14T17:22:12Z)
- The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection [75.65876949930258]
Out-of-distribution (OOD) detection is essential for model trustworthiness.
We show that the superior OOD detection performance of state-of-the-art methods is achieved by secretly sacrificing the OOD generalization ability.
arXiv Detail & Related papers (2024-10-12T07:02:04Z)
- A Survey on Evaluation of Out-of-Distribution Generalization [41.39827887375374]
Out-of-Distribution (OOD) generalization is a complex and fundamental problem.
This paper serves as the first effort to conduct a comprehensive review of OOD evaluation.
We categorize existing research into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization.
arXiv Detail & Related papers (2024-03-04T09:30:35Z)
- Model-free Test Time Adaptation for Out-Of-Distribution Detection [62.49795078366206]
We propose a non-parametric test-time adaptation framework for out-of-distribution detection.
The framework utilizes online test samples for model adaptation during testing, enhancing adaptability to changing data distributions.
We demonstrate its effectiveness through comprehensive experiments on multiple OOD detection benchmarks.
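As a rough illustration of test-time adaptation on online test batches, the sketch below adapts a placeholder classifier by entropy minimization (a generic TENT-style stand-in, not the paper's non-parametric procedure; the model and batch sizes are made up):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def adapt_on_batch(model: nn.Module, x: torch.Tensor, lr: float = 1e-3) -> torch.Tensor:
    """Adapt the model on one unlabeled test batch by minimizing prediction
    entropy, then return the post-adaptation logits."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    logits = model(x)
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    with torch.no_grad():
        return model(x)

# Usage: a placeholder classifier adapting to a simulated online test stream.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
for _ in range(3):
    batch = torch.randn(16, 3, 32, 32)   # unlabeled test inputs
    logits = adapt_on_batch(model, batch)
```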
arXiv Detail & Related papers (2023-11-28T02:00:47Z)
- Your Out-of-Distribution Detection Method is Not Robust! [0.4893345190925178]
Out-of-distribution (OOD) detection has recently gained substantial attention due to the importance of identifying out-of-domain samples for reliability and safety.
To mitigate the vulnerability of OOD detectors to adversarial perturbations, several defenses have recently been proposed.
We re-examine these defenses against an end-to-end PGD attack on in/out data with larger perturbation sizes.
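For reference, here is a minimal L-infinity PGD sketch of the kind such re-evaluations build on (a generic attack on a placeholder classifier, not the paper's end-to-end attack on a full detection pipeline; the epsilon, step size, and iteration count are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_linf(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
             eps: float = 8 / 255, alpha: float = 2 / 255, steps: int = 10) -> torch.Tensor:
    """Standard L-infinity PGD: iteratively ascend the loss and project back
    into the epsilon-ball around the clean input."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto the eps-ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

# Usage with a placeholder classifier.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_adv = pgd_linf(model, x, y)
```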
arXiv Detail & Related papers (2022-09-30T05:49:00Z)
- Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift [108.30303219703845]
We find that ID-calibrated ensembles outperform the prior state of the art (based on self-training) on both ID and OOD accuracy.
We analyze this method in stylized settings, and identify two important conditions for ensembles to perform well both ID and OOD.
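A minimal sketch of the general recipe, calibrating each ensemble member on held-out ID data and then averaging the calibrated probabilities; temperature scaling is used here as a stand-in calibrator and may differ from the paper's exact procedure:

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits: torch.Tensor, labels: torch.Tensor, steps: int = 200) -> torch.Tensor:
    """Fit a scalar temperature on held-out ID logits by minimizing NLL."""
    log_t = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([log_t], lr=0.05)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        opt.step()
    return log_t.exp().detach()

def calibrated_ensemble(member_logits, temperatures) -> torch.Tensor:
    """Average the ID-calibrated probabilities of the ensemble members."""
    probs = [F.softmax(l / t, dim=1) for l, t in zip(member_logits, temperatures)]
    return torch.stack(probs).mean(dim=0)

# Usage with placeholder logits from two members and a held-out ID split.
val_logits = [torch.randn(100, 10), torch.randn(100, 10)]
val_labels = torch.randint(0, 10, (100,))
temps = [fit_temperature(l, val_labels) for l in val_logits]
test_logits = [torch.randn(8, 10), torch.randn(8, 10)]
probs = calibrated_ensemble(test_logits, temps)
```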
arXiv Detail & Related papers (2022-07-18T23:14:44Z)
- Models Out of Line: A Fourier Lens on Distribution Shift Robustness [29.12208822285158]
Improving the accuracy of deep neural networks (DNNs) on out-of-distribution (OOD) data is critical to the acceptance of deep learning (DL) in real-world applications.
Recently, some promising approaches have been developed to improve OOD robustness.
However, there is still no clear understanding of the conditions on OOD data and model properties that are required to observe effective robustness.
arXiv Detail & Related papers (2022-07-08T18:05:58Z)
- Provably Robust Detection of Out-of-distribution Data (almost) for free [124.14121487542613]
Deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data.
In this paper we propose a novel method that, starting from first principles, combines a certifiable OOD detector with a standard classifier into an OOD-aware classifier.
In this way we achieve the best of two worlds: certifiably adversarially robust OOD detection, even for OOD samples close to the in-distribution, without loss in prediction accuracy, and close to state-of-the-art OOD detection performance for non-manipulated OOD data.
arXiv Detail & Related papers (2021-06-08T11:40:49Z)
- Adversarial Robustness under Long-Tailed Distribution [93.50792075460336]
Adversarial robustness has recently attracted extensive study, revealing the vulnerability and intrinsic characteristics of deep networks.
In this work we investigate adversarial vulnerability, as well as defenses, under long-tailed distributions.
We propose a clean yet effective framework, RoBal, which consists of two dedicated modules: a scale-invariant classifier and data re-balancing.
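Taking "scale-invariant" to mean a cosine-style classifier head (an assumption based on the summary, not necessarily RoBal's exact implementation), a minimal sketch looks like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    """Classifier head whose logits depend only on the angle between features
    and class weights, making them invariant to feature/weight magnitudes."""
    def __init__(self, feat_dim: int, num_classes: int, scale: float = 16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.scale = scale  # fixed temperature applied to cosine similarities

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        f = F.normalize(features, dim=1)
        w = F.normalize(self.weight, dim=1)
        return self.scale * f @ w.t()

# Usage on placeholder features.
head = CosineClassifier(feat_dim=512, num_classes=10)
logits = head(torch.randn(4, 512))
```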
arXiv Detail & Related papers (2021-04-06T17:53:08Z)
- Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluated on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
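A hedged sketch of such a training objective: cross-entropy on adversarially perturbed inliers plus a term pushing adversarially perturbed outliers toward a uniform prediction, in the spirit of adversarial outlier exposure (the exact ALOE formulation may differ; the model and batches below are placeholders):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def adversarial_outlier_exposure_loss(model: nn.Module,
                                      x_in_adv: torch.Tensor, y_in: torch.Tensor,
                                      x_out_adv: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    """Cross-entropy on adversarial inliers + KL(uniform || p) on adversarial
    outliers, so the model stays confident on ID data and uncertain on OOD data."""
    ce = F.cross_entropy(model(x_in_adv), y_in)
    log_p_out = F.log_softmax(model(x_out_adv), dim=1)
    # KL to the uniform distribution reduces to -mean log-prob (up to a constant log K).
    uniform_kl = -log_p_out.mean()
    return ce + lam * uniform_kl

# Usage with placeholder adversarial batches (crafted e.g. with PGD).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x_in, y_in = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
x_out = torch.rand(8, 3, 32, 32)
loss = adversarial_outlier_exposure_loss(model, x_in, y_in, x_out)
loss.backward()
```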
arXiv Detail & Related papers (2020-03-21T17:46:28Z)