Your Out-of-Distribution Detection Method is Not Robust!
- URL: http://arxiv.org/abs/2209.15246v1
- Date: Fri, 30 Sep 2022 05:49:00 GMT
- Title: Your Out-of-Distribution Detection Method is Not Robust!
- Authors: Mohammad Azizmalayeri, Arshia Soltani Moakhar, Arman Zarei, Reihaneh
Zohrabi, Mohammad Taghi Manzuri, Mohammad Hossein Rohban
- Abstract summary: Out-of-distribution (OOD) detection has recently gained substantial attention due to the importance of identifying out-of-domain samples for reliability and safety.
To mitigate this issue, several defenses have recently been proposed.
We re-examine these defenses against an end-to-end PGD attack on in/out data with larger perturbation sizes.
- Score: 0.4893345190925178
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Out-of-distribution (OOD) detection has recently gained substantial attention
due to the importance of identifying out-of-domain samples for reliability and
safety. Although OOD detection methods have advanced a great deal, they are
still susceptible to adversarial examples, which is a violation of their
purpose. To mitigate this issue, several defenses have recently been proposed.
Nevertheless, these efforts have remained ineffective, as their evaluations
are based on either small perturbation sizes or weak attacks. In this work, we
re-examine these defenses against an end-to-end PGD attack on in/out data with
larger perturbation sizes, e.g., up to the commonly used $\epsilon=8/255$ for
the CIFAR-10 dataset. Surprisingly, almost all of these defenses perform worse
than random detection under the adversarial setting. Next, we aim to provide a
robust OOD detection method. In an ideal defense, the training should expose
the model to almost all possible adversarial perturbations, which can be
achieved through adversarial training. That is, such training perturbations
should be based on both in- and out-of-distribution samples. Therefore, unlike
OOD detection in the standard setting, access to OOD samples, as well as
in-distribution samples, is necessary in the adversarial training setup. These
observations lead us to adopt generative OOD detection methods, such as
OpenGAN, as a baseline. We
subsequently propose the Adversarially Trained Discriminator (ATD), which
utilizes a pre-trained robust model to extract robust features, and a generator
model to create OOD samples. Using ATD with CIFAR-10 and CIFAR-100 as the
in-distribution data, we could significantly outperform all previous methods in
the robust AUROC while maintaining high standard AUROC and classification
accuracy. The code repository is available at https://github.com/rohban-lab/ATD .
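To make the evaluation protocol above concrete, the following is a minimal PyTorch sketch of an end-to-end PGD attack on a differentiable OOD score and of the resulting robust AUROC. It is illustrative only, not the authors' released implementation (see the repository linked above); the `detector` callable, the function names, and the step size and step count are assumptions, while $\epsilon=8/255$ follows the abstract.

```python
# Minimal sketch (assumption: `detector` maps a batch of images in [0, 1] to a
# differentiable per-sample in-distribution score; higher = more in-distribution).
import torch
from sklearn.metrics import roc_auc_score


def pgd_attack_ood_score(detector, x, is_in_dist, eps=8 / 255, alpha=2 / 255, steps=100):
    """End-to-end PGD on the detection score within an L-inf ball of radius eps:
    in-distribution inputs are pushed toward low scores (to look OOD) and
    out-of-distribution inputs toward high scores (to look in-distribution)."""
    ones = torch.ones(x.shape[0], device=x.device)
    sign = torch.where(is_in_dist.to(x.device), -ones, ones)   # fooling direction
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = (sign * detector(x_adv)).sum()
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()            # signed gradient step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)  # project
    return x_adv.detach()


def robust_auroc(detector, x_in, x_out, **pgd_kwargs):
    """Robust AUROC: attack both the in-distribution and the OOD batch, then score."""
    adv_in = pgd_attack_ood_score(
        detector, x_in, torch.ones(x_in.shape[0], dtype=torch.bool), **pgd_kwargs)
    adv_out = pgd_attack_ood_score(
        detector, x_out, torch.zeros(x_out.shape[0], dtype=torch.bool), **pgd_kwargs)
    with torch.no_grad():
        s_in, s_out = detector(adv_in).cpu(), detector(adv_out).cpu()
    labels = torch.cat([torch.ones_like(s_in), torch.zeros_like(s_out)])
    return roc_auc_score(labels.numpy(), torch.cat([s_in, s_out]).numpy())
```

Under this protocol a random detector scores an AUROC of 0.5; the abstract's claim is that almost all prior defenses fall below that level once both in- and out-of-distribution inputs are attacked.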
Related papers
- OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift [20.14559162084261]
OODRobustBench is used to assess 706 robust models using 60.7K adversarial evaluations.
This large-scale analysis shows that adversarial robustness suffers from a severe OOD generalization issue.
We then predict and verify that existing methods are unlikely to achieve high OOD robustness.
arXiv Detail & Related papers (2023-10-19T14:50:46Z)
- Diffusion Denoised Smoothing for Certified and Adversarial Robust Out-Of-Distribution Detection [6.247268652296234]
We present a novel approach for certifying the robustness of OOD detection within an $\ell_2$-norm ball around the input.
We improve current techniques for detecting adversarial attacks on OOD samples, while providing high levels of certified and adversarial robustness on in-distribution samples.
arXiv Detail & Related papers (2023-03-27T07:52:58Z)
- Out-of-Distribution Detection with Hilbert-Schmidt Independence Optimization [114.43504951058796]
Outlier detection tasks have been playing a critical role in AI safety.
Deep neural network classifiers tend to incorrectly classify out-of-distribution (OOD) inputs into in-distribution classes with high confidence.
We propose an alternative probabilistic paradigm that is both practically useful and theoretically viable for the OOD detection tasks.
arXiv Detail & Related papers (2022-09-26T15:59:55Z)
- Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition [80.07843757970923]
We show that existing OOD detection methods suffer from significant performance degradation when the training set is long-tail distributed.
We propose Partial and Asymmetric Supervised Contrastive Learning (PASCL), which explicitly encourages the model to distinguish between tail-class in-distribution samples and OOD samples.
Our method outperforms the previous state-of-the-art method by $1.29\%$, $1.45\%$, and $0.69\%$ in anomaly detection false positive rate (FPR) and by $3.24\%$, $4.06\%$, and $7.89\%$ in in-distribution classification accuracy.
arXiv Detail & Related papers (2022-07-04T01:53:07Z)
- Out-of-distribution Detection with Deep Nearest Neighbors [33.71627349163909]
Out-of-distribution (OOD) detection is a critical task for deploying machine learning models in the open world.
In this paper, we explore the efficacy of non-parametric nearest-neighbor distance for OOD detection.
We demonstrate the effectiveness of nearest-neighbor-based OOD detection on several benchmarks and establish superior performance (a generic sketch of such a nearest-neighbor score is given after this list).
arXiv Detail & Related papers (2022-04-13T16:45:21Z)
- Provably Robust Detection of Out-of-distribution Data (almost) for free [124.14121487542613]
Deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data.
In this paper, we propose a novel method in which, from first principles, we combine a certifiable OOD detector with a standard classifier into an OOD-aware classifier.
In this way, we achieve the best of both worlds: certifiably adversarially robust OOD detection, even for OOD samples close to the in-distribution, without loss of prediction accuracy, and close to state-of-the-art OOD detection performance for non-manipulated OOD data.
arXiv Detail & Related papers (2021-06-08T11:40:49Z)
- Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluated on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
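For the "Out-of-distribution Detection with Deep Nearest Neighbors" entry above, the following is a generic sketch of the kind of non-parametric nearest-neighbor-distance OOD score that paper explores. It is not the authors' code; the `feature_extractor`, the L2 normalization, and the choice of `k` are illustrative assumptions.

```python
# Generic nearest-neighbor-distance OOD score (illustrative sketch, not the paper's code).
import torch
import torch.nn.functional as F


@torch.no_grad()
def knn_ood_score(feature_extractor, x, train_features, k=50):
    """Score = negated distance to the k-th nearest training feature, so that a
    higher score means more in-distribution (matching the AUROC convention above).
    `train_features` is an (N, d) bank of L2-normalized training-set features."""
    feats = F.normalize(feature_extractor(x), dim=1)           # (B, d) test features
    dists = torch.cdist(feats, train_features)                 # (B, N) pairwise L2
    kth = dists.topk(k, dim=1, largest=False).values[:, -1]    # k-th smallest distance
    return -kth
```

Thresholding this score (e.g., at the value that gives a chosen true positive rate on held-out in-distribution data) turns it into a detector; as the main paper argues, such a standard-setting detector says nothing about robustness unless it is also evaluated under attacks like the PGD sketch above.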
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.