Adversarial vulnerability of powerful near out-of-distribution detection
- URL: http://arxiv.org/abs/2201.07012v1
- Date: Tue, 18 Jan 2022 14:23:07 GMT
- Title: Adversarial vulnerability of powerful near out-of-distribution detection
- Authors: Stanislav Fort
- Abstract summary: We show a severe adversarial vulnerability of even the strongest current OOD detection techniques.
With a small, targeted perturbation to the input pixels, we can easily change an image's assignment from in-distribution to out-of-distribution, and vice versa.
- Score: 8.446798721296906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been significant progress in detecting out-of-distribution (OOD)
inputs in neural networks recently, primarily due to the use of large models
pretrained on large datasets, and an emerging use of multi-modality. We show a
severe adversarial vulnerability of even the strongest current OOD detection
techniques. With a small, targeted perturbation to the input pixels, we can
easily change the image assignment from in-distribution to out-of-distribution,
and vice versa. In particular, we demonstrate severe adversarial
vulnerability on the challenging near OOD CIFAR-100 vs CIFAR-10 task, as well
as on the far OOD CIFAR-100 vs SVHN task. We study the adversarial robustness of
several post-processing techniques, including the simple baseline of Maximum of
Softmax Probabilities (MSP), the Mahalanobis distance, and the newly proposed
Relative Mahalanobis distance. By comparing the loss of OOD detection
performance at various perturbation strengths, we demonstrate the beneficial
effect of using ensembles of OOD detectors, and the use of the
Relative Mahalanobis distance over other post-processing methods. In
addition, we show that even strong zero-shot OOD detection using CLIP and
multi-modality suffers from a severe lack of adversarial robustness as well.
Our code is available at
https://github.com/stanislavfort/adversaries_to_OOD_detection
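
To make the attack concrete, the sketch below shows the core idea on the simple MSP baseline: run a PGD-style optimization directly on the detector's own score, so that a small L-infinity perturbation flips an input's in-/out-of-distribution assignment. This is a minimal illustrative sketch, not the authors' released code (which is at the repository above); the `model`, `eps`, and step schedule are assumptions.

```python
# Hedged sketch: PGD on the MSP (Maximum of Softmax Probabilities) detector
# score. `model` is assumed to be any pretrained classifier returning logits
# for images in [0, 1]. Hyperparameters are illustrative, not the paper's.
import torch
import torch.nn.functional as F

def msp_score(model, x):
    """In-distribution score: the maximum softmax probability."""
    return F.softmax(model(x), dim=-1).max(dim=-1).values

def attack_ood_score(model, x, eps=2/255, step=0.5/255, n_steps=40,
                     make_in_distribution=True):
    """L-infinity PGD that raises (or lowers) the detector score, so an
    OOD image is scored as in-distribution (or vice versa)."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        score = msp_score(model, x_adv).sum()
        (grad,) = torch.autograd.grad(score, x_adv)
        with torch.no_grad():
            sign = grad.sign() if make_in_distribution else -grad.sign()
            x_adv = x_adv + step * sign
            x_adv = x_orig + (x_adv - x_orig).clamp(-eps, eps)  # project to eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)                       # keep a valid image
    return x_adv.detach()
```

The Relative Mahalanobis distance compared above modifies the standard Mahalanobis detector by subtracting a "background" distance, computed under a single Gaussian fit to all training features, from the class-conditional distance. A rough sketch under assumed feature shapes (not the paper's exact implementation):

```python
# Hedged sketch of the Relative Mahalanobis distance detector on penultimate
# features `feats` (N, D) with integer class `labels` (N,). The ridge term
# and function names are assumptions for numerical stability and clarity.
import torch

def fit_relative_mahalanobis(feats, labels, n_classes, ridge=1e-6):
    eye = ridge * torch.eye(feats.shape[1])
    mus = torch.stack([feats[labels == k].mean(0) for k in range(n_classes)])
    centered = feats - mus[labels]                  # shared class-conditional covariance
    prec = torch.linalg.inv(centered.T @ centered / len(feats) + eye)
    centered_bg = feats - feats.mean(0)             # single background Gaussian
    prec_bg = torch.linalg.inv(centered_bg.T @ centered_bg / len(feats) + eye)
    return mus, prec, feats.mean(0), prec_bg

def relative_mahalanobis_score(z, mus, prec, mu_bg, prec_bg):
    """Higher score = more in-distribution, for one feature vector z (D,)."""
    d_k = torch.stack([(z - mu) @ prec @ (z - mu) for mu in mus])  # per-class MD
    d_bg = (z - mu_bg) @ prec_bg @ (z - mu_bg)                     # background MD
    return -(d_k - d_bg).min()
```

The same PGD loop can be pointed at a Mahalanobis-style score, or at an ensemble of scores, instead of the MSP score; comparing how quickly each score degrades with perturbation strength is how the robustness comparison in the abstract is framed.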
Related papers
- The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection [75.65876949930258]
Out-of-distribution (OOD) detection is essential for model trustworthiness.
We show that the superior OOD detection performance of state-of-the-art methods is achieved by secretly sacrificing the OOD generalization ability.
arXiv Detail & Related papers (2024-10-12T07:02:04Z)
- How to Overcome Curse-of-Dimensionality for Out-of-Distribution Detection? [29.668859994222238]
We propose a novel framework, Subspace Nearest Neighbor (SNN), for OOD detection.
In training, our method regularizes the model and its feature representation by leveraging the most relevant subset of dimensions.
Compared to the current best distance-based method, SNN reduces the average FPR95 by 15.96% on the CIFAR-100 benchmark.
arXiv Detail & Related papers (2023-12-22T06:04:09Z)
- Your Out-of-Distribution Detection Method is Not Robust! [0.4893345190925178]
Out-of-distribution (OOD) detection has recently gained substantial attention due to the importance of identifying out-of-domain samples for reliability and safety.
To mitigate the adversarial vulnerability of OOD detectors, several defenses have recently been proposed.
We re-examine these defenses against an end-to-end PGD attack on in/out data with larger perturbation sizes.
arXiv Detail & Related papers (2022-09-30T05:49:00Z)
- A Simple Test-Time Method for Out-of-Distribution Detection [45.11199798139358]
This paper proposes a simple Test-time Linear Training (ETLT) method for OOD detection.
We find that the probability of an input image being out-of-distribution is surprisingly linearly correlated with the features extracted by neural networks.
We propose an online variant of the proposed method, which achieves promising performance and is more practical in real-world applications.
arXiv Detail & Related papers (2022-07-17T16:02:58Z)
- Out-of-distribution Detection with Deep Nearest Neighbors [33.71627349163909]
Out-of-distribution (OOD) detection is a critical task for deploying machine learning models in the open world.
In this paper, we explore the efficacy of non-parametric nearest-neighbor distance for OOD detection; a minimal sketch of this kind of detector appears after this list.
We demonstrate the effectiveness of nearest-neighbor-based OOD detection on several benchmarks and establish superior performance.
arXiv Detail & Related papers (2022-04-13T16:45:21Z)
- Provably Robust Detection of Out-of-distribution Data (almost) for free [124.14121487542613]
Deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data.
In this paper, we propose a novel method that, from first principles, combines a certifiable OOD detector with a standard classifier into an OOD-aware classifier.
In this way we achieve the best of two worlds: certifiably adversarially robust OOD detection, even for OOD samples close to the in-distribution, without loss in prediction accuracy, and with close to state-of-the-art OOD detection performance on non-manipulated OOD data.
arXiv Detail & Related papers (2021-06-08T11:40:49Z)
- Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
- Probing Predictions on OOD Images via Nearest Categories [97.055916832257]
We study out-of-distribution (OOD) prediction behavior of neural networks when they classify images from unseen classes or corrupted images.
We introduce a new measure, nearest category generalization (NCG), where we compute the fraction of OOD inputs that are classified with the same label as their nearest neighbor in the training set.
We find that robust networks have consistently higher NCG accuracy than naturally trained networks, even when the OOD data is much farther away than the robustness radius.
arXiv Detail & Related papers (2020-11-17T07:42:27Z)
- ATOM: Robustifying Out-of-distribution Detection Using Outlier Mining [51.19164318924997]
Adversarial Training with informative Outlier Mining (ATOM) improves the robustness of OOD detection.
ATOM achieves state-of-the-art performance under a broad family of classic and adversarial OOD evaluation tasks.
arXiv Detail & Related papers (2020-06-26T20:58:05Z)
- Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluated on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
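
As referenced in the deep-nearest-neighbors entry above, here is a minimal sketch of a non-parametric k-nearest-neighbor OOD score; the L2 normalization and the choice of k are illustrative assumptions, not the cited paper's exact recipe.

```python
# Hedged sketch of a k-nearest-neighbor OOD score: the distance from a test
# feature to its k-th nearest (L2-normalized) training feature. A larger
# distance means the input is more likely out-of-distribution.
import torch
import torch.nn.functional as F

def knn_ood_score(train_feats, test_feats, k=50):
    bank = F.normalize(train_feats, dim=-1)     # (N, D) training feature bank
    queries = F.normalize(test_feats, dim=-1)   # (M, D) test features
    dists = torch.cdist(queries, bank)          # (M, N) pairwise L2 distances
    kth = dists.topk(k, dim=-1, largest=False).values[:, -1]
    return kth  # threshold this value to flag OOD inputs
```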
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.