In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation
- URL: http://arxiv.org/abs/2306.00826v1
- Date: Thu, 1 Jun 2023 15:48:10 GMT
- Title: In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation
- Authors: Julian Bitterwolf, Maximilian Müller, Matthias Hein
- Abstract summary: Out-of-distribution (OOD) detection is the problem of identifying inputs unrelated to the in-distribution task.
Most of the currently used test OOD datasets, including datasets from the open set recognition (OSR) literature, have severe issues.
We introduce NINCO, a novel test OOD dataset in which each sample has been checked to be free of ID objects, allowing a detailed analysis of an OOD detector's strengths and failure modes.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Out-of-distribution (OOD) detection is the problem of identifying inputs
which are unrelated to the in-distribution task. The OOD detection performance
when the in-distribution (ID) is ImageNet-1K is commonly tested on a
small range of test OOD datasets. We find that most of the currently used test
OOD datasets, including datasets from the open set recognition (OSR)
literature, have severe issues: in some cases more than 50% of the dataset
contains objects belonging to one of the ID classes. These erroneous samples
heavily distort the evaluation of OOD detectors. As a solution, we introduce
NINCO, a novel test OOD dataset in which each sample has been checked to be
free of ID objects. Its fine-grained range of OOD classes allows for a detailed
analysis of an OOD detector's strengths and failure modes, particularly when
paired with a number of synthetic "OOD unit-tests". We provide detailed evaluations across a
large set of architectures and OOD detection methods on NINCO and the
unit-tests, revealing new insights about model weaknesses and the effects of
pretraining on OOD detection performance. We provide code and data at
https://github.com/j-cb/NINCO.
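The distortion the abstract describes can be illustrated with a small sketch: an OOD detector assigns each input a confidence score (here a made-up maximum-softmax-probability-style score), and performance is summarized by AUROC over ID vs. test-OOD samples. The score values and the rank-based `auroc` helper below are hypothetical illustrations, not code from the NINCO repository; they only show how mislabeled ID samples inside an OOD test set drag the measured AUROC down.

```python
def auroc(id_scores, ood_scores):
    """AUROC for "higher score = more in-distribution".

    Computed via the Mann-Whitney U statistic: the probability that a
    randomly chosen ID sample scores higher than a randomly chosen OOD
    sample (ties count as half a win).
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical confidence scores for four ID test samples.
id_scores = [0.95, 0.90, 0.85, 0.80]

# A clean OOD test set vs. one where two "OOD" samples actually
# contain ID objects (and thus receive high confidence).
clean_ood = [0.30, 0.25, 0.20, 0.15]
contaminated_ood = [0.30, 0.25, 0.92, 0.88]

print(auroc(id_scores, clean_ood))         # 1.0: detector looks perfect
print(auroc(id_scores, contaminated_ood))  # 0.6875: apparent failures are mislabeled ID
```

The detector's behavior is identical in both cases; only the labeling of the test set changes, which is exactly why the paper argues for a test OOD dataset verified to be ID-free.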
Related papers
- Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox (arXiv, 2024-06-14)
  Most existing out-of-distribution (OOD) detection benchmarks classify samples with novel labels as the OOD data. Some marginal OOD samples are semantically close to in-distribution (ID) samples, which makes deciding whether a sample is OOD a Sorites paradox. The authors construct a benchmark named Incremental Shift OOD (IS-OOD) to address the issue.
- Negative Label Guided OOD Detection with Pretrained Vision-Language Models (arXiv, 2024-03-29)
  OOD detection aims at identifying samples from unknown classes. The authors propose a novel post hoc OOD detection method, called NegLabel, which takes a vast number of negative labels from extensive corpus databases.
- SR-OOD: Out-of-Distribution Detection via Sample Repairing (arXiv, 2023-05-26)
  OOD detection is a crucial task for ensuring the reliability and robustness of machine learning models. Recent works have shown that generative models often assign high confidence scores to OOD samples, indicating that they fail to capture the semantic information of the data. The authors leverage sample repairing to propose a novel OOD detection framework, SR-OOD, which outperforms state-of-the-art generative methods in OOD detection.
- Unsupervised Evaluation of Out-of-distribution Detection: A Data-centric Perspective (arXiv, 2023-02-16)
  OOD detection methods assume access to test ground truths, i.e., whether individual test samples are in-distribution (ID) or OOD. This paper is the first to introduce the unsupervised evaluation problem in OOD detection, and proposes three methods to compute Gscore as an unsupervised indicator of OOD detection performance.
- On the Usefulness of Deep Ensemble Diversity for Out-of-Distribution Detection (arXiv, 2022-07-15)
  The ability to detect OOD data is important in safety-critical applications of deep learning. An existing intuition in the literature is that the diversity of deep-ensemble predictions indicates distributional shift. The authors show experimentally that this intuition does not hold for ImageNet-scale OOD detection.
- Augmenting Softmax Information for Selective Classification with Out-of-Distribution Data (arXiv, 2022-07-15)
  The authors show that existing post-hoc methods perform quite differently on selective classification with OOD data (SCOD) than when evaluated only on OOD detection. They propose a novel method for SCOD, Softmax Information Retaining Combination (SIRC), which augments softmax-based confidence scores with feature-agnostic information. Experiments on a wide variety of ImageNet-scale datasets and convolutional neural network architectures show that SIRC consistently matches or outperforms the baseline for SCOD.
- PnPOOD: Out-Of-Distribution Detection for Text Classification via Plug and Play Data Augmentation (arXiv, 2021-10-31)
  The authors present PnPOOD, a data augmentation technique that performs OOD detection via out-of-domain sample generation. The method generates high-quality discriminative samples close to the class boundaries, resulting in accurate OOD detection at test time. They also highlight an important data leakage issue with datasets used in prior attempts at OOD detection, and share results on a new OOD detection dataset that does not suffer from the same problem.
- Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection (arXiv, 2020-12-10)
  The authors show that current OOD detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios. The paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data, and proposes a novel method that uses an artificial labeling scheme for the test data, plus regularization, to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.