Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox
- URL: http://arxiv.org/abs/2406.09867v1
- Date: Fri, 14 Jun 2024 09:27:56 GMT
- Title: Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox
- Authors: Xingming Long, Jie Zhang, Shiguang Shan, Xilin Chen
- Abstract summary: Most existing out-of-distribution (OOD) detection benchmarks classify samples with novel labels as the OOD data.
Some marginal OOD samples actually have close semantic contents to the in-distribution (ID) sample, which makes determining the OOD sample a Sorites Paradox.
We construct a benchmark named Incremental Shift OOD (IS-OOD) to address the issue.
- Score: 70.57120710151105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing out-of-distribution (OOD) detection benchmarks classify samples with novel labels as the OOD data. However, some marginal OOD samples actually have semantic content close to that of the in-distribution (ID) samples, which makes determining the OOD sample a Sorites Paradox. In this paper, we construct a benchmark named Incremental Shift OOD (IS-OOD) to address the issue, in which we divide the test samples into subsets with different degrees of semantic and covariate shift relative to the ID dataset. The data division is achieved through a shift measuring method based on our proposed Language Aligned Image feature Decomposition (LAID). Moreover, we construct a Synthetic Incremental Shift (Syn-IS) dataset that contains high-quality generated images with more diverse covariate contents to complement the IS-OOD benchmark. We evaluate current OOD detection methods on our benchmark and report several findings: (1) The performance of most OOD detection methods improves significantly as the semantic shift increases; (2) Some methods like GradNorm may have different OOD detection mechanisms, as they rely less on semantic shifts to make decisions; (3) Images with excessive covariate shifts are also likely to be treated as OOD by some methods. Our code and data are released at https://github.com/qqwsad5/IS-OOD.
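The abstract describes scoring detectors separately on test subsets with increasing shift, instead of on a single ID-versus-OOD split. The sketch below illustrates that style of evaluation only; it is not the authors' code and does not implement LAID. The function names, the integer shift levels, and the toy scores are all hypothetical, and it assumes a higher detector score means "more likely OOD".

```python
from typing import Dict, List, Sequence


def auroc(id_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """Probability that a random OOD sample outscores a random ID sample.

    Ties count as one half. Assumes a higher score means "more OOD".
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for ood in ood_scores:
        for ind in id_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def auroc_by_shift_level(
    id_scores: Sequence[float],
    scores_by_level: Dict[int, List[float]],
) -> Dict[int, float]:
    """Compute one AUROC per shift-level subset, each against the same ID set."""
    return {
        level: auroc(id_scores, scores)
        for level, scores in sorted(scores_by_level.items())
    }


# Hypothetical detector scores: one ID set and three test subsets,
# ordered from smallest (0) to largest (2) assumed semantic shift.
id_scores = [0.1, 0.2, 0.3, 0.4]
subsets = {
    0: [0.2, 0.3, 0.35, 0.5],
    1: [0.3, 0.45, 0.5, 0.6],
    2: [0.5, 0.7, 0.8, 0.9],
}

results = auroc_by_shift_level(id_scores, subsets)
for level, value in results.items():
    print(f"shift level {level}: AUROC = {value:.4f}")
```

Reporting a curve of AUROC against shift level, rather than one number, is what would let a reader see whether a method's performance depends on the amount of semantic shift.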
Related papers
- Out-of-Distribution Knowledge Distillation via Confidence Amendment [50.56321442948141]
Out-of-distribution (OOD) detection is essential in identifying test samples that deviate from the in-distribution (ID) data upon which a standard network is trained.
This paper introduces OOD knowledge distillation, a pioneering learning framework applicable whether or not training ID data is available.
This framework harnesses OOD-sensitive knowledge from the standard network to craft a binary classifier adept at distinguishing between ID and OOD samples.
arXiv Detail & Related papers (2023-11-14T08:05:02Z)
- General-Purpose Multi-Modal OOD Detection Framework [5.287829685181842]
Out-of-distribution (OOD) detection identifies test samples that differ from the training data, which is critical to ensuring the safety and reliability of machine learning (ML) systems.
We propose a general-purpose weakly-supervised OOD detection framework, called WOOD, that combines a binary classifier and a contrastive learning component.
We evaluate the proposed WOOD model on multiple real-world datasets, and the experimental results demonstrate that the WOOD model outperforms the state-of-the-art methods for multi-modal OOD detection.
arXiv Detail & Related papers (2023-07-24T18:50:49Z)
- Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers [3.8839179829686126]
A rejection network can be trained with ID and diverse outlier samples to detect test OOD samples.
We propose a method called Pseudo Outlier Exposure (POE) that constructs a surrogate OOD dataset by sequentially masking tokens related to ID classes.
Our method does not require any external OOD data and can be easily implemented within off-the-shelf Transformers.
arXiv Detail & Related papers (2023-07-18T17:29:23Z)
- Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning [17.409939628100517]
We propose a unified framework termed OOD Semantic Pruning (OSP), which aims at pruning OOD semantics out from in-distribution (ID) features.
OSP surpasses the previous state-of-the-art by 13.7% on accuracy for ID classification and 5.9% on AUROC for OOD detection on TinyImageNet dataset.
arXiv Detail & Related papers (2023-05-29T15:37:07Z)
- Background Matters: Enhancing Out-of-distribution Detection with Domain Features [90.32910087103744]
OOD samples can be drawn from arbitrary distributions and exhibit deviations from in-distribution (ID) data in various dimensions.
Existing methods focus on detecting OOD samples based on the semantic features, while neglecting the other dimensions such as the domain features.
This paper proposes a novel generic framework that can learn the domain features from the ID training samples by a dense prediction approach.
arXiv Detail & Related papers (2023-03-15T16:12:14Z)
- Unsupervised Evaluation of Out-of-distribution Detection: A Data-centric Perspective [55.45202687256175]
Out-of-distribution (OOD) detection methods assume that they have test ground truths, i.e., whether individual test samples are in-distribution (IND) or OOD.
In this paper, we are the first to introduce the unsupervised evaluation problem in OOD detection.
We propose three methods to compute Gscore as an unsupervised indicator of OOD detection performance.
arXiv Detail & Related papers (2023-02-16T13:34:35Z)
- Estimating Soft Labels for Out-of-Domain Intent Detection [122.68266151023676]
Out-of-Domain (OOD) intent detection is important for practical dialog systems.
We propose an adaptive soft pseudo labeling (ASoul) method that can estimate soft labels for pseudo OOD samples.
arXiv Detail & Related papers (2022-11-10T13:31:13Z)
- Full-Spectrum Out-of-Distribution Detection [42.98617540431124]
We take into account both shift types and introduce full-spectrum OOD (FS-OOD) detection.
We propose SEM, a simple feature-based semantics score function.
SEM significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2022-04-11T17:59:14Z)
- Exploring Covariate and Concept Shift for Detection and Calibration of Out-of-Distribution Data [77.27338842609153]
Characterization reveals that sensitivity to each type of shift is important to the detection and confidence calibration of OOD data.
We propose a geometrically-inspired method to improve OOD detection under both shifts with only in-distribution data.
We are the first to propose a method that works well across both OOD detection and calibration and under different types of shifts.
arXiv Detail & Related papers (2021-10-28T15:42:55Z)
- Semantically Coherent Out-of-Distribution Detection [26.224146828317277]
Current out-of-distribution (OOD) detection benchmarks are commonly built by defining one dataset as in-distribution (ID) and all others as OOD.
We re-design the benchmarks and propose semantically coherent out-of-distribution detection (SC-OOD).
Our approach achieves state-of-the-art performance on SC-OOD benchmarks.
arXiv Detail & Related papers (2021-08-26T17:53:32Z)