OpenOOD: Benchmarking Generalized Out-of-Distribution Detection
- URL: http://arxiv.org/abs/2210.07242v1
- Date: Thu, 13 Oct 2022 17:59:57 GMT
- Title: OpenOOD: Benchmarking Generalized Out-of-Distribution Detection
- Authors: Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding,
Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du,
Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu
- Abstract summary: Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications.
The field currently lacks a unified, strictly formulated, and comprehensive benchmark.
We build a unified, well-structured codebase called OpenOOD, which implements over 30 methods developed in relevant fields.
- Score: 60.13300701826931
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Out-of-distribution (OOD) detection is vital to safety-critical machine
learning applications and has thus been extensively studied, with a plethora of
methods developed in the literature. However, the field currently lacks a
unified, strictly formulated, and comprehensive benchmark, which often results
in unfair comparisons and inconclusive results. From the problem setting
perspective, OOD detection is closely related to neighboring fields including
anomaly detection (AD), open set recognition (OSR), and model uncertainty,
since methods developed for one domain are often applicable to each other. To
help the community to improve the evaluation and advance, we build a unified,
well-structured codebase called OpenOOD, which implements over 30 methods
developed in relevant fields and provides a comprehensive benchmark under the
recently proposed generalized OOD detection framework. With a comprehensive
comparison of these methods, we are gratified that the field has progressed
significantly over the past few years, where both preprocessing methods and the
orthogonal post-hoc methods show strong potential.
Related papers
- Margin-bounded Confidence Scores for Out-of-Distribution Detection [2.373572816573706]
We propose a novel method called Margin bounded Confidence Scores (MaCS) to address the nontrivial OOD detection problem.
MaCS enlarges the disparity between ID and OOD scores, which in turn makes the decision boundary more compact.
Experiments on various benchmark datasets for image classification tasks demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-09-22T05:40:25Z)
- Dissecting Out-of-Distribution Detection and Open-Set Recognition: A Critical Analysis of Methods and Benchmarks [17.520137576423593]
We aim to provide a consolidated view of the two largest sub-fields within the community: out-of-distribution (OOD) detection and open-set recognition (OSR).
We perform rigorous cross-evaluation between state-of-the-art methods in the OOD detection and OSR settings and identify a strong correlation between the performances of methods for them.
We propose a new, large-scale benchmark setting which we suggest better disentangles the problem tackled by OOD detection and OSR.
arXiv Detail & Related papers (2024-08-29T17:55:07Z)
- Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection [9.656342063882555]
We study five types of distribution shifts and evaluate the performance of recent OOD detection methods on each of them.
Our findings reveal that while these methods excel in detecting unknown classes, their performance is inconsistent when encountering other types of distribution shifts.
We present an ensemble approach that offers a more consistent and comprehensive solution for broad OOD detection.
arXiv Detail & Related papers (2023-08-22T14:52:44Z)
- Beyond AUROC & co. for evaluating out-of-distribution detection performance [50.88341818412508]
Given their relevance for safe(r) AI, it is important to examine whether the basis for comparing OOD detection methods is consistent with practical needs.
We propose a new metric - Area Under the Threshold Curve (AUTC), which explicitly penalizes poor separation between ID and OOD samples.
arXiv Detail & Related papers (2023-06-26T12:51:32Z)
- Plugin estimators for selective classification with out-of-distribution detection [67.28226919253214]
Real-world classifiers can benefit from abstaining from predicting on samples where they have low confidence.
These settings have been the subject of extensive but disjoint study in the selective classification (SC) and out-of-distribution (OOD) detection literature.
Recent work on selective classification with OOD detection has argued for the unified study of these problems.
We propose new plugin estimators for SCOD that are theoretically grounded, effective, and generalise existing approaches.
arXiv Detail & Related papers (2023-01-29T07:45:17Z)
- Pseudo-OOD training for robust language models [78.15712542481859]
OOD detection is a key component of a reliable machine-learning model for any industry-scale application.
We propose POORE - POsthoc pseudo-Ood REgularization, that generates pseudo-OOD samples using in-distribution (IND) data.
We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.
arXiv Detail & Related papers (2022-10-17T14:32:02Z)
- Generalized Out-of-Distribution Detection: A Survey [83.0449593806175]
Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems.
Several other problems, including anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD) are closely related to OOD detection.
We first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems.
arXiv Detail & Related papers (2021-10-21T17:59:41Z)
- Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluating on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.