LoD: Loss-difference OOD Detection by Intentionally Label-Noisifying Unlabeled Wild Data
- URL: http://arxiv.org/abs/2505.12952v1
- Date: Mon, 19 May 2025 10:44:52 GMT
- Title: LoD: Loss-difference OOD Detection by Intentionally Label-Noisifying Unlabeled Wild Data
- Authors: Chuanxing Geng, Qifei Li, Xinrui Wang, Dong Liang, Songcan Chen, Pong C. Yuen
- Abstract summary: We propose a novel loss-difference OOD detection framework (LoD) by intentionally label-noisifying unlabeled wild data. Such operations not only enable labeled ID data and OOD data in unlabeled wild data to jointly dominate the models' learning, but also ensure the distinguishability of the losses between ID and OOD samples in unlabeled wild data.
- Score: 45.32174880762807
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Using unlabeled wild data containing both in-distribution (ID) and out-of-distribution (OOD) data to improve the safety and reliability of models has recently received increasing attention. Existing methods either design customized losses for labeled ID and unlabeled wild data and then perform joint optimization, or first filter OOD data out of the latter and then learn an OOD detector. While achieving varying degrees of success, two potential issues remain: (i) labeled ID data typically dominates the learning of models, inevitably making models tend to fit OOD data as ID; (ii) the selection of thresholds for identifying OOD data in unlabeled wild data usually faces a dilemma due to the unavailability of pure OOD samples. To address these issues, we propose a novel loss-difference OOD detection framework (LoD) that intentionally label-noisifies unlabeled wild data. Such operations not only enable labeled ID data and OOD data in unlabeled wild data to jointly dominate the models' learning, but also ensure the distinguishability of the losses between ID and OOD samples in unlabeled wild data, allowing a classic clustering technique (e.g., K-means) to filter these OOD samples without requiring thresholds. We also provide a theoretical foundation for LoD's viability, and extensive experiments verify its superiority.
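The threshold-free filtering step described in the abstract — clustering per-sample losses on the wild set into two groups instead of picking a cutoff — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-sample loss values are hypothetical, and a tiny 1-D two-means replaces a full K-means library call.

```python
import numpy as np

def two_means_1d(losses, iters=50):
    """Cluster 1-D per-sample losses into two groups (low-loss vs. high-loss)
    with a tiny K-means (k=2), avoiding any hand-tuned threshold."""
    losses = np.asarray(losses, dtype=float)
    # Initialize the two centers at the extremes of the loss range.
    centers = np.array([losses.min(), losses.max()])
    for _ in range(iters):
        # Assign each sample to its nearest center.
        assign = np.abs(losses[:, None] - centers[None, :]).argmin(axis=1)
        # Recompute each center as the mean of its assigned losses.
        for k in (0, 1):
            if np.any(assign == k):
                centers[k] = losses[assign == k].mean()
    return assign, centers

# Hypothetical scenario: under intentionally noisified labels, ID and OOD
# samples in the wild set incur systematically different losses, so the two
# clusters separate them with no threshold to tune.
wild_losses = np.array([0.10, 0.20, 0.15, 2.0, 2.3, 1.9])
assign, centers = two_means_1d(wild_losses)
```

Here `assign` partitions the wild set into two groups; the higher-loss cluster would be treated as the filtered OOD candidates in a pipeline like the one the abstract describes.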
Related papers
- How Does Unlabeled Data Provably Help Out-of-Distribution Detection? [63.41681272937562]
Leveraging unlabeled in-the-wild data is non-trivial due to the heterogeneity of both in-distribution (ID) and out-of-distribution (OOD) data.
This paper introduces a new learning framework SAL (Separate And Learn) that offers both strong theoretical guarantees and empirical effectiveness.
arXiv Detail & Related papers (2024-02-05T20:36:33Z)
- Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources [73.28967478098107]
Out-of-distribution (OOD) detection discerns OOD data, on which the predictor cannot make valid predictions, from in-distribution (ID) data.
It is typically hard to collect real OOD data for training a predictor capable of discerning OOD patterns.
We propose a data generation-based learning method named Auxiliary Task-based OOD Learning (ATOL) that can relieve the mistaken OOD generation.
arXiv Detail & Related papers (2023-11-06T16:26:52Z)
- GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection [67.90365841083951]
We develop a new graph contrastive learning framework GOOD-D for detecting OOD graphs without using any ground-truth labels.
GOOD-D is able to capture the latent ID patterns and accurately detect OOD graphs based on the semantic inconsistency in different granularities.
As a pioneering work in unsupervised graph-level OOD detection, we build a comprehensive benchmark to compare our proposed approach with different state-of-the-art methods.
arXiv Detail & Related papers (2022-11-08T12:41:58Z)
- Exploiting Mixed Unlabeled Data for Detecting Samples of Seen and Unseen Out-of-Distribution Classes [5.623232537411766]
Out-of-Distribution (OOD) detection is essential in real-world applications and has attracted increasing attention in recent years.
Most existing OOD detection methods require many labeled In-Distribution (ID) data, causing a heavy labeling cost.
In this paper, we focus on the more realistic scenario, where limited labeled data and abundant unlabeled data are available.
We propose the Adaptive In-Out-aware Learning (AIOL) method, in which we adaptively select potential ID and OOD samples from the mixed unlabeled data.
arXiv Detail & Related papers (2022-10-13T08:34:25Z)
- Augmenting Softmax Information for Selective Classification with Out-of-Distribution Data [7.221206118679026]
We show that existing post-hoc methods perform quite differently compared to when evaluated only on OOD detection.
We propose a novel method for SCOD, Softmax Information Retaining Combination (SIRC), that augments softmax-based confidence scores with feature-agnostic information.
Experiments on a wide variety of ImageNet-scale datasets and convolutional neural network architectures show that SIRC is able to consistently match or outperform the baseline for SCOD.
arXiv Detail & Related papers (2022-07-15T14:39:57Z)
- Supervision Adaptation Balancing In-distribution Generalization and Out-of-distribution Detection [36.66825830101456]
In-distribution (ID) and out-of-distribution (OOD) samples can lead to distributional vulnerability in deep neural networks.
We introduce a novel supervision adaptation approach to generate adaptive supervision information for OOD samples, making them more compatible with ID samples.
arXiv Detail & Related papers (2022-06-19T11:16:44Z)
- Training OOD Detectors in their Natural Habitats [31.565635192716712]
Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild.
Recent methods use auxiliary outlier data to regularize the model for improved OOD detection.
We propose a novel framework that leverages wild mixture data, which naturally consists of both ID and OOD samples.
arXiv Detail & Related papers (2022-02-07T15:38:39Z)
- Provably Robust Detection of Out-of-distribution Data (almost) for free [124.14121487542613]
Deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data.
In this paper, we propose a novel method that, from first principles, combines a certifiable OOD detector with a standard classifier into an OOD-aware classifier.
In this way we achieve the best of both worlds: certifiably adversarially robust OOD detection, even for OOD samples close to the in-distribution, without loss in prediction accuracy, and close-to-state-of-the-art OOD detection performance for non-manipulated OOD data.
arXiv Detail & Related papers (2021-06-08T11:40:49Z)
- They are Not Completely Useless: Towards Recycling Transferable Unlabeled Data for Class-Mismatched Semi-Supervised Learning [61.46572463531167]
Semi-Supervised Learning (SSL) with mismatched classes deals with the problem that the classes of interest in the limited labeled data are only a subset of the classes in the massive unlabeled data.
This paper proposes a "Transferable OOD data Recycling" (TOOR) method to enrich the information for conducting class-mismatched SSL.
arXiv Detail & Related papers (2020-11-27T02:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.