Predictive Sample Assignment for Semantically Coherent Out-of-Distribution Detection
- URL: http://arxiv.org/abs/2512.12906v1
- Date: Mon, 15 Dec 2025 01:18:38 GMT
- Title: Predictive Sample Assignment for Semantically Coherent Out-of-Distribution Detection
- Authors: Zhimao Peng, Enguang Wang, Xialei Liu, Ming-Ming Cheng
- Abstract summary: Semantically coherent out-of-distribution detection (SCOOD) is a recently proposed realistic OOD detection setting. We propose a concise SCOOD framework based on predictive sample assignment (PSA). Our approach outperforms the state-of-the-art methods by a significant margin.
- Score: 62.1052001316508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantically coherent out-of-distribution detection (SCOOD) is a recently proposed realistic OOD detection setting: given labeled in-distribution (ID) data and mixed in-distribution and out-of-distribution unlabeled data as the training data, SCOOD aims to enable the trained model to accurately identify OOD samples in the testing data. Current SCOOD methods mainly adopt various clustering-based in-distribution sample filtering (IDF) strategies to select clean ID samples from unlabeled data, and take the remaining samples as auxiliary OOD data, which inevitably introduces a large number of noisy samples in training. To address the above issue, we propose a concise SCOOD framework based on predictive sample assignment (PSA). PSA includes a dual-threshold ternary sample assignment strategy based on the predictive energy score that can significantly improve the purity of the selected ID and OOD sample sets by assigning unconfident unlabeled data to an additional discard sample set, and a concept contrastive representation learning loss to further expand the distance between ID and OOD samples in the representation space to assist ID/OOD discrimination. In addition, we also introduce a retraining strategy to help the model fully fit the selected auxiliary ID/OOD samples. Experiments on two standard SCOOD benchmarks demonstrate that our approach outperforms the state-of-the-art methods by a significant margin.
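The dual-threshold ternary assignment described in the abstract can be sketched in a few lines. Below is an illustrative NumPy version, assuming the standard free-energy score E(x) = -T * logsumexp(f(x)/T) over the classifier logits; the threshold values `t_id` and `t_ood` and the temperature `T` are hypothetical placeholders, not values taken from the paper:

```python
import numpy as np

def energy_score(logits, T=1.0):
    # Free-energy score E(x) = -T * logsumexp(f(x)/T); lower energy
    # indicates a more ID-like sample. Computed with the max-subtraction
    # trick for numerical stability.
    z = logits / T
    m = z.max(axis=1, keepdims=True)
    return -T * (m.squeeze(1) + np.log(np.exp(z - m).sum(axis=1)))

def ternary_assignment(logits, t_id, t_ood, T=1.0):
    """Assign each unlabeled sample to 'id', 'ood', or 'discard'.

    t_id and t_ood (t_id < t_ood) are the two energy thresholds;
    samples whose energies fall between them are treated as
    unconfident and discarded rather than forced into either set.
    """
    e = energy_score(logits, T)
    labels = np.full(len(e), "discard", dtype=object)
    labels[e < t_id] = "id"     # confidently in-distribution
    labels[e > t_ood] = "ood"   # confidently out-of-distribution
    return labels
```

Discarding the unconfident middle band, rather than splitting all unlabeled data into two sets, is what the abstract credits for the improved purity of the selected auxiliary ID/OOD pools.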
Related papers
- ID-like Prompt Learning for Few-Shot Out-of-Distribution Detection [47.16254775587534]
We propose a novel OOD detection framework that discovers ID-like outliers using CLIP.
Benefiting from the powerful CLIP, we only need a small number of ID samples to learn the prompts of the model.
Our method achieves superior few-shot learning performance on various real-world image datasets.
arXiv Detail & Related papers (2023-11-26T09:06:40Z)
- Distilling the Unknown to Unveil Certainty [66.29929319664167]
Out-of-distribution (OOD) detection is critical for identifying test samples that deviate from in-distribution (ID) data, ensuring network robustness and reliability.
This paper presents a flexible framework for OOD knowledge distillation that extracts OOD-sensitive information from a network to develop a binary classifier capable of distinguishing between ID and OOD samples.
arXiv Detail & Related papers (2023-11-14T08:05:02Z)
- Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers [3.8839179829686126]
A rejection network can be trained with ID and diverse outlier samples to detect test OOD samples.
We propose a method called Pseudo Outlier Exposure (POE) that constructs a surrogate OOD dataset by sequentially masking tokens related to ID classes.
Our method does not require any external OOD data and can be easily implemented within off-the-shelf Transformers.
arXiv Detail & Related papers (2023-07-18T17:29:23Z)
- Exploration and Exploitation of Unlabeled Data for Open-Set Semi-Supervised Learning [130.56124475528475]
We address a complex but practical scenario in semi-supervised learning (SSL) named open-set SSL, where unlabeled data contain both in-distribution (ID) and out-of-distribution (OOD) samples.
Our proposed method achieves state-of-the-art in several challenging benchmarks, and improves upon existing SSL methods even when ID samples are totally absent in unlabeled data.
arXiv Detail & Related papers (2023-06-30T14:25:35Z)
- Supervision Adaptation Balancing In-distribution Generalization and Out-of-distribution Detection [36.66825830101456]
In-distribution (ID) and out-of-distribution (OOD) samples can lead to distributional vulnerability in deep neural networks.
We introduce a novel supervision adaptation approach to generate adaptive supervision information for OOD samples, making them more compatible with ID samples.
arXiv Detail & Related papers (2022-06-19T11:16:44Z)
- Understanding, Detecting, and Separating Out-of-Distribution Samples and Adversarial Samples in Text Classification [80.81532239566992]
We compare the two types of anomalies (OOD and Adv samples) with the in-distribution (ID) ones from three aspects.
We find that OOD samples expose their aberration starting from the first layer, while the abnormalities of Adv samples do not emerge until the deeper layers of the model.
We propose a simple method to separate ID, OOD, and Adv samples using the hidden representations and output probabilities of the model.
arXiv Detail & Related papers (2022-04-09T12:11:59Z)
- WOOD: Wasserstein-based Out-of-Distribution Detection [6.163329453024915]
Training data for deep-neural-network-based classifiers are usually assumed to be sampled from the same distribution.
When some test samples are drawn from a distribution far from that of the training samples, the trained neural network tends to make high-confidence predictions for these OOD samples.
We propose a Wasserstein-based out-of-distribution detection (WOOD) method to overcome these challenges.
arXiv Detail & Related papers (2021-12-13T02:35:15Z)
- Label Smoothed Embedding Hypothesis for Out-of-Distribution Detection [72.35532598131176]
We propose an unsupervised method to detect OOD samples using a $k$-NN density estimate.
We leverage a recent insight about label smoothing, which we call the Label Smoothed Embedding Hypothesis.
We show that our proposal outperforms many OOD baselines and also provide new finite-sample high-probability statistical results.
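As a rough illustration of the k-NN density idea mentioned above, the score below is the distance from a test embedding to its k-th nearest training embedding (larger distance means lower estimated density, hence more likely OOD); the Euclidean metric and the choice of k are assumptions for this sketch, not the paper's exact construction:

```python
import numpy as np

def knn_ood_score(train_feats, test_feats, k=5):
    """OOD score as distance to the k-th nearest training embedding.

    A simple k-NN density proxy: points far from the training
    manifold receive high scores.
    """
    # Pairwise Euclidean distances, shape (n_test, n_train)
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=2)
    # Distance to the k-th nearest neighbor (index k-1 after sorting)
    return np.sort(d, axis=1)[:, k - 1]  # higher => more likely OOD
```

Thresholding this score (or ranking by it) then yields the ID/OOD decision; the broadcasting-based distance matrix is fine for small sets, while a k-d tree or approximate nearest-neighbor index would be the usual choice at scale.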
arXiv Detail & Related papers (2021-02-09T21:04:44Z)
- Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning [54.85397562961903]
Semi-supervised learning (SSL) has been proposed to leverage unlabeled data for training powerful models when only limited labeled data is available.
We address a more complex novel scenario named open-set SSL, where out-of-distribution (OOD) samples are contained in unlabeled data.
Our method achieves state-of-the-art results by successfully eliminating the effect of OOD samples.
arXiv Detail & Related papers (2020-07-22T10:33:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.