Recent Advances in Out-of-Distribution Detection with CLIP-Like Models: A Survey
- URL: http://arxiv.org/abs/2505.02448v1
- Date: Mon, 05 May 2025 08:22:38 GMT
- Title: Recent Advances in Out-of-Distribution Detection with CLIP-Like Models: A Survey
- Authors: Chaohua Li, Enhao Zhang, Chuanxing Geng, Songcan Chen,
- Abstract summary: Out-of-distribution detection (OOD) is a pivotal task for real-world applications that trains models to identify samples that are distributionally different from the in-distribution (ID) data during testing.<n>Recent advances in AI, particularly Vision-Language Models (VLMs) like CLIP, have revolutionized OOD detection by shifting from traditional unimodal image detectors to multimodal image-text detectors.<n>To better align with CLIP's cross-modal nature, we propose a new categorization framework rooted in both image and text modalities.
- Score: 27.467732819969935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Out-of-distribution detection (OOD) is a pivotal task for real-world applications that trains models to identify samples that are distributionally different from the in-distribution (ID) data during testing. Recent advances in AI, particularly Vision-Language Models (VLMs) like CLIP, have revolutionized OOD detection by shifting from traditional unimodal image detectors to multimodal image-text detectors. This shift has inspired extensive research; however, existing categorization schemes (e.g., few- or zero-shot types) still rely solely on the availability of ID images, adhering to a unimodal paradigm. To better align with CLIP's cross-modal nature, we propose a new categorization framework rooted in both image and text modalities. Specifically, we categorize existing methods based on how visual and textual information of OOD data is utilized within image + text modalities, and further divide them into four groups: OOD Images (i.e., outliers) Seen or Unseen, and OOD Texts (i.e., learnable vectors or class names) Known or Unknown, across two training strategies (i.e., train-free or training-required). More importantly, we discuss open problems in CLIP-like OOD detection and highlight promising directions for future research, including cross-domain integration, practical applications, and theoretical understanding.
Related papers
- Mind the Way You Select Negative Texts: Pursuing the Distance Consistency in OOD Detection with VLMs [80.03370593724422]
Out-of-distribution (OOD) detection seeks to identify samples from unknown classes.<n>Current methods often incorporate intra-modal distance during OOD detection, such as comparing negative texts with ID labels.<n>We propose InterNeg, a framework that systematically utilizes consistent inter-modal distance enhancement from textual and visual perspectives.
arXiv Detail & Related papers (2026-03-03T05:44:47Z) - Vision Also You Need: Navigating Out-of-Distribution Detection with Multimodal Large Language Model [42.29540047339044]
Out-of-Distribution (OOD) detection is a critical task that has garnered significant attention.<n>We propose a novel pipeline, MM-OOD, which leverages the multimodal reasoning capabilities of MLLMs.<n>Our method is designed to improve performance in both near and far OOD tasks.
arXiv Detail & Related papers (2026-01-20T15:06:10Z) - FodFoM: Fake Outlier Data by Foundation Models Creates Stronger Visual Out-of-Distribution Detector [25.224930928724326]
Out-of-Distribution (OOD) detection is crucial when deploying machine learning models in open-world applications.<n>We propose a novel OOD detection framework FodFoM that innovatively combines multiple foundation models to generate two types of challenging fake outlier images.<n>New state-of-the-art OOD detection performance is achieved on multiple benchmarks.
arXiv Detail & Related papers (2024-11-22T17:29:52Z) - Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection [71.93411099797308]
Out-of-distribution (OOD) samples are crucial when deploying machine learning models in open-world scenarios.
We propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLM) to potential Outlier Exposure, termed EOE.
EOE can be generalized to different tasks, including far, near, and fine-language OOD detection.
EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset.
arXiv Detail & Related papers (2024-06-02T17:09:48Z) - Negative Label Guided OOD Detection with Pretrained Vision-Language Models [96.67087734472912]
Out-of-distribution (OOD) detection aims at identifying samples from unknown classes.
We propose a novel post hoc OOD detection method, called NegLabel, which takes a vast number of negative labels from extensive corpus databases.
arXiv Detail & Related papers (2024-03-29T09:19:52Z) - Out-of-Distribution Detection Using Peer-Class Generated by Large Language Model [0.0]
Out-of-distribution (OOD) detection is a critical task to ensure the reliability and security of machine learning models.
In this paper, a novel method called ODPC is proposed, in which specific prompts to generate OOD peer classes of ID semantics are designed by a large language model.
Experiments on five benchmark datasets show that the method we propose can yield state-of-the-art results.
arXiv Detail & Related papers (2024-03-20T06:04:05Z) - Exploring Large Language Models for Multi-Modal Out-of-Distribution
Detection [67.68030805755679]
Large language models (LLMs) encode a wealth of world knowledge and can be prompted to generate descriptive features for each class.
In this paper, we propose to apply world knowledge to enhance OOD detection performance through selective generation from LLMs.
arXiv Detail & Related papers (2023-10-12T04:14:28Z) - Class Relevance Learning For Out-of-distribution Detection [16.029229052068]
This paper presents an innovative class relevance learning method tailored for OOD detection.
Our method establishes a comprehensive class relevance learning framework, strategically harnessing interclass relationships within the OOD pipeline.
arXiv Detail & Related papers (2023-09-21T08:38:21Z) - From Global to Local: Multi-scale Out-of-distribution Detection [129.37607313927458]
Out-of-distribution (OOD) detection aims to detect "unknown" data whose labels have not been seen during the in-distribution (ID) training process.
Recent progress in representation learning gives rise to distance-based OOD detection.
We propose Multi-scale OOD DEtection (MODE), a first framework leveraging both global visual information and local region details.
arXiv Detail & Related papers (2023-08-20T11:56:25Z) - General-Purpose Multi-Modal OOD Detection Framework [5.287829685181842]
Out-of-distribution (OOD) detection identifies test samples that differ from the training data, which is critical to ensuring the safety and reliability of machine learning (ML) systems.
We propose a general-purpose weakly-supervised OOD detection framework, called WOOD, that combines a binary classifier and a contrastive learning component.
We evaluate the proposed WOOD model on multiple real-world datasets, and the experimental results demonstrate that the WOOD model outperforms the state-of-the-art methods for multi-modal OOD detection.
arXiv Detail & Related papers (2023-07-24T18:50:49Z) - Multi-Modal Few-Shot Object Detection with Meta-Learning-Based
Cross-Modal Prompting [77.69172089359606]
We study multi-modal few-shot object detection (FSOD) in this paper, using both few-shot visual examples and class semantic information for detection.
Our approach is motivated by the high-level conceptual similarity of (metric-based) meta-learning and prompt-based learning.
We comprehensively evaluate the proposed multi-modal FSOD models on multiple few-shot object detection benchmarks, achieving promising results.
arXiv Detail & Related papers (2022-04-16T16:45:06Z) - Triggering Failures: Out-Of-Distribution detection by learning from
local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet associated with a dedicated training scheme based on Local Adversarial Attacks (LAA)
We show it obtains top performances both in speed and accuracy when compared to ten recent methods of the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z) - OODformer: Out-Of-Distribution Detection Transformer [15.17006322500865]
In real-world safety-critical applications, it is important to be aware if a new data point is OOD.
This paper proposes a first-of-its-kind OOD detection architecture named OODformer.
arXiv Detail & Related papers (2021-07-19T15:46:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.