FA: Forced Prompt Learning of Vision-Language Models for Out-of-Distribution Detection
- URL: http://arxiv.org/abs/2507.04511v2
- Date: Tue, 08 Jul 2025 14:45:48 GMT
- Title: FA: Forced Prompt Learning of Vision-Language Models for Out-of-Distribution Detection
- Authors: Xinhua Lu, Runhe Lai, Yanqi Wu, Kanghao Chen, Wei-Shi Zheng, Ruixuan Wang
- Abstract summary: We propose an innovative CLIP-based framework based on Forced prompt leArning (FA) to make full use of the In-Distribution (ID) knowledge. FA is capable of achieving notable improvements in OOD detection, even when trained without any external auxiliary datasets.
- Score: 25.015218537268115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained vision-language models (VLMs) have advanced out-of-distribution (OOD) detection recently. However, existing CLIP-based methods often focus on learning OOD-related knowledge to improve OOD detection, showing limited generalization or reliance on external large-scale auxiliary datasets. In this study, instead of delving into the intricate OOD-related knowledge, we propose an innovative CLIP-based framework based on Forced prompt leArning (FA), designed to make full use of the In-Distribution (ID) knowledge and ultimately boost the effectiveness of OOD detection. Our key insight is to learn a prompt (i.e., forced prompt) that contains more diversified and richer descriptions of the ID classes beyond the textual semantics of class labels. Specifically, it promotes better discernment for ID images, by forcing more notable semantic similarity between ID images and the learnable forced prompt. Moreover, we introduce a forced coefficient, encouraging the forced prompt to learn more comprehensive and nuanced descriptions of the ID classes. In this way, FA is capable of achieving notable improvements in OOD detection, even when trained without any external auxiliary datasets, while maintaining an identical number of trainable parameters as CoOp. Extensive empirical evaluations confirm our method consistently outperforms current state-of-the-art methods. Code is available at https://github.com/0xFAFA/FA.
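As a rough illustration of the setting the abstract describes, the sketch below scores an image by the maximum softmax probability over its cosine similarities to per-class text features, a generic scoring scheme often used with CLIP-style models. It is not FA's method: the abstract does not specify the forced-prompt objective or the forced coefficient, so neither is implemented here. The function names, the temperature value, and the toy 2-D feature vectors (standing in for encoder outputs) are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def max_softmax_score(image_feat, class_text_feats, temperature=1.0):
    """Maximum softmax probability over image-to-class-text similarities.

    A high score means the image matches one ID class much better than the
    others; a low score suggests the image may be out-of-distribution.
    """
    sims = [cosine(image_feat, t) / temperature for t in class_text_feats]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    return max(exps) / sum(exps)


def is_ood(image_feat, class_text_feats, threshold, temperature=1.0):
    """Flag an image as OOD when its score falls below a chosen threshold."""
    return max_softmax_score(image_feat, class_text_feats, temperature) < threshold


# Hypothetical toy features: two ID classes and two query images.
class_text_feats = [[1.0, 0.0], [0.0, 1.0]]
aligned_image = [1.0, 0.0]    # matches class 0 closely
ambiguous_image = [1.0, 1.0]  # equally similar to both classes

score_aligned = max_softmax_score(aligned_image, class_text_feats, temperature=0.1)
score_ambiguous = max_softmax_score(ambiguous_image, class_text_feats, temperature=0.1)
```

In this toy setup the aligned image receives a score near 1 and the ambiguous image a score of 0.5, so any threshold between the two would flag only the ambiguous image. In practice the threshold would be chosen on held-out ID data, for example to reach a target true-positive rate.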
Related papers
- Knowledge Regularized Negative Feature Tuning of Vision-Language Models for Out-of-Distribution Detection [54.433899174017185]
Out-of-distribution (OOD) detection is crucial for building reliable machine learning models. We propose a novel method called Knowledge Regularized Negative Feature Tuning (KR-NFT). NFT applies distribution-aware transformations to pre-trained text features, effectively separating positive and negative features into distinct spaces. When trained with few-shot samples from ImageNet dataset, KR-NFT not only improves ID classification accuracy and OOD detection but also significantly reduces the FPR95 by 5.44%.
arXiv Detail & Related papers (2025-07-26T07:44:04Z) - TagFog: Textual Anchor Guidance and Fake Outlier Generation for Visual Out-of-Distribution Detection [34.31570050254269]
Out-of-distribution (OOD) detection is crucial in many real-world applications. We propose a new learning framework which leverages simple Jigsaw-based fake OOD data and rich semantic embeddings ("anchors") from the ChatGPT description of ID knowledge to help guide the training of the image encoder.
arXiv Detail & Related papers (2024-11-22T14:40:25Z) - What If the Input is Expanded in OOD Detection? [77.37433624869857]
Out-of-distribution (OOD) detection aims to identify OOD inputs from unknown classes.
Various scoring functions are proposed to distinguish them from in-distribution (ID) data.
We introduce a novel perspective, i.e., employing different common corruptions on the input space.
arXiv Detail & Related papers (2024-10-24T06:47:28Z) - Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection [71.93411099797308]
Detecting out-of-distribution (OOD) samples is crucial when deploying machine learning models in open-world scenarios.
We propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLM) to envision potential Outlier Exposure, termed EOE.
EOE can be generalized to different tasks, including far, near, and fine-grained OOD detection.
EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset.
arXiv Detail & Related papers (2024-06-02T17:09:48Z) - Out-of-Distribution Detection Using Peer-Class Generated by Large Language Model [0.0]
Out-of-distribution (OOD) detection is a critical task to ensure the reliability and security of machine learning models.
In this paper, a novel method called ODPC is proposed, in which specific prompts to generate OOD peer classes of ID semantics are designed by a large language model.
Experiments on five benchmark datasets show that the method we propose can yield state-of-the-art results.
arXiv Detail & Related papers (2024-03-20T06:04:05Z) - Exploring Large Language Models for Multi-Modal Out-of-Distribution Detection [67.68030805755679]
Large language models (LLMs) encode a wealth of world knowledge and can be prompted to generate descriptive features for each class.
In this paper, we propose to apply world knowledge to enhance OOD detection performance through selective generation from LLMs.
arXiv Detail & Related papers (2023-10-12T04:14:28Z) - Class Relevance Learning For Out-of-distribution Detection [16.029229052068]
This paper presents an innovative class relevance learning method tailored for OOD detection.
Our method establishes a comprehensive class relevance learning framework, strategically harnessing interclass relationships within the OOD pipeline.
arXiv Detail & Related papers (2023-09-21T08:38:21Z) - From Global to Local: Multi-scale Out-of-distribution Detection [129.37607313927458]
Out-of-distribution (OOD) detection aims to detect "unknown" data whose labels have not been seen during the in-distribution (ID) training process.
Recent progress in representation learning gives rise to distance-based OOD detection.
We propose Multi-scale OOD DEtection (MODE), a first framework leveraging both global visual information and local region details.
arXiv Detail & Related papers (2023-08-20T11:56:25Z) - LoCoOp: Few-Shot Out-of-Distribution Detection via Prompt Learning [37.36999826208225]
We present a novel vision-language prompt learning approach for few-shot out-of-distribution (OOD) detection.
LoCoOp performs OOD regularization that utilizes the portions of CLIP local features as OOD features during training.
LoCoOp outperforms existing zero-shot and fully supervised detection methods.
arXiv Detail & Related papers (2023-06-02T06:33:08Z) - UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z)