Harnessing Large Language and Vision-Language Models for Robust Out-of-Distribution Detection
- URL: http://arxiv.org/abs/2501.05228v1
- Date: Thu, 09 Jan 2025 13:36:37 GMT
- Title: Harnessing Large Language and Vision-Language Models for Robust Out-of-Distribution Detection
- Authors: Pei-Kang Lee, Jun-Cheng Chen, Ja-Ling Wu,
- Abstract summary: Out-of-distribution (OOD) detection has seen significant advancements with zero-shot approaches.
We propose a novel strategy to enhance zero-shot OOD detection performances for both Far-OOD and Near-OOD scenarios.
We introduce novel few-shot prompt tuning and visual prompt tuning to adapt the proposed framework to better align with the target distribution.
- Score: 11.277049921075026
- License:
- Abstract: Out-of-distribution (OOD) detection has seen significant advancements with zero-shot approaches by leveraging the powerful Vision-Language Models (VLMs) such as CLIP. However, prior research works have predominantly focused on enhancing Far-OOD performance, while potentially compromising Near-OOD efficacy, as observed from our pilot study. To address this issue, we propose a novel strategy to enhance zero-shot OOD detection performances for both Far-OOD and Near-OOD scenarios by innovatively harnessing Large Language Models (LLMs) and VLMs. Our approach first exploit an LLM to generate superclasses of the ID labels and their corresponding background descriptions followed by feature extraction using CLIP. We then isolate the core semantic features for ID data by subtracting background features from the superclass features. The refined representation facilitates the selection of more appropriate negative labels for OOD data from a comprehensive candidate label set of WordNet, thereby enhancing the performance of zero-shot OOD detection in both scenarios. Furthermore, we introduce novel few-shot prompt tuning and visual prompt tuning to adapt the proposed framework to better align with the target distribution. Experimental results demonstrate that the proposed approach consistently outperforms current state-of-the-art methods across multiple benchmarks, with an improvement of up to 2.9% in AUROC and a reduction of up to 12.6% in FPR95. Additionally, our method exhibits superior robustness against covariate shift across different domains, further highlighting its effectiveness in real-world scenarios.
Related papers
- SeTAR: Out-of-Distribution Detection with Selective Low-Rank Approximation [5.590633742488972]
Out-of-distribution (OOD) detection is crucial for the safe deployment of neural networks.
We propose SeTAR, a training-free OOD detection method.
SeTAR enhances OOD detection via post-hoc modification of the model's weight matrices using a simple greedy search algorithm.
Our work offers a scalable, efficient solution for OOD detection, setting a new state-of-the-art in this area.
arXiv Detail & Related papers (2024-06-18T13:55:13Z) - Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection [71.93411099797308]
Out-of-distribution (OOD) samples are crucial when deploying machine learning models in open-world scenarios.
We propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLM) to potential Outlier Exposure, termed EOE.
EOE can be generalized to different tasks, including far, near, and fine-language OOD detection.
EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset.
arXiv Detail & Related papers (2024-06-02T17:09:48Z) - WeiPer: OOD Detection using Weight Perturbations of Class Projections [11.130659240045544]
We introduce perturbations of the class projections in the final fully connected layer which creates a richer representation of the input.
We achieve state-of-the-art OOD detection results across multiple benchmarks of the OpenOOD framework.
arXiv Detail & Related papers (2024-05-27T13:38:28Z) - Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision language models.
We show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms of ID data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z) - How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models? [29.75562085178755]
We study how fine-tuning impact OOD detection for few-shot downstream tasks.
Our results suggest that a proper choice of OOD scores is essential for CLIP-based fine-tuning.
We also show that prompt learning demonstrates the state-of-the-art OOD detection performance over the zero-shot counterpart.
arXiv Detail & Related papers (2023-06-09T17:16:50Z) - Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis,
and LLMs Evaluations [111.88727295707454]
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of NLP.
We propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts.
We conduct experiments on pre-trained language models for analysis and evaluation of OOD robustness.
arXiv Detail & Related papers (2023-06-07T17:47:03Z) - Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is
All You Need [52.88953913542445]
We find surprisingly that simply using reconstruction-based methods could boost the performance of OOD detection significantly.
We take Masked Image Modeling as a pretext task for our OOD detection framework (MOOD)
arXiv Detail & Related papers (2023-02-06T08:24:41Z) - Holistic Sentence Embeddings for Better Out-of-Distribution Detection [12.640837452980332]
We propose a simple embedding approach named Avg-Avg, which averages all token representations from each intermediate layer as the sentence embedding.
Our analysis demonstrates that it indeed helps preserve general linguistic knowledge in fine-tuned PLMs and substantially benefits detecting background shifts.
arXiv Detail & Related papers (2022-10-14T03:22:58Z) - ReAct: Out-of-distribution Detection With Rectified Activations [20.792140933660075]
Out-of-distribution (OOD) detection has received much attention lately due to its practical importance.
One of the primary challenges is that models often produce highly confident predictions on OOD data.
We propose ReAct--a simple and effective technique for reducing model overconfidence on OOD data.
arXiv Detail & Related papers (2021-11-24T21:02:07Z) - Enhancing the Generalization for Intent Classification and Out-of-Domain
Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU)
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z) - Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluating on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.