VisTa: Visual-contextual and Text-augmented Zero-shot Object-level OOD Detection
- URL: http://arxiv.org/abs/2503.22291v1
- Date: Fri, 28 Mar 2025 10:08:17 GMT
- Title: VisTa: Visual-contextual and Text-augmented Zero-shot Object-level OOD Detection
- Authors: Bin Zhang, Xiaoyang Qu, Guokuan Li, Jiguang Wan, Jianzong Wang
- Abstract summary: We introduce a new method to adapt CLIP for zero-shot object-level OOD detection. Our method preserves critical contextual information and improves the ability to differentiate between ID and OOD objects.
- Score: 22.200900846112805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As object detectors are increasingly deployed as black-box cloud services or pre-trained models with restricted access to the original training data, the challenge of zero-shot object-level out-of-distribution (OOD) detection arises. This task becomes crucial in ensuring the reliability of detectors in open-world settings. While existing methods have demonstrated success in image-level OOD detection using pre-trained vision-language models like CLIP, directly applying such models to object-level OOD detection presents challenges due to the loss of contextual information and reliance on image-level alignment. To tackle these challenges, we introduce a new method that leverages visual prompts and text-augmented in-distribution (ID) space construction to adapt CLIP for zero-shot object-level OOD detection. Our method preserves critical contextual information and improves the ability to differentiate between ID and OOD objects, achieving competitive performance across different benchmarks.
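For readers unfamiliar with zero-shot OOD scoring against an ID label space, the general idea can be sketched as follows. This is a generic max-softmax score over cosine similarities, not the method proposed in the paper: the abstract does not specify VisTa's scoring rule, visual prompts, or text augmentation. The function names, the temperature, and the threshold are assumptions chosen for illustration, and the embeddings are plain arrays standing in for CLIP image-region and text features.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def id_score(region_emb, id_text_embs, temperature=0.01):
    """Return the maximum softmax probability over ID classes.

    region_emb:   (d,) embedding of one detected object region (stand-in for a
                  CLIP image feature; hypothetical input).
    id_text_embs: (num_classes, d) embeddings of the ID class prompts.
    temperature:  softmax temperature; the default is an assumed value.
    """
    region = l2_normalize(np.asarray(region_emb, dtype=np.float64))
    texts = l2_normalize(np.asarray(id_text_embs, dtype=np.float64))
    sims = texts @ region                 # cosine similarity per ID class
    logits = sims / temperature
    logits = logits - logits.max()        # subtract max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(probs.max())


def is_ood(region_emb, id_text_embs, threshold=0.5, temperature=0.01):
    """Flag a region as OOD when no ID class clearly dominates (assumed threshold)."""
    return id_score(region_emb, id_text_embs, temperature) < threshold
```

With three orthogonal class embeddings, a region aligned with one class scores close to 1 and is kept as ID, while a region equally similar to all three scores 1/3 and is flagged as OOD under the assumed threshold of 0.5.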
Related papers
- Dream-Box: Object-wise Outlier Generation for Out-of-Distribution Detection [15.806236012151968]
Out-of-distribution (OOD) detection is a challenging task that has received significant attention in recent years.
Recent work has focused on generating synthetic outliers and using them to train an outlier detector.
We introduce Dream-Box, a method that provides a link to object-wise outlier generation in the pixel space for OOD detection.
arXiv Detail & Related papers (2025-04-25T23:52:27Z)
- RUNA: Object-level Out-of-Distribution Detection via Regional Uncertainty Alignment of Multimodal Representations [33.971901643313856]
RUNA is a novel framework for detecting out-of-distribution (OOD) objects.
It employs a regional uncertainty alignment mechanism to distinguish ID from OOD objects effectively.
Our experiments show that RUNA substantially surpasses state-of-the-art methods in object-level OOD detection.
arXiv Detail & Related papers (2025-03-28T10:01:55Z)
- Can OOD Object Detectors Learn from Foundation Models? [56.03404530594071]
Out-of-distribution (OOD) object detection is a challenging task due to the absence of open-set OOD data.
Inspired by recent advancements in text-to-image generative models, we study the potential of generative models trained on large-scale open-set data to synthesize OOD samples.
We introduce SyncOOD, a simple data curation method that capitalizes on the capabilities of large foundation models.
arXiv Detail & Related papers (2024-09-08T17:28:22Z)
- TagOOD: A Novel Approach to Out-of-Distribution Detection via Vision-Language Representations and Class Center Learning [26.446233594630087]
We propose TagOOD, a novel approach for OOD detection using vision-language representations.
TagOOD trains a lightweight network on the extracted object features to learn representative class centers.
These centers capture the central tendencies of IND object classes, minimizing the influence of irrelevant image features during OOD detection.
arXiv Detail & Related papers (2024-08-28T06:37:59Z)
- Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection [71.93411099797308]
Out-of-distribution (OOD) samples are crucial when deploying machine learning models in open-world scenarios.
We propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLMs) to envision potential outlier exposure, termed EOE.
EOE can be generalized to different tasks, including far, near, and fine-grained OOD detection.
EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset.
arXiv Detail & Related papers (2024-06-02T17:09:48Z)
- Simple Image-level Classification Improves Open-vocabulary Object Detection [27.131298903486474]
Open-Vocabulary Object Detection (OVOD) aims to detect novel objects beyond a given set of base categories on which the detection model is trained.
Recent OVOD methods focus on adapting image-level pre-trained vision-language models (VLMs), such as CLIP, to a region-level object detection task via, e.g., region-level knowledge distillation, regional prompt learning, or region-text pre-training.
These methods have demonstrated remarkable performance in recognizing regional visual concepts, but they are weak in exploiting the VLMs' powerful global scene understanding ability learned from the billion-scale
arXiv Detail & Related papers (2023-12-16T13:06:15Z)
- Exploring Large Language Models for Multi-Modal Out-of-Distribution Detection [67.68030805755679]
Large language models (LLMs) encode a wealth of world knowledge and can be prompted to generate descriptive features for each class.
In this paper, we propose to apply world knowledge to enhance OOD detection performance through selective generation from LLMs.
arXiv Detail & Related papers (2023-10-12T04:14:28Z)
- From Global to Local: Multi-scale Out-of-distribution Detection [129.37607313927458]
Out-of-distribution (OOD) detection aims to detect "unknown" data whose labels have not been seen during the in-distribution (ID) training process.
Recent progress in representation learning gives rise to distance-based OOD detection.
We propose Multi-scale OOD DEtection (MODE), a first framework leveraging both global visual information and local region details.
arXiv Detail & Related papers (2023-08-20T11:56:25Z)
- Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability [70.72426887518517]
Out-of-distribution (OOD) detection is an indispensable aspect of secure AI when deploying machine learning models in real-world applications.
We propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capabilities of the well-trained model with ID data.
Our method utilizes a mask to figure out the memorized atypical samples, and then finetune the model or prune it with the introduced mask to forget them.
arXiv Detail & Related papers (2023-06-06T14:23:34Z)
- Building One-class Detector for Anything: Open-vocabulary Zero-shot OOD Detection Using Text-image Models [23.302018871162186]
We propose a novel one-class open-set OOD detector that leverages text-image pre-trained models in a zero-shot fashion.
Our approach is designed to detect anything not in-domain and offers the flexibility to detect a wide variety of OOD.
Our method shows superior performance over previous methods on all benchmarks.
arXiv Detail & Related papers (2023-05-26T18:58:56Z)
- Triggering Failures: Out-Of-Distribution detection by learning from local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet, associated with a dedicated training scheme based on Local Adversarial Attacks (LAA).
We show it obtains top performances both in speed and accuracy when compared to ten recent methods of the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z)