Enhancing Novel Object Detection via Cooperative Foundational Models
- URL: http://arxiv.org/abs/2311.12068v2
- Date: Wed, 22 Nov 2023 04:13:38 GMT
- Title: Enhancing Novel Object Detection via Cooperative Foundational Models
- Authors: Rohit Bharadwaj, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan
- Abstract summary: We present a novel approach to transform existing closed-set detectors into open-set detectors.
We surpass the current state-of-the-art by a margin of 7.2 $\text{AP}_{50}$ for novel classes.
- Score: 75.30243629533277
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we address the challenging and emergent problem of novel object
detection (NOD), focusing on the accurate detection of both known and novel
object categories during inference. Traditional object detection algorithms are
inherently closed-set, limiting their capability to handle NOD. We present a
novel approach to transform existing closed-set detectors into open-set
detectors. This transformation is achieved by leveraging the complementary
strengths of pre-trained foundational models, specifically CLIP and SAM,
through our cooperative mechanism. Furthermore, by integrating this mechanism
with state-of-the-art open-set detectors such as GDINO, we establish new
benchmarks in object detection performance. Our method achieves 17.42 mAP in
novel object detection and 42.08 mAP for known objects on the challenging LVIS
dataset. Adapting our approach to the COCO OVD split, we surpass the current
state-of-the-art by a margin of 7.2 $ \text{AP}_{50} $ for novel classes. Our
code is available at
https://github.com/rohit901/cooperative-foundational-models .
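The $\text{AP}_{50}$ figure quoted in the abstract conventionally refers to average precision computed with an intersection-over-union (IoU) threshold of 0.5 between predicted and ground-truth boxes. The following is a minimal sketch of that matching criterion only, not the authors' evaluation code; the `Box` layout and the function names `iou` and `is_match_at_50` are illustrative assumptions.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, gt: Box) -> bool:
    """Hypothetical helper: a prediction can count as a match when IoU >= 0.5."""
    return iou(pred, gt) >= 0.5


if __name__ == "__main__":
    print(iou((0, 0, 4, 4), (0, 0, 4, 3)))             # overlap 12, union 16
    print(is_match_at_50((0, 0, 2, 2), (1, 0, 3, 2)))  # overlap 2, union 6
```

A full AP computation additionally ranks predictions by confidence and integrates the precision-recall curve per class; the sketch covers only the per-box overlap test.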
Related papers
- On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data [6.7236795813629]
We propose a novel detection algorithm for detecting unknown objects in image data.
It exploits supervised dimensionality reduction techniques to mitigate the effects of the curse of dimensionality on the features extracted by the model.
It utilizes high-resolution feature maps to identify potential unknown objects in an unsupervised fashion.
arXiv Detail & Related papers (2024-11-07T10:15:25Z) - Open-World Object Detection with Instance Representation Learning [1.8749305679160366]
We propose a method to train an object detector that can both detect novel objects and extract semantically rich features in open-world conditions.
Our method learns a robust and generalizable feature space, outperforming other OWOD-based feature extraction methods.
arXiv Detail & Related papers (2024-09-24T13:13:34Z) - DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM [81.75988648572347]
We present DetToolChain, a novel prompting paradigm to unleash the zero-shot object detection ability of multimodal large language models (MLLMs).
Our approach consists of a detection prompting toolkit inspired by high-precision detection priors and a new Chain-of-Thought to implement these prompts.
We show that GPT-4V with our DetToolChain improves state-of-the-art object detectors by +21.5% AP50 on the MS COCO Novel class set for open-vocabulary detection.
arXiv Detail & Related papers (2024-03-19T06:54:33Z) - Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector [72.05791402494727]
This paper studies the challenging task of cross-domain few-shot object detection (CD-FSOD).
It aims to develop an accurate object detector for novel domains with minimal labeled examples.
arXiv Detail & Related papers (2024-02-05T15:25:32Z) - Open World Object Detection in the Era of Foundation Models [53.683963161370585]
We introduce a new benchmark that includes five real-world application-driven datasets.
We introduce a novel method, Foundation Object detection Model for the Open world, or FOMO, which identifies unknown objects based on their shared attributes with the base known objects.
arXiv Detail & Related papers (2023-12-10T03:56:06Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - CAT: LoCalization and IdentificAtion Cascade Detection Transformer for
Open-World Object Detection [17.766859354014663]
Open-world object detection requires a model trained from data on known objects to detect both known and unknown objects.
We propose a novel solution called CAT: LoCalization and IdentificAtion Cascade Detection Transformer.
We show that our model outperforms the state-of-the-art in terms of all metrics in the task of OWOD, incremental object detection (IOD) and open-set detection.
arXiv Detail & Related papers (2023-01-05T09:11:16Z) - Towards Open-Set Object Detection and Discovery [38.81806249664884]
We present a new task, namely Open-Set Object Detection and Discovery (OSODD).
We propose a two-stage method that first uses an open-set object detector to predict both known and unknown objects.
Then, we study the representation of predicted objects in an unsupervised manner and discover new categories from the set of unknown objects.
arXiv Detail & Related papers (2022-04-12T08:07:01Z) - Multi-View Correlation Distillation for Incremental Object Detection [12.536640582318949]
We propose a novel Multi-View Correlation Distillation (MVCD) based incremental object detection method.
arXiv Detail & Related papers (2021-07-05T04:36:33Z) - Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of $18.9\%$ mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.