OW-DETR: Open-world Detection Transformer
- URL: http://arxiv.org/abs/2112.01513v1
- Date: Thu, 2 Dec 2021 18:58:30 GMT
- Title: OW-DETR: Open-world Detection Transformer
- Authors: Akshita Gupta, Sanath Narayan, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah
- Abstract summary: We introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection.
OW-DETR comprises three dedicated components, namely attention-driven pseudo-labeling, novelty classification, and objectness scoring.
Our model outperforms the recently introduced OWOD approach, ORE, with absolute gains ranging from 1.8% to 3.3% in terms of unknown recall.
- Score: 90.56239673123804
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Open-world object detection (OWOD) is a challenging computer vision problem,
where the task is to detect a known set of object categories while
simultaneously identifying unknown objects. Additionally, the model must
incrementally learn new classes that become known in the next training
episodes. Distinct from standard object detection, the OWOD setting poses
significant challenges for generating quality candidate proposals on
potentially unknown objects, separating the unknown objects from the background
and detecting diverse unknown objects. Here, we introduce a novel end-to-end
transformer-based framework, OW-DETR, for open-world object detection. The
proposed OW-DETR comprises three dedicated components namely, attention-driven
pseudo-labeling, novelty classification and objectness scoring to explicitly
address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes
multi-scale contextual information, possesses less inductive bias, enables
knowledge transfer from known classes to the unknown class and can better
discriminate between unknown objects and background. Comprehensive experiments
are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive
ablations reveal the merits of our proposed contributions. Further, our model
outperforms the recently introduced OWOD approach, ORE, with absolute gains
ranging from 1.8% to 3.3% in terms of unknown recall on the MS-COCO benchmark.
In the case of incremental object detection, OW-DETR outperforms the
state-of-the-art for all settings on the PASCAL VOC benchmark. Our codes and
models will be publicly released.
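The unknown-recall figures quoted above can be read as the fraction of ground-truth unknown objects that are matched by a box predicted as "unknown". The snippet below is a minimal sketch of that idea for a single image, assuming axis-aligned boxes in `(x1, y1, x2, y2)` format, greedy one-to-one matching, and an IoU threshold of 0.5. The function names `iou` and `unknown_recall` are hypothetical, and the paper's actual evaluation protocol may differ in its details.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unknown_recall(
    gt_unknown: Sequence[Box],
    pred_unknown: Sequence[Box],
    iou_thr: float = 0.5,  # assumed threshold, not taken from the source
) -> float:
    """Fraction of ground-truth unknown boxes matched by an 'unknown' prediction."""
    if not gt_unknown:
        return 0.0
    used = set()  # indices of predictions already matched to a ground-truth box
    matched = 0
    for g in gt_unknown:
        best_j, best_iou = -1, iou_thr
        for j, p in enumerate(pred_unknown):
            if j in used:
                continue
            overlap = iou(g, p)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            used.add(best_j)
            matched += 1
    return matched / len(gt_unknown)


# Toy data: two unknown objects, one of which is covered by a prediction.
gt = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
pred = [(1.0, 1.0, 10.0, 10.0), (50.0, 50.0, 60.0, 60.0)]
recall = unknown_recall(gt, pred)
```

An "absolute gain" in this metric is a difference in percentage points, so a gain of 1.8% means the recall value itself rises by 0.018, not that it improves by a factor of 1.018.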
Related papers
- Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation [58.37525311718006]
We put forth a novel formulation of the aerial object detection problem, namely open-vocabulary aerial object detection (OVAD).
We propose CastDet, a CLIP-activated student-teacher detection framework that serves as the first OVAD detector specifically designed for the challenging aerial scenario.
Our framework integrates a robust localization teacher along with several box selection strategies to generate high-quality proposals for novel objects.
arXiv Detail & Related papers (2024-11-04T12:59:13Z)
- OSAD: Open-Set Aircraft Detection in SAR Images [1.1060425537315088]
Open-set detection aims to enable detectors trained on a closed set to detect all known objects and identify unknown objects in open-set environments.
To address these challenges, a novel open-set aircraft detector for SAR images is proposed, named Open-Set Aircraft Detection (OSAD).
It is equipped with three dedicated components: global context modeling (GCM), location quality-driven pseudo labeling generation (LPG), and prototype contrastive learning (PCL).
arXiv Detail & Related papers (2024-11-03T15:06:14Z)
- Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection [101.15777242546649]
Open vocabulary object detection (OVD) aims at seeking an optimal object detector capable of recognizing objects from both base and novel categories.
Recent advances leverage knowledge distillation to transfer insightful knowledge from pre-trained large-scale vision-language models to the task of object detection.
We present a novel OVD framework termed LBP to propose learning background prompts to harness explored implicit background knowledge.
arXiv Detail & Related papers (2024-06-01T17:32:26Z)
- Semi-supervised Open-World Object Detection [74.95267079505145]
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD).
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z)
- Unsupervised Recognition of Unknown Objects for Open-World Object Detection [28.787586991713535]
Open-World Object Detection (OWOD) extends the object detection problem to a realistic and dynamic scenario.
Current OWOD models, such as ORE and OW-DETR, focus on pseudo-labeling regions with high objectness scores as unknowns.
This paper proposes a novel approach that learns an unsupervised discriminative model to recognize true unknown objects.
arXiv Detail & Related papers (2023-08-31T08:17:29Z)
- Addressing the Challenges of Open-World Object Detection [12.053132866404972]
OW-RCNN is an open world object detector that addresses the three main challenges of open world object detection (OWOD).
OW-RCNN establishes a new state of the art using the open-world evaluation protocol on MS-COCO.
arXiv Detail & Related papers (2023-03-27T06:11:28Z)
- Open World DETR: Transformer based Open World Object Detection [60.64535309016623]
We propose a two-stage training approach named Open World DETR for open world object detection based on Deformable DETR.
We fine-tune the class-specific components of the model with a multi-view self-labeling strategy and a consistency constraint.
Our proposed method outperforms other state-of-the-art open world object detection methods by a large margin.
arXiv Detail & Related papers (2022-12-06T13:39:30Z)
- Incremental-DETR: Incremental Few-Shot Object Detection via Self-Supervised Learning [60.64535309016623]
We propose the Incremental-DETR that does incremental few-shot object detection via fine-tuning and self-supervised learning on the DETR object detector.
To alleviate severe over-fitting with few novel class data, we first fine-tune the class-specific components of DETR with self-supervision.
We further introduce an incremental few-shot fine-tuning strategy with knowledge distillation on the class-specific components of DETR to encourage the network to detect novel classes without catastrophic forgetting.
arXiv Detail & Related papers (2022-05-09T05:08:08Z)