GOOD: Exploring Geometric Cues for Detecting Objects in an Open World
- URL: http://arxiv.org/abs/2212.11720v2
- Date: Sat, 24 Dec 2022 02:03:21 GMT
- Title: GOOD: Exploring Geometric Cues for Detecting Objects in an Open World
- Authors: Haiwen Huang, Andreas Geiger, Dan Zhang
- Abstract summary: State-of-the-art RGB-based models suffer from overfitting the training classes and often fail at detecting novel-looking objects.
We propose incorporating geometric cues such as depth and normals, predicted by general-purpose monocular estimators.
Our resulting Geometry-guided Open-world Object Detector (GOOD) significantly improves detection recall for novel object categories and already performs well with only a few training classes.
- Score: 33.25263418112558
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We address the task of open-world class-agnostic object detection, i.e.,
detecting every object in an image by learning from a limited number of base
object classes. State-of-the-art RGB-based models suffer from overfitting the
training classes and often fail at detecting novel-looking objects. This is
because RGB-based models primarily rely on appearance similarity to detect
novel objects and are also prone to overfitting short-cut cues such as textures
and discriminative parts. To address these shortcomings of RGB-based object
detectors, we propose incorporating geometric cues such as depth and normals,
predicted by general-purpose monocular estimators. Specifically, we use the
geometric cues to train an object proposal network for pseudo-labeling
unannotated novel objects in the training set. Our resulting Geometry-guided
Open-world Object Detector (GOOD) significantly improves detection recall for
novel object categories and already performs well with only a few training
classes. Using a single "person" class for training on the COCO dataset, GOOD
surpasses SOTA methods by 5.0% AR@100, a relative improvement of 24%.
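The pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that a proposal network trained on geometric cues is used to pseudo-label unannotated novel objects. The selection rule below (keep the top-scoring proposals that do not overlap base-class annotations) is an assumption for illustration. The names `select_pseudo_labels`, `top_k`, and `max_iou`, and their default values, are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_labels(
    proposals: List[Tuple[Box, float]],  # (box, objectness score) pairs
    base_boxes: List[Box],               # annotated base-class boxes
    top_k: int = 3,                      # hypothetical value
    max_iou: float = 0.3,                # hypothetical value
) -> List[Box]:
    """Keep the highest-scoring proposals that do not overlap base annotations."""
    ranked = sorted(proposals, key=lambda p: p[1], reverse=True)
    kept: List[Box] = []
    for box, _score in ranked:
        if len(kept) == top_k:
            break
        # A proposal that overlaps an annotated box is treated as already labeled.
        if all(iou(box, gt) < max_iou for gt in base_boxes):
            kept.append(box)
    return kept


if __name__ == "__main__":
    base = [(10.0, 10.0, 50.0, 90.0)]  # e.g. one annotated "person"
    props = [
        ((12.0, 12.0, 48.0, 88.0), 0.95),   # overlaps the annotated box
        ((60.0, 40.0, 100.0, 80.0), 0.80),  # candidate unannotated object
        ((0.0, 0.0, 5.0, 5.0), 0.10),
    ]
    print(select_pseudo_labels(props, base, top_k=1))
```

In this sketch, the selected boxes would be added to the base-class annotations as extra class-agnostic training targets. How the actual method scores, filters, and uses its proposals is not specified in the abstract.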
Related papers
- Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation [58.37525311718006]
We put forth a novel formulation of the aerial object detection problem, namely open-vocabulary aerial object detection (OVAD).
We propose CastDet, a CLIP-activated student-teacher detection framework that serves as the first OVAD detector specifically designed for the challenging aerial scenario.
Our framework integrates a robust localization teacher along with several box selection strategies to generate high-quality proposals for novel objects.
arXiv Detail & Related papers (2024-11-04T12:59:13Z)
- Improved Region Proposal Network for Enhanced Few-Shot Object Detection [23.871860648919593]
Few-shot object detection (FSOD) methods have emerged as a solution to the limitations of classic object detection approaches.
We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during the FSOD training stage.
Our improved hierarchical sampling strategy for the region proposal network (RPN) also boosts the perception of the object detection model for large objects.
arXiv Detail & Related papers (2023-08-15T02:35:59Z)
- Identification of Novel Classes for Improving Few-Shot Object Detection [12.013345715187285]
Few-shot object detection (FSOD) methods offer a remedy by realizing robust object detection using only a few training samples per class.
We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during training to improve FSOD performance.
Our experimental results indicate that our method is effective and outperforms the existing state-of-the-art (SOTA) FSOD methods.
arXiv Detail & Related papers (2023-03-18T14:12:52Z)
- Open World DETR: Transformer based Open World Object Detection [60.64535309016623]
We propose a two-stage training approach named Open World DETR for open world object detection based on Deformable DETR.
We fine-tune the class-specific components of the model with a multi-view self-labeling strategy and a consistency constraint.
Our proposed method outperforms other state-of-the-art open world object detection methods by a large margin.
arXiv Detail & Related papers (2022-12-06T13:39:30Z)
- Spatial Reasoning for Few-Shot Object Detection [21.3564383157159]
We propose a spatial reasoning framework that detects novel objects with only a few training examples in a context.
We employ a graph convolutional network in which the RoIs and their relatedness are defined as nodes and edges, respectively.
We demonstrate that the proposed method significantly outperforms the state-of-the-art methods and verify its efficacy through extensive ablation studies.
arXiv Detail & Related papers (2022-11-02T12:38:08Z)
- Incremental-DETR: Incremental Few-Shot Object Detection via Self-Supervised Learning [60.64535309016623]
We propose the Incremental-DETR that does incremental few-shot object detection via fine-tuning and self-supervised learning on the DETR object detector.
To alleviate severe over-fitting with few novel class data, we first fine-tune the class-specific components of DETR with self-supervision.
We further introduce an incremental few-shot fine-tuning strategy with knowledge distillation on the class-specific components of DETR to encourage the network to detect novel classes without catastrophic forgetting.
arXiv Detail & Related papers (2022-05-09T05:08:08Z)
- Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z)
- Learning Open-World Object Proposals without Learning to Classify [110.30191531975804]
We propose a classification-free Object Localization Network (OLN) which estimates the objectness of each region purely by how well the location and shape of a region overlaps with any ground-truth object.
This simple strategy learns generalizable objectness and outperforms existing proposals on cross-category generalization.
arXiv Detail & Related papers (2021-08-15T14:36:02Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed, if solely evaluated on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)