Spatial Reasoning for Few-Shot Object Detection
- URL: http://arxiv.org/abs/2211.01080v1
- Date: Wed, 2 Nov 2022 12:38:08 GMT
- Title: Spatial Reasoning for Few-Shot Object Detection
- Authors: Geonuk Kim, Hong-Gyu Jung, Seong-Whan Lee
- Abstract summary: We propose a spatial reasoning framework that detects novel objects with only a few training examples in a context.
We employ a graph convolutional network in which the RoIs and their relatedness are defined as nodes and edges, respectively.
We demonstrate that the proposed method significantly outperforms the state-of-the-art methods and verify its efficacy through extensive ablation studies.
- Score: 21.3564383157159
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although modern object detectors rely heavily on a significant amount of
training data, humans can easily detect novel objects using a few training
examples. The mechanism of the human visual system is to interpret spatial
relationships among various objects and this process enables us to exploit
contextual information by considering the co-occurrence of objects. Thus, we
propose a spatial reasoning framework that detects novel objects with only a
few training examples in a context. We infer geometric relatedness between
novel and base RoIs (Regions of Interest) to enhance the feature representation
of novel categories using an object detector well trained on base categories.
We employ a graph convolutional network in which the RoIs and their relatedness
are defined as nodes and edges, respectively. Furthermore, we present a spatial
data augmentation, in which all objects and bounding boxes in an image are
resized randomly, to overcome the few-shot environment. Using the PASCAL VOC and MS
COCO datasets, we demonstrate that the proposed method significantly
outperforms the state-of-the-art methods and verify its efficacy through
extensive ablation studies.
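
The two mechanisms named in the abstract (graph propagation over RoI nodes and random resizing of boxes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `relatedness` function (a Gaussian kernel on the normalized distance between box centers), the row normalization, the single-layer `gcn_layer`, and the scale range in `random_resize` are all hypothetical choices. Boxes are assumed to be non-degenerate `(x1, y1, x2, y2)` tuples.

```python
import math
import random


def relatedness(box_a, box_b, sigma=0.5):
    """Hypothetical edge weight between two RoIs (not the paper's definition).

    Gaussian kernel on the center distance, normalized by the mean box diagonal.
    """
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    diag_a = math.hypot(box_a[2] - box_a[0], box_a[3] - box_a[1])
    diag_b = math.hypot(box_b[2] - box_b[0], box_b[3] - box_b[1])
    dist = math.hypot(ax - bx, ay - by) / ((diag_a + diag_b) / 2)
    return math.exp(-dist * dist / (2 * sigma * sigma))


def normalized_adjacency(boxes):
    """Dense adjacency over all RoIs, with each row scaled to sum to 1.

    The diagonal equals 1 before normalization, so every node keeps a self-loop.
    """
    adj = [[relatedness(a, b) for b in boxes] for a in boxes]
    return [[w / sum(row) for w in row] for row in adj]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(adj @ feats @ weight)."""
    dim_in, dim_out = len(feats[0]), len(weight[0])
    # Aggregate neighbor features according to the edge weights.
    agg = [
        [sum(a * f[k] for a, f in zip(row, feats)) for k in range(dim_in)]
        for row in adj
    ]
    # Apply the linear projection followed by ReLU.
    return [
        [max(0.0, sum(r[k] * weight[k][j] for k in range(dim_in)))
         for j in range(dim_out)]
        for r in agg
    ]


def random_resize(boxes, min_scale=0.5, max_scale=1.5, rng=random):
    """Scale every box by one random factor (the image would be resized by the same factor)."""
    scale = rng.uniform(min_scale, max_scale)
    return scale, [tuple(c * scale for c in box) for box in boxes]


boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (100, 100, 120, 120)]
adj = normalized_adjacency(boxes)
feats = [[1.0, 2.0], [0.5, 0.0], [3.0, 1.0]]       # toy RoI features
identity = [[1.0, 0.0], [0.0, 1.0]]                # toy projection weights
refined = gcn_layer(adj, feats, identity)
scale, resized = random_resize(boxes, rng=random.Random(0))
```

In this toy setup, the first two boxes lie close together and therefore exchange most of their features, while the distant third box is almost unaffected by them. A learned projection would replace the identity matrix in practice.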
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Local Feature Matching Using Deep Learning: A Survey [19.322545965903608]
Local feature matching enjoys wide-ranging applications in the realm of computer vision, encompassing domains such as image retrieval, 3D reconstruction, and object recognition.
In recent years, the introduction of deep learning models has sparked widespread exploration into local feature matching techniques.
The paper also explores the practical application of local feature matching in diverse domains such as Structure from Motion, Remote Sensing Image Registration, and Medical Image Registration.
arXiv Detail & Related papers (2024-01-31T04:32:41Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- GOOD: Exploring Geometric Cues for Detecting Objects in an Open World [33.25263418112558]
State-of-the-art RGB-based models suffer from overfitting the training classes and often fail at detecting novel-looking objects.
We propose incorporating geometric cues such as depth and normals, predicted by general-purpose monocular estimators.
Our resulting Geometry-guided Open-world Object Detector (GOOD) significantly improves detection recall for novel object categories and already performs well with only a few training classes.
arXiv Detail & Related papers (2022-12-22T14:13:33Z)
- Learning Open-World Object Proposals without Learning to Classify [110.30191531975804]
We propose a classification-free Object Localization Network (OLN) which estimates the objectness of each region purely by how well the location and shape of a region overlap with any ground-truth object.
This simple strategy learns generalizable objectness and outperforms existing proposals on cross-category generalization.
arXiv Detail & Related papers (2021-08-15T14:36:02Z)
- Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment [33.446875089255876]
Few-shot object detection (FSOD) aims to detect objects using only a few examples.
We propose a meta-learning based few-shot object detection method by transferring meta-knowledge learned from data-abundant base classes to data-scarce novel classes.
arXiv Detail & Related papers (2021-04-15T19:01:27Z)
- Few-shot Weakly-Supervised Object Detection via Directional Statistics [55.97230224399744]
We propose a probabilistic multiple instance learning approach for few-shot Common Object Localization (COL) and few-shot Weakly Supervised Object Detection (WSOD).
Our model simultaneously learns the distribution of the novel objects and localizes them via expectation-maximization steps.
Our experiments show that the proposed method, despite being simple, outperforms strong baselines in few-shot COL and WSOD, as well as large-scale WSOD tasks.
arXiv Detail & Related papers (2021-03-25T22:34:16Z)
- Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection [33.25064323136447]
Few-shot object detection is an imperative and long-lasting problem due to the inherent long-tail distribution of real-world data.
This work introduces explicit relation reasoning into the learning of novel object detection.
Experiments show that SRR-FSD can achieve competitive results at higher shots, and more importantly, a significantly better performance given both lower explicit and implicit shots.
arXiv Detail & Related papers (2021-03-02T18:04:38Z)
- Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation [53.49821324597837]
Weakly supervised semantic segmentation is a challenging problem that has been deeply studied in recent years.
We present a Context Decoupling Augmentation (CDA) method to change the inherent context in which the objects appear.
Extensive experiments on the PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to a new state of the art by a large margin.
arXiv Detail & Related papers (2021-03-02T15:05:09Z)
- Few-Shot Object Detection via Knowledge Transfer [21.3564383157159]
Conventional methods for object detection usually require substantial amounts of training data and annotated bounding boxes.
In this paper, we introduce a few-shot object detection method via knowledge transfer, which aims to detect objects from a few training examples.
arXiv Detail & Related papers (2020-08-28T06:35:27Z)
- Exploring Bottom-up and Top-down Cues with Attentive Learning for Webly Supervised Object Detection [76.9756607002489]
We propose a novel webly supervised object detection (WebSOD) method for novel classes.
Our proposed method combines bottom-up and top-down cues for novel class detection.
We demonstrate our proposed method on the PASCAL VOC dataset with three different novel/base splits.
arXiv Detail & Related papers (2020-03-22T03:11:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.