Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
- URL: http://arxiv.org/abs/2103.01903v1
- Date: Tue, 2 Mar 2021 18:04:38 GMT
- Title: Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
- Authors: Chenchen Zhu, Fangyi Chen, Uzair Ahmed, Marios Savvides
- Abstract summary: Few-shot object detection is an imperative and long-lasting problem due to the inherent long-tail distribution of real-world data.
This work introduces explicit relation reasoning into the learning of novel object detection.
Experiments show that SRR-FSD achieves competitive results at higher shots and, more importantly, significantly better performance at both lower explicit and implicit shots.
- Score: 33.25064323136447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot object detection is an imperative and long-lasting problem due to
the inherent long-tail distribution of real-world data. Its performance is
largely affected by the data scarcity of novel classes. But the semantic
relation between the novel classes and the base classes is constant regardless
of the data availability. In this work, we investigate utilizing this semantic
relation together with the visual information and introduce explicit relation
reasoning into the learning of novel object detection. Specifically, we
represent each class concept by a semantic embedding learned from a large
corpus of text. The detector is trained to project the image representations of
objects into this embedding space. We also identify the problems of trivially
using the raw embeddings with a heuristic knowledge graph and propose to
augment the embeddings with a dynamic relation graph. As a result, our few-shot
detector, termed SRR-FSD, is robust and stable to the variation of shots of
novel objects. Experiments show that SRR-FSD achieves competitive results at
higher shots and, more importantly, significantly better performance at both
lower explicit and implicit shots. The proposed benchmark protocol with
implicit shots removed from the pretrained classification dataset can serve as
a more realistic setting for future research.
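The abstract names two concrete mechanisms: projecting visual object features into a text-derived semantic embedding space, and refining the raw class embeddings with a dynamic relation graph before classification. The sketch below illustrates that pipeline under stated assumptions; the class and module names, the graph-refinement form, and the cosine classifier are ours, since the abstract does not spell out the actual SRR-FSD architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticSpaceClassifier(nn.Module):
    """Classify RoI features by similarity to text-derived class embeddings
    refined through a learned relation graph. A minimal sketch of the idea
    in the abstract, not the authors' implementation."""

    def __init__(self, feat_dim, class_embeds):
        super().__init__()
        # class_embeds: (C, d) word embeddings of the C class names,
        # e.g. taken from word2vec/GloVe vectors (an assumption)
        self.register_buffer("embeds", F.normalize(class_embeds, dim=1))
        self.project = nn.Linear(feat_dim, class_embeds.size(1))
        # learned relation graph over classes, initialized to identity
        self.relation = nn.Parameter(torch.eye(class_embeds.size(0)))

    def forward(self, roi_feats):
        # refine the raw embeddings by propagating them over the graph
        refined = F.normalize(F.softmax(self.relation, dim=1) @ self.embeds, dim=1)
        # project visual RoI features into the semantic space
        v = F.normalize(self.project(roi_feats), dim=1)
        return v @ refined.t()  # cosine-similarity logits, one per class
```

Because class structure lives in the text embeddings and the learned graph rather than in per-class weight vectors, a novel class with few shots can still inherit relations from the base classes, which matches the abstract's motivation.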
Related papers
- RelVAE: Generative Pretraining for few-shot Visual Relationship Detection [2.2230760534775915]
We present the first pretraining method for few-shot predicate classification that does not require any annotated relations.
We construct few-shot training splits and show quantitative experiments on VG200 and VRD datasets.
arXiv Detail & Related papers (2023-11-27T19:08:08Z)
- Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z)
- Few-shot Object Detection with Refined Contrastive Learning [4.520231308678286]
We propose a novel few-shot object detection (FSOD) method with Refined Contrastive Learning (FSRC).
A pre-determination component is introduced to identify the Resemblance Group, the subset of novel classes that are easily confused with one another.
Refined contrastive learning (RCL) is then performed specifically on this group of classes in order to increase the inter-class distances among them.
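As a rough illustration of the refined contrastive step, the sketch below computes a supervised contrastive loss restricted to RoI embeddings whose labels fall inside the resemblance group; the function name, loss form, and temperature are assumptions rather than FSRC's released code.

```python
import torch
import torch.nn.functional as F

def group_contrastive_loss(feats, labels, group_ids, tau=0.1):
    # keep only RoIs whose class belongs to the confusable resemblance group
    keep = torch.isin(labels, group_ids)
    z = F.normalize(feats[keep], dim=1)
    y = labels[keep]
    sim = z @ z.t() / tau                            # pairwise similarities
    not_self = ~torch.eye(len(y), dtype=torch.bool, device=z.device)
    pos = (y[:, None] == y[None, :]) & not_self      # same-class positive pairs
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(~not_self, -1e9), dim=1, keepdim=True)
    # mean negative log-likelihood of the positives for each anchor
    return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()
```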
arXiv Detail & Related papers (2022-11-24T09:34:20Z)
- Spatial Reasoning for Few-Shot Object Detection [21.3564383157159]
We propose a spatial reasoning framework that detects novel objects with only a few training examples in a context.
We employ a graph convolutional network in which the RoIs and their relatedness are defined as nodes and edges, respectively.
We demonstrate that the proposed method significantly outperforms the state-of-the-art methods and verify its efficacy through extensive ablation studies.
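A minimal sketch of that graph-reasoning step, assuming dot-product relatedness between RoI features as the edge weights (the paper defines its own edge construction):

```python
import torch.nn as nn
import torch.nn.functional as F

class RoIGraphReasoning(nn.Module):
    """One graph-convolution step where RoIs are nodes and pairwise
    relatedness scores are edges, as described in the summary above."""

    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, rois):                     # rois: (N, dim) RoI features
        adj = F.softmax(rois @ rois.t(), dim=1)  # row-normalized relatedness
        message = adj @ self.proj(rois)          # aggregate neighbour features
        return F.relu(rois + message)            # residual update per RoI
```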
arXiv Detail & Related papers (2022-11-02T12:38:08Z)
- Suspected Object Matters: Rethinking Model's Prediction for One-stage Visual Grounding [93.82542533426766]
We propose a Suspected Object Transformation mechanism (SOT) to encourage the target object selection among the suspected ones.
SOT can be seamlessly integrated into existing CNN and Transformer-based one-stage visual grounders.
Extensive experiments demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2022-03-10T06:41:07Z)
- Dynamic Relevance Learning for Few-Shot Object Detection [6.550840743803705]
We propose a dynamic relevance learning model, which utilizes the relationship between all support images and the Regions of Interest (RoIs) on the query images to construct a dynamic graph convolutional network (GCN).
The proposed model achieves the best overall performance, demonstrating its effectiveness in learning more generalized features.
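One plausible reading of the dynamic graph, sketched below, is that adjacency weights come from the cosine similarity between support prototypes and query RoIs, so the graph is rebuilt for every query; this construction is an assumption for illustration, not the paper's exact GCN.

```python
import torch.nn.functional as F

def dynamic_relevance_step(support, rois, tau=0.1):
    # support: (C, dim) class prototypes pooled from the support images
    # rois:    (N, dim) RoI features from the query image
    s = F.normalize(support, dim=1)
    q = F.normalize(rois, dim=1)
    adj = F.softmax((q @ s.t()) / tau, dim=1)  # (N, C) dynamic edge weights
    refined = rois + adj @ support             # propagate support info to RoIs
    return refined, adj
```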
arXiv Detail & Related papers (2021-08-04T18:29:42Z)
- Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection [18.04185751827619]
Few-shot object detection is challenging since the fine-grained features of novel objects are easily overlooked with only a few samples available.
We propose Dense Relation Distillation with Context-aware Aggregation (DCNet) to tackle the few-shot detection problem.
arXiv Detail & Related papers (2021-03-30T05:34:49Z)
- Few-shot Weakly-Supervised Object Detection via Directional Statistics [55.97230224399744]
We propose a probabilistic multiple instance learning approach for few-shot Common Object Localization (COL) and few-shot Weakly Supervised Object Detection (WSOD).
Our model simultaneously learns the distribution of the novel objects and localizes them via expectation-maximization steps.
Our experiments show that the proposed method, despite being simple, outperforms strong baselines in few-shot COL and WSOD, as well as large-scale WSOD tasks.
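As a hedged sketch of the directional-statistics idea, the code below runs EM for a single von Mises-Fisher mean direction over L2-normalized proposal features, treating each image as a bag that contains exactly one latent positive proposal; the fixed concentration kappa and the bag-level softmax E-step are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def vmf_em_localize(bags, steps=10, kappa=20.0):
    # bags: list of (N_i, dim) proposal-feature tensors, one per image
    feats = [F.normalize(b, dim=1) for b in bags]
    # initialize the vMF mean direction from the global average feature
    mu = F.normalize(torch.cat(feats).mean(0, keepdim=True), dim=1)
    for _ in range(steps):
        weighted = []
        for x in feats:
            # E-step: responsibility of each proposal for being the one
            # positive instance in its bag, since p(x|mu) ~ exp(kappa*mu.x)
            r = F.softmax(kappa * (x @ mu.t()).squeeze(1), dim=0)
            weighted.append(r[:, None] * x)
        # M-step: re-estimate the mean direction from weighted proposals
        mu = F.normalize(torch.cat(weighted).sum(0, keepdim=True), dim=1)
    # localization: index of the highest-scoring proposal in each image
    return [int((x @ mu.t()).argmax()) for x in feats]
```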
arXiv Detail & Related papers (2021-03-25T22:34:16Z)
- Any-Shot Object Detection [81.88153407655334]
'Any-shot detection' is a setting in which totally unseen and few-shot categories can co-occur during inference.
We propose a unified any-shot detection model that can concurrently learn to detect both zero-shot and few-shot object classes.
Our framework can also be used solely for Zero-shot detection and Few-shot detection tasks.
arXiv Detail & Related papers (2020-03-16T03:43:15Z)
- Learning to Compare Relation: Semantic Alignment for Few-Shot Learning [48.463122399494175]
We present a novel semantic alignment model to compare relations, which is robust to content misalignment.
We conduct extensive experiments on several few-shot learning datasets.
arXiv Detail & Related papers (2020-02-29T08:37:02Z)
- Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)