DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection
- URL: http://arxiv.org/abs/2108.09017v1
- Date: Fri, 20 Aug 2021 06:12:55 GMT
- Title: DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection
- Authors: Limeng Qiao, Yuxuan Zhao, Zhiyuan Li, Xi Qiu, Jianan Wu and Chi Zhang
- Abstract summary: Few-shot object detection aims at detecting novel objects rapidly from extremely few examples of previously unseen classes.
Most existing approaches employ Faster R-CNN as the basic detection framework.
We propose a simple yet effective architecture named Decoupled Faster R-CNN (DeFRCN).
- Score: 17.326702469604676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot object detection, which aims at detecting novel objects rapidly from
extremely few annotated examples of previously unseen classes, has attracted
significant research interest in the community. Most existing approaches employ
Faster R-CNN as the basic detection framework; yet, due to the lack of tailored
considerations for the data-scarce scenario, their performance is often
unsatisfactory. In this paper, we look closely into the conventional Faster R-CNN
and analyze its contradictions from two orthogonal perspectives, namely
multi-stage (RPN vs. RCNN) and multi-task (classification vs. localization). To
resolve these issues, we propose a simple yet effective architecture, named
Decoupled Faster R-CNN (DeFRCN). To be concrete, we extend Faster R-CNN by
introducing Gradient Decoupled Layer for multi-stage decoupling and
Prototypical Calibration Block for multi-task decoupling. The former is a novel
deep layer that redefines the feature-forward and gradient-backward operations
to decouple its subsequent layer from its preceding layer; the latter is an
offline prototype-based classification model that takes proposals from the
detector as input and boosts the original classification scores with additional
pairwise scores for calibration. Extensive experiments
on multiple benchmarks show our framework is remarkably superior to other
existing approaches and establishes a new state-of-the-art in few-shot
literature.
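The two decoupling mechanisms in the abstract can be sketched numerically. This is a minimal NumPy illustration, not the paper's implementation: the function names, the gradient scale `backward_scale`, and the fusion weight `alpha` are illustrative assumptions. The Gradient Decoupled Layer passes features through unchanged on the forward pass but rescales gradients on the backward pass; the Prototypical Calibration Block fuses the detector's classification scores with pairwise cosine similarities against offline class prototypes.

```python
import numpy as np

def gdl_forward(features: np.ndarray) -> np.ndarray:
    """Gradient Decoupled Layer, forward pass: features flow through
    unchanged (an affine transform could also be applied here)."""
    return features

def gdl_backward(grad: np.ndarray, backward_scale: float = 0.1) -> np.ndarray:
    """Backward pass: the gradient is rescaled before it reaches the
    preceding layer, decoupling it from the subsequent layer.
    backward_scale = 0 stops gradients entirely; 1 is ordinary backprop."""
    return backward_scale * grad

def pcb_calibrate(cls_scores: np.ndarray, roi_feat: np.ndarray,
                  prototypes: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Prototypical Calibration Block (sketch): boost the original
    classification scores with cosine similarities between a proposal's
    feature vector and per-class prototype vectors."""
    sims = prototypes @ roi_feat / (
        np.linalg.norm(prototypes, axis=1) * np.linalg.norm(roi_feat) + 1e-8)
    return alpha * cls_scores + (1 - alpha) * sims

# Toy usage: a proposal aligned with class-0's prototype has its
# class-0 score pulled up and its class-1 score pulled down.
roi_feat = np.array([1.0, 0.0])
prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
calibrated = pcb_calibrate(np.array([0.8, 0.2]), roi_feat, prototypes)
```

The key design point the abstract emphasizes is that both pieces are decoupled from end-to-end training: the GDL only alters gradient flow between stages, and the PCB is an offline model consulted at scoring time.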
Related papers
- Informed deep hierarchical classification: a non-standard analysis inspired approach [0.0]
It consists of a multi-output deep neural network equipped with specific projection operators placed before each output layer.
The design of such an architecture, called lexicographic hybrid deep neural network (LH-DNN), has been possible by combining tools from different and quite distant research fields.
To assess the efficacy of the approach, the resulting network is compared against the B-CNN, a convolutional neural network tailored for hierarchical classification tasks.
arXiv Detail & Related papers (2024-09-25T14:12:50Z) - Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Fast Hierarchical Learning for Few-Shot Object Detection [57.024072600597464]
Transfer learning approaches have recently achieved promising results on the few-shot detection task.
These approaches suffer from the "catastrophic forgetting" issue due to fine-tuning of the base detector.
We tackle the aforementioned issues in this work.
arXiv Detail & Related papers (2022-10-10T20:31:19Z) - ReAct: Temporal Action Detection with Relational Queries [84.76646044604055]
This work aims at advancing temporal action detection (TAD) using an encoder-decoder framework with action queries.
We first propose a relational attention mechanism in the decoder, which guides the attention among queries based on their relations.
Lastly, we propose to predict the localization quality of each action query at inference in order to distinguish high-quality queries.
arXiv Detail & Related papers (2022-07-14T17:46:37Z) - Plug-and-Play Few-shot Object Detection with Meta Strategy and Explicit Localization Inference [78.41932738265345]
This paper proposes a plug-and-play detector that can accurately detect objects of novel categories without a fine-tuning process.
We introduce two explicit inferences into the localization process to reduce its dependence on annotated data.
It shows a significant lead in efficiency, precision, and recall under varied evaluation protocols.
arXiv Detail & Related papers (2021-10-26T03:09:57Z) - Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment [33.446875089255876]
Few-shot object detection (FSOD) aims to detect objects using only a few examples.
We propose a meta-learning based few-shot object detection method by transferring meta-knowledge learned from data-abundant base classes to data-scarce novel classes.
arXiv Detail & Related papers (2021-04-15T19:01:27Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.