PETDet: Proposal Enhancement for Two-Stage Fine-Grained Object Detection
- URL: http://arxiv.org/abs/2312.10515v1
- Date: Sat, 16 Dec 2023 18:04:56 GMT
- Title: PETDet: Proposal Enhancement for Two-Stage Fine-Grained Object Detection
- Authors: Wentao Li, Danpei Zhao, Bo Yuan, Yue Gao, Zhenwei Shi
- Abstract summary: We present PETDet (Proposal Enhancement for Two-stage fine-grained object detection) to better handle the sub-tasks in two-stage FGOD methods.
An anchor-free Quality Oriented Proposal Network (QOPN) is proposed with dynamic label assignment and attention-based decomposition.
A novel Adaptive Recognition Loss (ARL) offers guidance for the R-CNN head to focus on high-quality proposals.
- Score: 26.843891792018447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-grained object detection (FGOD) extends object detection with the
capability of fine-grained recognition. In recent two-stage FGOD methods, the
region proposal serves as a crucial link between detection and fine-grained
recognition. However, current methods overlook that some proposal-related
procedures inherited from general detection are not equally suitable for FGOD,
which limits multi-task learning across proposal generation, representation, and
utilization. In this paper, we present PETDet (Proposal Enhancement for
Two-stage fine-grained object detection) to better handle the sub-tasks in
two-stage FGOD methods. Firstly, an anchor-free Quality Oriented Proposal
Network (QOPN) is proposed with dynamic label assignment and attention-based
decomposition to generate high-quality oriented proposals. Additionally, we
present a Bilinear Channel Fusion Network (BCFN) to extract independent and
discriminative features of the proposals. Furthermore, we design a novel
Adaptive Recognition Loss (ARL) which offers guidance for the R-CNN head to
focus on high-quality proposals. Extensive experiments validate the
effectiveness of PETDet. Quantitative analysis reveals that PETDet with
ResNet50 reaches state-of-the-art performance on various FGOD datasets,
including FAIR1M-v1.0 (42.96 AP), FAIR1M-v2.0 (48.81 AP), MAR20 (85.91 AP) and
ShipRSImageNet (74.90 AP). The proposed method also achieves a superior
trade-off between accuracy and inference speed. Our code and models will be
released at https://github.com/canoe-Z/PETDet.
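
The abstract describes the Adaptive Recognition Loss only at a high level, as a mechanism that steers the R-CNN head toward high-quality proposals. As a rough illustration of that idea (not the paper's actual formulation), the PyTorch-style sketch below re-weights a standard classification loss by each proposal's IoU with its matched ground-truth box; the function name, the IoU-based weight, the gamma exponent, and the background handling are all assumptions made for this example.

```python
# Minimal sketch (not the paper's ARL): up-weight high-quality proposals in the
# R-CNN classification loss, where "quality" is taken to be the proposal's IoU
# with its matched ground-truth box. All names and constants are illustrative.
import torch
import torch.nn.functional as F

def quality_weighted_cls_loss(cls_logits: torch.Tensor,
                              labels: torch.Tensor,
                              proposal_ious: torch.Tensor,
                              gamma: float = 2.0) -> torch.Tensor:
    """Hypothetical quality-weighted classification loss.

    cls_logits: (N, C) raw class scores from the R-CNN head.
    labels: (N,) integer class targets; class 0 is assumed to be background.
    proposal_ious: (N,) IoU of each proposal with its matched ground-truth box
        (0 for background proposals).
    """
    # Per-proposal cross-entropy, left un-reduced so it can be re-weighted.
    per_proposal = F.cross_entropy(cls_logits, labels, reduction="none")
    # Emphasize high-IoU proposals; gamma controls how strongly low-quality
    # foreground proposals are down-weighted.
    quality = proposal_ious.clamp(0.0, 1.0) ** gamma
    # Background proposals keep a small constant weight so the classifier
    # still learns to reject them.
    weights = torch.where(labels > 0, quality, torch.full_like(quality, 0.1))
    return (weights * per_proposal).sum() / weights.sum().clamp_min(1e-6)

# Example usage with random tensors (4 proposals, 6 classes incl. background):
logits = torch.randn(4, 6)
labels = torch.tensor([0, 2, 5, 1])
ious = torch.tensor([0.0, 0.85, 0.40, 0.92])
loss = quality_weighted_cls_loss(logits, labels, ious)
```

The hand-set exponent and background floor weight are arbitrary constants for this sketch only; the adaptive weighting actually used by ARL is defined in the paper.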
Related papers
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods.
In these experiments, we find that existing loss functions are usually specialized for some metrics but report inferior results on the others.
We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
arXiv Detail & Related papers (2022-02-07T03:43:16Z)
- Decoupling Object Detection from Human-Object Interaction Recognition [37.133695677465376]
DEFR is a DEtection-FRee method to recognize Human-Object Interactions (HOI) at image level without using object location or human pose.
We present two findings that boost the performance of the detection-free approach, which significantly outperforms detection-assisted state-of-the-art methods.
arXiv Detail & Related papers (2021-12-13T03:01:49Z)
- Plug-and-Play Few-shot Object Detection with Meta Strategy and Explicit Localization Inference [78.41932738265345]
This paper proposes a plug-and-play detector that can accurately detect objects of novel categories without a fine-tuning process.
We introduce two explicit inferences into the localization process to reduce its dependence on annotated data.
It shows a significant lead in efficiency, precision, and recall under varied evaluation protocols.
arXiv Detail & Related papers (2021-10-26T03:09:57Z)
- Oriented R-CNN for Object Detection [61.78746189807462]
This work proposes an effective and simple oriented object detection framework, termed Oriented R-CNN.
In the first stage, we propose an oriented Region Proposal Network (oriented RPN) that directly generates high-quality oriented proposals in a nearly cost-free manner.
The second stage is an oriented R-CNN head that refines oriented Regions of Interest (oriented RoIs) and recognizes them.
arXiv Detail & Related papers (2021-08-12T12:47:43Z)
- MRDet: A Multi-Head Network for Accurate Oriented Object Detection in Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z)
- AFD-Net: Adaptive Fully-Dual Network for Few-Shot Object Detection [8.39479809973967]
Few-shot object detection (FSOD) aims at learning a detector that can fast adapt to previously unseen objects with scarce examples.
Existing methods solve this problem by performing subtasks of classification and localization utilizing a shared component.
We argue that a general few-shot detector should consider the explicit decomposition of the two subtasks, as well as leverage information from both of them to enhance feature representations.
arXiv Detail & Related papers (2020-11-30T10:21:32Z)
- Corner Proposal Network for Anchor-free, Two-stage Object Detection [174.59360147041673]
The goal of object detection is to determine the class and location of objects in an image.
This paper proposes a novel anchor-free, two-stage framework which first extracts a number of object proposals and then classifies each of them in a second stage.
We demonstrate that these two stages are effective solutions for improving recall and precision, respectively.
arXiv Detail & Related papers (2020-07-27T19:04:57Z)