Dual Refinement Feature Pyramid Networks for Object Detection
- URL: http://arxiv.org/abs/2012.01733v2
- Date: Fri, 4 Dec 2020 02:57:40 GMT
- Title: Dual Refinement Feature Pyramid Networks for Object Detection
- Authors: Jialiang Ma, Bin Chen
- Abstract summary: FPN is a common component of object detectors; it supplements multi-scale information by interpolating and summing adjacent-level features.
In this paper, we analyze its design defects at the pixel level and the feature-map level.
We design a novel parameter-free feature pyramid network, named Dual Refinement Feature Pyramid Networks, to address these problems.
- Score: 2.88935873409577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: FPN is a common component used in object detectors; it supplements
multi-scale information by interpolating and summing adjacent-level features.
However, due to the existence of nonlinear operations and the convolutional
layers with different output dimensions, the relationship between different
levels is much more complex, and pixel-wise summation is not an efficient
approach. In this paper, we first analyze the design defects at the pixel level
and the feature map level. Then, we design a novel parameter-free feature pyramid
network named Dual Refinement Feature Pyramid Networks (DRFPN) to address these
problems. Specifically, DRFPN consists of two modules: Spatial Refinement Block
(SRB) and Channel Refinement Block (CRB). SRB learns the location and content
of sampling points based on contextual information between adjacent levels. CRB
learns an adaptive channel merging method based on attention mechanism. Our
proposed DRFPN can be easily plugged into existing FPN-based models. Without
bells and whistles, for two-stage detectors, our model outperforms different
FPN-based counterparts by 1.6 to 2.2 AP on the COCO detection benchmark, and
1.5 to 1.9 AP on the COCO segmentation benchmark. For one-stage detectors,
DRFPN improves anchor-based RetinaNet by 1.9 AP and anchor-free FCOS by 1.3 AP
when using ResNet50 as backbone. Extensive experiments verify the robustness
and generalization ability of DRFPN. The code will be made publicly available.
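For orientation, the baseline fusion that the abstract criticizes (interpolating the coarser level and summing it pixel-wise with the finer one) can be sketched as follows. This is a minimal illustration, not the DRFPN method or the authors' code: it assumes a single channel, nearest-neighbour 2x upsampling, and plain Python lists, and the names `upsample_nearest_2x` and `fpn_merge` are hypothetical.

```python
from typing import List

# Assumption: one channel stored as an H x W nested list, for brevity.
FeatureMap = List[List[float]]


def upsample_nearest_2x(fm: FeatureMap) -> FeatureMap:
    """Nearest-neighbour 2x upsampling: each value fills a 2x2 block."""
    out: FeatureMap = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def fpn_merge(lateral: FeatureMap, top: FeatureMap) -> FeatureMap:
    """Baseline FPN-style fusion: upsample the coarser map, then add pixel-wise."""
    up = upsample_nearest_2x(top)
    if len(up) != len(lateral) or len(up[0]) != len(lateral[0]):
        raise ValueError("lateral map must be exactly 2x the size of the top map")
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(lateral, up)]


# Hypothetical toy inputs: a 2x2 coarse level and a 4x4 fine level.
coarse = [[1.0, 2.0],
          [3.0, 4.0]]
fine = [[0.5] * 4 for _ in range(4)]
merged = fpn_merge(fine, coarse)
print(merged[0])  # [1.5, 1.5, 2.5, 2.5]
```

Every pixel in a 2x2 block receives the same coarse value regardless of its content, which is the kind of fixed sampling and fixed merging that SRB and CRB are described as replacing with learned alternatives.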
Related papers
- Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables "synergistic focusing" of multi-scale features.
RCs extend the multi-level feature's "divide-and-conquer" mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z)
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency than 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
- Selective Multi-Scale Learning for Object Detection [35.08958597150306]
RetinaNet combined with SMSL obtains 1.8% improvement in AP (from 39.1% to 40.9%) on COCO dataset.
When integrated with SMSL, two-stage detectors can get around 1.0% improvement in AP.
arXiv Detail & Related papers (2022-06-16T14:23:50Z)
- RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection [10.847953426161924]
We propose RCNet, which consists of Reverse Feature Pyramid (RevFP) and Cross-scale Shift Network (CSN).
RevFP utilizes local bidirectional feature fusion to simplify the bidirectional pyramid inference pipeline.
CSN directly propagates representations to both adjacent and non-adjacent levels to make multi-scale features more correlated.
arXiv Detail & Related papers (2021-10-23T04:00:25Z)
- Disentangle Your Dense Object Detector [82.22771433419727]
Deep learning-based dense object detectors have achieved great success in the past few years and have been applied to numerous multimedia applications such as video understanding.
However, the current training pipeline for dense detectors is compromised by many conjunctions that may not hold.
We propose Disentangled Dense Object Detector (DDOD), in which simple and effective disentanglement mechanisms are designed and integrated into the current state-of-the-art detectors.
arXiv Detail & Related papers (2021-07-07T00:52:16Z)
- FCCDN: Feature Constraint Network for VHR Image Change Detection [12.670734830806591]
We propose a feature constraint change detection network (FCCDN) for change detection.
We constrain features both on bi-temporal feature extraction and feature fusion.
We achieve state-of-the-art performance on two building change detection datasets.
arXiv Detail & Related papers (2021-05-23T06:13:47Z)
- A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation [68.10621089649486]
We propose Attention Aggregation based Feature Pyramid Network (A2-FPN) to improve multi-scale feature learning.
A2-FPN achieves an improvement of 2.0% and 1.4% mask AP when integrated into the strong baselines such as Cascade Mask R-CNN and Hybrid Task Cascade.
arXiv Detail & Related papers (2021-05-07T11:51:08Z)
- MRDet: A Multi-Head Network for Accurate Oriented Object Detection in Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z)
- Suppress and Balance: A Simple Gated Network for Salient Object Detection [89.88222217065858]
We propose a simple gated network (GateNet) to solve both issues at once.
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
In addition, we adopt the atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) to accurately localize salient objects of various scales.
arXiv Detail & Related papers (2020-07-16T02:00:53Z)
- A novel Region of Interest Extraction Layer for Instance Segmentation [3.5493798890908104]
This paper is motivated by the need to overcome the limitations of existing RoI extractors.
The proposed layer (called Generic RoI Extractor - GRoIE) introduces non-local building blocks and attention mechanisms to boost the performance.
GRoIE can be integrated seamlessly with every two-stage architecture for both object detection and instance segmentation tasks.
arXiv Detail & Related papers (2020-04-28T17:07:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.