Parallel Residual Bi-Fusion Feature Pyramid Network for Accurate
Single-Shot Object Detection
- URL: http://arxiv.org/abs/2012.01724v5
- Date: Thu, 18 May 2023 15:33:06 GMT
- Title: Parallel Residual Bi-Fusion Feature Pyramid Network for Accurate
Single-Shot Object Detection
- Authors: Ping-Yang Chen, Ming-Ching Chang, Jun-Wei Hsieh, Yong-Sheng Chen
- Abstract summary: This paper proposes the Parallel Residual Bi-Fusion Feature Pyramid Network (PRB-FPN) for fast and accurate single-shot object detection.
The proposed network achieves state-of-the-art performance on the UAVDT17 and MS COCO datasets.
- Score: 22.817918566911203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes the Parallel Residual Bi-Fusion Feature Pyramid Network
(PRB-FPN) for fast and accurate single-shot object detection. Feature Pyramid
(FP) is widely used in recent visual detection; however, the top-down pathway of
FP cannot preserve accurate localization due to pooling shifting. The advantage
of FP is weakened as deeper backbones with more layers are used. In addition,
it cannot maintain accurate detection of both small and large objects at the
same time. To address these issues, we propose a new parallel FP structure with
bi-directional (top-down and bottom-up) fusion and associated improvements to
retain high-quality features for accurate localization. We provide the
following design improvements: (1) A parallel bifusion FP structure with a
bottom-up fusion module (BFM) to detect both small and large objects at once
with high accuracy. (2) A concatenation and re-organization (CORE) module
provides a bottom-up pathway for feature fusion, which leads to the
bi-directional fusion FP that can recover lost information from lower-layer
feature maps. (3) The CORE feature is further purified to retain richer
contextual information. Such CORE purification in both top-down and bottom-up
pathways can be finished in only a few iterations. (4) Adding a residual
design to CORE leads to a new Re-CORE module that enables easy training and
integration with a wide range of deeper or lighter backbones. The proposed
network achieves state-of-the-art performance on the UAVDT17 and MS COCO
datasets. Code is available at https://github.com/pingyang1117/PRBNet_PyTorch.
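The bi-directional (top-down and bottom-up) fusion with a residual connection described in the abstract can be illustrated with a small, generic sketch. This is not the authors' BFM, CORE, or Re-CORE module: it assumes single-channel feature maps stored as nested lists, uses element-wise addition in place of concatenation plus learned convolutions, and picks nearest-neighbour upsampling and 2x2 average pooling for resampling. All function names (`upsample2x`, `downsample2x`, `add`, `bi_fusion`) are hypothetical and chosen only for illustration.

```python
from typing import List

FeatureMap = List[List[float]]  # single-channel map: rows of floats (assumption)


def upsample2x(fm: FeatureMap) -> FeatureMap:
    """Nearest-neighbour 2x upsampling: each value becomes a 2x2 block."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fm: FeatureMap) -> FeatureMap:
    """2x2 average pooling with stride 2 (assumes even height and width)."""
    return [
        [
            (fm[r][c] + fm[r][c + 1] + fm[r + 1][c] + fm[r + 1][c + 1]) / 4.0
            for c in range(0, len(fm[0]), 2)
        ]
        for r in range(0, len(fm), 2)
    ]


def add(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Element-wise sum of two maps of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def bi_fusion(pyramid: List[FeatureMap]) -> List[FeatureMap]:
    """Fuse a pyramid in both directions, then add a residual.

    pyramid[0] is the finest level; each following level is half the size.
    """
    n = len(pyramid)

    # Top-down pass: coarse features flow toward finer levels.
    top_down = [None] * n
    top_down[-1] = pyramid[-1]
    for i in range(n - 2, -1, -1):
        top_down[i] = add(pyramid[i], upsample2x(top_down[i + 1]))

    # Bottom-up pass: fine features flow back toward coarser levels.
    bottom_up = [None] * n
    bottom_up[0] = top_down[0]
    for i in range(1, n):
        bottom_up[i] = add(top_down[i], downsample2x(bottom_up[i - 1]))

    # Residual connection: add each level's original input to its fused output.
    return [add(f, p) for f, p in zip(bottom_up, pyramid)]


if __name__ == "__main__":
    # Three constant-valued levels of size 4x4, 2x2, and 1x1.
    levels = [[[1.0] * 4 for _ in range(4)], [[2.0] * 2 for _ in range(2)], [[3.0]]]
    for level in bi_fusion(levels):
        print(len(level), "x", len(level[0]), "first value:", level[0][0])
```

Each output level keeps the spatial size of its input, so a detection head attached to any level would see features that have passed through both fusion directions. A real implementation would operate on multi-channel tensors and learn the fusion weights.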
Related papers
- LR-FPN: Enhancing Remote Sensing Object Detection with Location Refined Feature Pyramid Network [2.028685490378346]
We propose a novel location refined feature pyramid network (LR-FPN) to enhance the extraction of shallow positional information.
Experiments on two large-scale remote sensing datasets demonstrate that the proposed LR-FPN is superior to state-of-the-art object detection approaches.
arXiv Detail & Related papers (2024-04-02T03:36:07Z)
- Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection [77.50110439560152]
Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF).
We propose a novel and efficient context modeling mechanism that can help existing FPs deliver better MFF results.
In particular, we introduce a novel insight that comprehensive contexts can be decomposed and condensed into two types of representations for higher efficiency.
arXiv Detail & Related papers (2022-07-14T01:45:03Z)
- DFTR: Depth-supervised Hierarchical Feature Fusion Transformer for Salient Object Detection [44.94166578314837]
We propose a pure Transformer-based SOD framework, namely Depth-supervised hierarchical feature Fusion TRansformer (DFTR).
We extensively evaluate the proposed DFTR on ten benchmarking datasets. Experimental results show that our DFTR consistently outperforms the existing state-of-the-art methods for both RGB and RGB-D SOD tasks.
arXiv Detail & Related papers (2022-03-12T12:59:12Z)
- RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection [10.847953426161924]
We propose RCNet, which consists of Reverse Feature Pyramid (RevFP) and Cross-scale Shift Network (CSN).
RevFP utilizes local bidirectional feature fusion to simplify the bidirectional pyramid inference pipeline.
CSN directly propagates representations to both adjacent and non-adjacent levels to make multi-scale features more correlated.
arXiv Detail & Related papers (2021-10-23T04:00:25Z)
- LC3Net: Ladder context correlation complementary network for salient object detection [0.32116198597240836]
We propose a novel ladder context correlation complementary network (LC3Net).
FCB is a filterable convolution block to assist the automatic collection of information on the diversity of initial features.
DCM is a dense cross module to facilitate the intimate aggregation of different levels of features.
BCD is a bidirectional compression decoder to help the progressive shrinkage of multi-scale features.
arXiv Detail & Related papers (2021-10-21T03:12:32Z)
- A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation [68.10621089649486]
We propose Attention Aggregation based Feature Pyramid Network (A2-FPN) to improve multi-scale feature learning.
A2-FPN achieves improvements of 2.0% and 1.4% mask AP when integrated into strong baselines such as Cascade Mask R-CNN and Hybrid Task Cascade.
arXiv Detail & Related papers (2021-05-07T11:51:08Z)
- $P^2$ Net: Augmented Parallel-Pyramid Net for Attention Guided Pose Estimation [69.25492391672064]
We propose an augmented Parallel-Pyramid Net with feature refinement by dilated bottleneck and attention module.
A parallel-pyramid structure is followed to compensate for the information loss introduced by the network.
Our method achieves the best performance on the challenging MSCOCO and MPII datasets.
arXiv Detail & Related papers (2020-10-26T02:10:12Z)
- AutoPose: Searching Multi-Scale Branch Aggregation for Pose Estimation [96.29533512606078]
We present AutoPose, a novel neural architecture search (NAS) framework.
It is capable of automatically discovering multiple parallel branches of cross-scale connections towards accurate and high-resolution 2D human pose estimation.
arXiv Detail & Related papers (2020-08-16T22:27:43Z)
- Suppress and Balance: A Simple Gated Network for Salient Object Detection [89.88222217065858]
We propose a simple gated network (GateNet) to solve both issues at once.
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
In addition, we adopt the atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) to accurately localize salient objects of various scales.
arXiv Detail & Related papers (2020-07-16T02:00:53Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The distributed features in each layer carry both semantics and salient details from all other layers simultaneously, and lose less important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.