SSPNet: Scale Selection Pyramid Network for Tiny Person Detection from
UAV Images
- URL: http://arxiv.org/abs/2107.01548v1
- Date: Sun, 4 Jul 2021 05:46:41 GMT
- Title: SSPNet: Scale Selection Pyramid Network for Tiny Person Detection from
UAV Images
- Authors: Mingbo Hong, Shuiwang Li, Yuchao Yang, Feiyu Zhu, Qijun Zhao and Li Lu
- Abstract summary: We propose a Scale Selection Pyramid network (SSPNet) for tiny person detection.
SSPNet consists of three components: Context Attention Module (CAM), Scale Enhancement Module (SEM), and Scale Selection Module (SSM).
Our method outperforms other state-of-the-art (SOTA) detectors.
- Score: 10.439155825343517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the increasing demand for search and rescue, there is a strong need to
detect objects of interest in large-scale images captured by Unmanned Aerial
Vehicles (UAVs), which is quite challenging due to the extremely small scales of
objects. Most existing methods employ Feature Pyramid Network (FPN) to enrich
shallow layers' features by combining deep layers' contextual features. However,
under the limitation of the inconsistency in gradient computation across
different layers, the shallow layers in FPN are not fully exploited to detect
tiny objects. In this paper, we propose a Scale Selection Pyramid network
(SSPNet) for tiny person detection, which consists of three components: Context
Attention Module (CAM), Scale Enhancement Module (SEM), and Scale Selection
Module (SSM). CAM takes account of context information to produce hierarchical
attention heatmaps. SEM highlights features of specific scales at different
layers, leading the detector to focus on objects of specific scales instead of
vast backgrounds. SSM exploits adjacent layers' relationships to fulfill
suitable feature sharing between deep layers and shallow layers, thereby
avoiding the inconsistency in gradient computation across different layers.
Besides, we propose a Weighted Negative Sampling (WNS) strategy to guide the
detector to select more representative samples. Experiments on the TinyPerson
benchmark show that our method outperforms other state-of-the-art (SOTA)
detectors.
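The abstract describes the modules only in words, so the following is a minimal sketch of how attention-guided enhancement and gated top-down sharing could differ from a plain FPN merge. Everything in it is an assumption made for illustration: the single-channel nested-list feature maps, the helper names (`upsample2x`, `scale_enhance`, `scale_select_merge`, `plain_fpn_merge`), and the multiplicative forms `feature * (1 + attention)` and `upsampled_deep * attention + shallow`. It is not the authors' implementation.

```python
# Illustrative sketch only. Feature maps are single-channel 2D nested lists.
# The formulas are ASSUMED for illustration; the abstract does not state them.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out

def scale_enhance(feature, attention):
    """SEM-like step (assumed form): amplify positions the heatmap marks."""
    return [[f * (1.0 + a) for f, a in zip(frow, arow)]
            for frow, arow in zip(feature, attention)]

def plain_fpn_merge(deep, shallow):
    """Standard FPN top-down merge: upsample the deep map and add it."""
    up = upsample2x(deep)
    return [[u + s for u, s in zip(urow, srow)]
            for urow, srow in zip(up, shallow)]

def scale_select_merge(deep, shallow, shallow_attention):
    """SSM-like step (assumed form): the shallow layer's heatmap gates
    which upsampled deep features are passed down before the addition."""
    up = upsample2x(deep)
    return [[u * a + s for u, a, s in zip(urow, arow, srow)]
            for urow, arow, srow in zip(up, shallow_attention, shallow)]

# Toy pyramid: a 1x1 deep map and a 2x2 shallow map (hypothetical values).
deep = [[4.0]]
shallow = [[1.0, 1.0], [1.0, 1.0]]
deep_attention = [[0.5]]
shallow_attention = [[1.0, 0.0], [0.0, 0.0]]  # one location holds a tiny object

deep_enh = scale_enhance(deep, deep_attention)
shallow_enh = scale_enhance(shallow, shallow_attention)

fpn_out = plain_fpn_merge(deep, shallow)
ssp_out = scale_select_merge(deep_enh, shallow_enh, shallow_attention)

print(fpn_out)  # deep context is added at every shallow position
print(ssp_out)  # deep context reaches only the position the heatmap selects
```

In this toy case the plain merge spreads the deep feature over the whole shallow map, while the gated merge passes it only to the position the shallow heatmap marks, which is the kind of selective sharing between adjacent layers that the abstract attributes to SSM.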
Related papers
- HCF-Net: Hierarchical Context Fusion Network for Infrared Small Object Detection [16.92362922379821]
We propose a deep learning method to improve infrared small object detection performance.
The method includes the parallelized patch-aware attention (PPA) module, dimension-aware selective integration (DASI) module, and multi-dilated channel refiner (MDCR) module.
arXiv Detail & Related papers (2024-03-16T02:45:42Z)
- AMANet: Advancing SAR Ship Detection with Adaptive Multi-Hierarchical Attention Network [0.5437298646956507]
A novel adaptive multi-hierarchical attention module (AMAM) is proposed to learn multi-scale features and adaptively aggregate salient features from various feature layers.
We first fuse information from adjacent feature layers to enhance the detection of smaller targets, thereby achieving multi-scale feature enhancement.
Thirdly, we present a novel adaptive multi-hierarchical attention network (AMANet) by embedding the AMAM between the backbone network and the feature pyramid network.
arXiv Detail & Related papers (2024-01-24T03:56:33Z)
- De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z)
- Discovery-and-Selection: Towards Optimal Multiple Instance Learning for Weakly Supervised Object Detection [86.86602297364826]
We propose a discovery-and-selection approach fused with multiple instance learning (DS-MIL).
Our proposed DS-MIL approach can consistently improve the baselines, reporting state-of-the-art performance.
arXiv Detail & Related papers (2021-10-18T07:06:57Z)
- MRDet: A Multi-Head Network for Accurate Oriented Object Detection in Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z)
- Dense Multiscale Feature Fusion Pyramid Networks for Object Detection in UAV-Captured Images [0.09065034043031667]
We propose a novel method called Dense Multiscale Feature Fusion Pyramid Networks (DMFFPN), which aims to obtain features that are as rich as possible.
Specifically, the dense connection is designed to fully utilize the representation from the different convolutional layers.
Experiments on the drone-based dataset VisDrone-DET suggest that our method performs competitively.
arXiv Detail & Related papers (2020-12-19T10:05:31Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- Extended Feature Pyramid Network for Small Object Detection [20.029591259254847]
We propose extended feature pyramid network (EFPN) with an extra high-resolution pyramid level specialized for small object detection.
Specifically, we design a novel module, named feature texture transfer (FTT), which is used to super-resolve features and extract credible regional details simultaneously.
In our experiments, the proposed EFPN is efficient in both computation and memory, and yields state-of-the-art results.
arXiv Detail & Related papers (2020-03-16T04:27:54Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The features distributed to each layer carry both semantics and salient details from all other layers simultaneously, and lose less important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
- NETNet: Neighbor Erasing and Transferring Network for Better Single Shot Object Detection [170.30694322460045]
We propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.
A single-shot network called NETNet is constructed for scale-aware object detection.
arXiv Detail & Related papers (2020-01-18T15:21:29Z)