Detecting Small Objects in Thermal Images Using Single-Shot Detector
- URL: http://arxiv.org/abs/2108.11101v1
- Date: Wed, 25 Aug 2021 07:54:36 GMT
- Title: Detecting Small Objects in Thermal Images Using Single-Shot Detector
- Authors: Hao Zhang, Xianggong Hong, and Li Zhu
- Abstract summary: SSD (Single Shot Multibox Detector) is one of the most successful object detectors for its high accuracy and fast speed.
In this paper, we propose an enhanced SSD with a novel feature fusion module that improves on SSD for small object detection.
- Score: 12.72157936831052
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: SSD (Single Shot Multibox Detector) is one of the most successful object
detectors thanks to its high accuracy and fast speed. However, the features from
the shallow layers of SSD (mainly Conv4_3) lack semantic information, resulting in
poor performance on small objects. In this paper, we propose DDSSD (Dilation
and Deconvolution Single Shot Multibox Detector), an enhanced SSD with a novel
feature fusion module that improves on SSD for small object detection. In the
feature fusion module, a dilated convolution module enlarges the receptive
field of features from the shallow layers, and a deconvolution module
increases the size of feature maps from the high layers. Our network achieves
79.7% mAP on PASCAL VOC2007 test and 28.3% mmAP on MS COCO test-dev at 41 FPS
with only 300x300 input using a single Nvidia 1080 GPU. For small objects in
particular, DDSSD achieves 10.5% on MS COCO and 22.8% on the FLIR thermal
dataset, outperforming many state-of-the-art object detection algorithms in
both accuracy and speed.
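The abstract says the fusion module enlarges the receptive field of shallow features with dilated convolution and upsamples deeper feature maps with deconvolution. The size arithmetic behind that idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the kernel, padding, stride, and dilation values are hypothetical, since the abstract does not give them. The 38x38 and 19x19 map sizes are those of the standard SSD300 (VGG-16) layout and are also not stated in the abstract.

```python
def effective_kernel(kernel, dilation):
    """Extent covered by a dilated kernel along one axis."""
    return dilation * (kernel - 1) + 1


def conv_out_size(n, kernel, stride=1, padding=0, dilation=1):
    """Output size of a (possibly dilated) convolution along one axis."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def deconv_out_size(n, kernel, stride=1, padding=0, output_padding=0):
    """Output size of a transposed convolution (dilation 1) along one axis."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding


# Feature map sizes of the standard SSD300 layout (assumed, not from the abstract).
shallow = 38  # Conv4_3
deep = 19     # next, deeper prediction layer

# Hypothetical hyperparameters chosen only so the two branches line up.
dilated = conv_out_size(shallow, kernel=3, padding=2, dilation=2)
upsampled = deconv_out_size(deep, kernel=2, stride=2)

print(effective_kernel(3, 2))  # a 3x3 kernel with dilation 2 spans 5x5
print(dilated, upsampled)      # both branches yield 38, so they can be fused
```

With these assumed settings, the dilated branch keeps the 38x38 resolution while each output sees a 5x5 window instead of 3x3, and the deconvolution branch doubles 19x19 to 38x38, so the two maps match in size and can be combined element-wise or by concatenation.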
Related papers
- 3D Small Object Detection with Dynamic Spatial Pruning [62.72638845817799]
We propose an efficient feature pruning strategy for 3D small object detection.
We present a multi-level 3D detector named DSPDet3D which benefits from high spatial resolution.
It takes less than 2s to directly process a whole building consisting of more than 4500k points while detecting almost all objects.
arXiv Detail & Related papers (2023-05-05T17:57:04Z)
- Fewer is More: Efficient Object Detection in Large Aerial Images [59.683235514193505]
This paper presents an Objectness Activation Network (OAN) to help detectors focus on fewer patches but achieve more efficient inference and more accurate results.
Using OAN, all five detectors acquire more than 30.0% speed-up on three large-scale aerial image datasets.
We extend our OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively.
arXiv Detail & Related papers (2022-12-26T12:49:47Z)
- Precise Single-stage Detector [2.2719729705587155]
We propose a modified version of the Single Shot Multibox Detector (SSD), named Precise Single Stage Detector (PSSD).
arXiv Detail & Related papers (2022-10-09T12:58:37Z)
- DPNet: Dual-Path Network for Real-time Object Detection with Lightweight Attention [15.360769793764526]
This paper presents a dual-path network, named DPNet, with a lightweight attention scheme for real-time object detection.
DPNet achieves a state-of-the-art trade-off between detection accuracy and implementation efficiency.
arXiv Detail & Related papers (2022-09-28T09:11:01Z)
- ETAD: A Unified Framework for Efficient Temporal Action Detection [70.21104995731085]
Untrimmed video understanding, such as temporal action detection (TAD), often suffers from a huge demand for computing resources.
We build a unified framework for efficient end-to-end temporal action detection (ETAD).
ETAD achieves state-of-the-art performance on both THUMOS-14 and ActivityNet-1.3.
arXiv Detail & Related papers (2022-05-14T21:16:21Z)
- SALISA: Saliency-based Input Sampling for Efficient Video Object Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection.
We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
- EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
The EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days.
We propose a kernel reusing technique by sharing the weights of candidate operations on one edge and consolidating them into one convolution.
In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on COCO test-dev set.
arXiv Detail & Related papers (2022-03-21T05:56:12Z)
- VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion [62.24001258298076]
VPFNet is a new architecture that cleverly aligns and aggregates the point cloud and image data at the 'virtual' points.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking 1st since May 21st, 2021.
arXiv Detail & Related papers (2021-11-29T08:51:20Z)
- Small Object Detection Based on Modified FSSD and Model Compression [7.387639662781843]
This paper proposes a small object detection algorithm based on FSSD.
In order to reduce the computational cost and storage space, pruning is carried out to achieve model compression.
The mean average precision (mAP) of the algorithm reaches 80.4% on PASCAL VOC, and the speed is 59.5 FPS on a GTX 1080 Ti.
arXiv Detail & Related papers (2021-08-24T03:20:32Z)
- DeepSperm: A robust and real-time bull sperm-cell detection in densely populated semen videos [26.494850349599528]
This study proposes an architecture, called DeepSperm, that solves the challenges and is more accurate and faster than state-of-the-art architectures.
In our experiment, we achieve 86.91 mAP on the test dataset and a processing speed of 50.3 fps.
arXiv Detail & Related papers (2020-03-03T09:05:05Z)
- FSSD: Feature Fusion Single Shot Multibox Detector [8.016875965887815]
FSSD (Feature Fusion Single Shot Multibox Detector) is an enhanced SSD with a novel and lightweight feature fusion module.
Our network can achieve 82.7 mAP (mean average precision) at a speed of 65.8 FPS (frames per second) with an input size of 300$\times$300 using a single Nvidia 1080Ti GPU.
arXiv Detail & Related papers (2017-12-04T09:05:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.