Related papers: Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection

URL: http://arxiv.org/abs/2408.07455v1
Date: Wed, 14 Aug 2024 10:49:14 GMT
Title: Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection
Authors: Zhonglin Chen, Anyu Geng, Jianan Jiang, Jiwu Lu, Di Wu
Abstract summary: A new dataset named InfraTiny was constructed, in which more than 85% of the bounding boxes are smaller than 32x32 pixels (3218 images and a total of 20,893 bounding boxes). A multi-scale attention mechanism module (MSAM) and a Feature Fusion Augmentation Pyramid Module (FFAFPM) were proposed and deployed onto embedded devices. Integrating the proposed methods into the YOLO model yields Infra-YOLO, which improves infrared small object detection performance.
Score: 4.586010474241955
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Although convolutional neural networks have made outstanding achievements in visible-light target detection, infrared small object detection remains challenging because of the low signal-to-noise ratio, incomplete object structure, and the lack of a reliable infrared small object dataset. To address the dataset limitation, a new dataset named InfraTiny was constructed, in which more than 85% of the bounding boxes are smaller than 32x32 pixels (3218 images and a total of 20,893 bounding boxes). A multi-scale attention mechanism module (MSAM) and a Feature Fusion Augmentation Pyramid Module (FFAFPM) were proposed and deployed onto embedded devices. The MSAM enables the network to obtain scale perception information by acquiring different receptive fields, while suppressing background noise to enhance feature extraction. The proposed FFAFPM enriches semantic information and enhances the fusion of shallow and deep features, which significantly reduces false positives. Integrating the proposed methods into the YOLO model yields Infra-YOLO, which improves infrared small object detection performance. On the InfraTiny dataset, mAP@0.5 improves by 2.7% compared to YOLOv3 and by 2.5% compared to YOLOv4. The proposed Infra-YOLO was also transferred onto the embedded device in an unmanned aerial vehicle (UAV) for real application scenarios, where channel pruning is adopted to reduce FLOPs and achieve a tradeoff between speed and accuracy. Even when the parameters of Infra-YOLO are reduced by 88% with pruning, it still gains 0.7% on mAP@0.5 compared to YOLOv3 and 0.5% compared to YOLOv4. Experimental results show that the proposed MSAM and FFAFPM can improve infrared small object detection performance compared with the previous benchmark method.
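
The dataset statistic quoted in the abstract (the share of bounding boxes smaller than 32x32 pixels) can be illustrated with a short sketch. This is not the authors' tooling: the box format (x_min, y_min, x_max, y_max), the helper names, and the area-based reading of "smaller than 32x32" (area below 32 * 32 pixels, as in the common COCO "small object" convention) are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def is_small(box: Box, limit: int = 32) -> bool:
    """Return True when the box area is below limit * limit pixels.

    Assumption: "smaller than 32x32" is interpreted by area, not by
    requiring both sides to be under 32 pixels.
    """
    x_min, y_min, x_max, y_max = box
    width = max(0.0, x_max - x_min)
    height = max(0.0, y_max - y_min)
    return width * height < limit * limit


def small_box_fraction(boxes: Iterable[Box], limit: int = 32) -> float:
    """Fraction of boxes counted as small; 0.0 for an empty collection."""
    boxes = list(boxes)
    if not boxes:
        return 0.0
    return sum(is_small(b, limit) for b in boxes) / len(boxes)


# Made-up example boxes with areas 100, 500, 1600 and 400 pixels.
example_boxes = [
    (0, 0, 10, 10),
    (5, 5, 25, 30),
    (0, 0, 40, 40),
    (10, 10, 20, 50),
]
print(small_box_fraction(example_boxes))  # 0.75
```

Applied to the annotations of a real dataset, the same function would return the kind of figure the abstract reports, provided the annotation format and the size criterion match the assumptions above.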

Related papers

ISTD-YOLO: A Multi-Scale Lightweight High-Performance Infrared Small Target Detection Algorithm [0.3749861135832073]
ISTD-YOLO is a lightweight infrared small target detection algorithm based on improved YOLOv7. ISTD-YOLO can effectively improve the detection effect, and all indicators are effectively improved.
arXiv Detail & Related papers (2025-04-19T13:19:54Z)
Multi-Domain Biometric Recognition using Body Embeddings [51.36007967653781]
We show that body embeddings perform better than face embeddings in medium-wave infrared (MWIR) and long-wave infrared (LWIR) domains. We leverage a vision transformer architecture to establish benchmark results on the IJB-MDF dataset. We also show that finetuning a body model, pretrained exclusively on VIS data, with a simple combination of cross-entropy and triplet losses achieves state-of-the-art mAP scores.
arXiv Detail & Related papers (2025-03-13T22:38:18Z)
SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds. With the development of Transformer, the scale of SIRST models is constantly increasing. With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
SpirDet: Towards Efficient, Accurate and Lightweight Infrared Small Target Detector [60.42293239557962]
We propose SpirDet, a novel approach for efficient detection of infrared small targets. We employ a new dual-branch sparse decoder to restore the feature map. Extensive experiments show that the proposed SpirDet significantly outperforms state-of-the-art models.
arXiv Detail & Related papers (2024-02-08T05:06:14Z)
ILNet: Low-level Matters for Salient Infrared Small Target Detection [5.248337726304453]
Infrared small target detection is a technique for finding small targets from infrared clutter background. Due to the dearth of high-level semantic information, small infrared target features are weakened in the deep layers of the CNN. We propose an infrared low-level network (ILNet) that considers infrared small targets as salient areas with little semantic information.
arXiv Detail & Related papers (2023-09-24T14:09:37Z)
EFLNet: Enhancing Feature Learning for Infrared Small Target Detection [20.546186772828555]
Single-frame infrared small target detection is considered to be a challenging task. Due to the extreme imbalance between target and background, bounding box regression is extremely sensitive to infrared small target. We propose an enhancing feature learning network (EFLNet) to address these problems.
arXiv Detail & Related papers (2023-07-27T09:23:22Z)
Robust Environment Perception for Automated Driving: A Unified Learning Pipeline for Visual-Infrared Object Detection [2.478658210785]
We exploit both visual and thermal perception units for robust object detection purposes.
arXiv Detail & Related papers (2022-06-08T15:02:58Z)
A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection. YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation. YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
Infrared Small-Dim Target Detection with Transformer under Complex Backgrounds [155.388487263872]
We propose a new infrared small-dim target detection method with the transformer. We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range. We also design a feature enhancement module to learn more features of small-dim targets.
arXiv Detail & Related papers (2021-09-29T12:23:41Z)
EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation for many applications, such as autonomous driving and robotics. In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF). We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
Dense Nested Attention Network for Infrared Small Target Detection [36.654692765557726]
Single-frame infrared small target (SIRST) detection aims at separating small targets from clutter backgrounds. Existing CNN-based methods cannot be directly applied for infrared small targets. We propose a dense nested attention network (DNANet) in this paper.
arXiv Detail & Related papers (2021-06-01T13:45:35Z)
Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection. The distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
