YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5
- URL: http://arxiv.org/abs/2402.14309v2
- Date: Mon, 29 Jul 2024 01:48:25 GMT
- Title: YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5
- Authors: Chun-Lin Ji, Tao Yu, Peng Gao, Fei Wang, Ru-Yue Yuan
- Abstract summary: YOLO-TLA is an advanced object detection model building on YOLOv5.
We first introduce an additional detection layer for small objects in the neck network pyramid architecture.
The C3CrossCovn module, integrated into the backbone network, uses sliding window feature extraction, which reduces both computational demand and the number of parameters.
- Score: 19.388112026410045
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Object detection, a crucial aspect of computer vision, has seen significant advancements in accuracy and robustness. Despite these advancements, practical applications still face notable challenges, primarily the inaccurate detection or missed detection of small objects. In this paper, we propose YOLO-TLA, an advanced object detection model building on YOLOv5. We first introduce an additional detection layer for small objects in the neck network pyramid architecture, thereby producing a feature map of a larger scale to discern finer features of small objects. Further, we integrate the C3CrossCovn module into the backbone network. This module uses sliding window feature extraction, which effectively minimizes both computational demand and the number of parameters, rendering the model more compact. Additionally, we have incorporated a global attention mechanism into the backbone network. This mechanism combines the channel information with global information to create a weighted feature map. This feature map is tailored to highlight the attributes of the object of interest, while effectively ignoring irrelevant details. In comparison to the baseline YOLOv5s model, our newly developed YOLO-TLA model has shown considerable improvements on the MS COCO validation dataset, with increases of 4.6% in mAP@0.5 and 4% in mAP@0.5:0.95, all while keeping the model size compact at 9.49M parameters. Further extending these improvements to the YOLOv5m model, the enhanced version exhibited a 1.7% and 1.9% increase in mAP@0.5 and mAP@0.5:0.95, respectively, with a total of 27.53M parameters. These results validate the YOLO-TLA model's efficient and effective performance in small object detection, achieving high accuracy with fewer parameters and computational demands.
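The parameter saving claimed for the C3CrossCovn module can be illustrated with simple arithmetic. The abstract does not describe the module's internals, so the sketch below ASSUMES it follows the cross-convolution pattern (a 1xk convolution followed by a kx1 convolution) and compares its weight count with a single kxk convolution. The function names, the channel width of 128, and the kernel size of 3 are hypothetical choices for illustration, not values taken from the paper.

```python
def conv_weights(c_in: int, c_out: int, kh: int, kw: int) -> int:
    """Weight count of a bias-free 2D convolution with a kh x kw kernel."""
    return c_in * c_out * kh * kw


def full_conv_weights(channels: int, k: int) -> int:
    """One standard k x k convolution, channels -> channels."""
    return conv_weights(channels, channels, k, k)


def cross_conv_weights(channels: int, k: int) -> int:
    """ASSUMED cross-convolution: a 1 x k convolution followed by a k x 1 one.

    This factorisation is an assumption about the module, not a detail
    confirmed by the abstract. Normalisation layers are ignored.
    """
    return conv_weights(channels, channels, 1, k) + conv_weights(channels, channels, k, 1)


if __name__ == "__main__":
    c, k = 128, 3  # hypothetical channel width and kernel size
    full = full_conv_weights(c, k)
    cross = cross_conv_weights(c, k)
    print(f"k x k conv:     {full} weights")
    print(f"1xk + kx1 pair: {cross} weights ({cross / full:.2f} of the full conv)")
```

Under this assumption the pair needs 2k weights per input-output channel pair instead of k squared, so the saving grows with kernel size.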
Related papers
- SL-YOLO: A Stronger and Lighter Drone Target Detection Model [0.0]
This paper proposes a revolutionary model SL-YOLO (Stronger and Lighter YOLO) that aims to break the bottleneck of small target detection.
We propose a pioneering cross-scale feature fusion method that can ensure unparalleled detection accuracy even in the most challenging environments.
Our experimental results on the VisDrone 2019 dataset reveal a significant improvement in performance, with mAP@0.5 jumping from 43.0% to 46.9%.
The model parameters are reduced from 11.1M to 9.6M, and the FPS can reach 132, making it an ideal solution for real-time small object detection in resource-constrained environments.
arXiv Detail & Related papers (2024-11-18T11:26:11Z)
- LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection [0.0]
We focus on design choices of neural network architectures for efficient object detection based on FLOP.
We propose several optimizations to enhance the efficiency of YOLO-based models.
This paper contributes to a new scaling paradigm for object detection and YOLO-centric models called LeYOLO.
arXiv Detail & Related papers (2024-06-20T12:08:24Z)
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- HIC-YOLOv5: Improved YOLOv5 For Small Object Detection [2.4780916008623834]
An improved YOLOv5 model: HIC-YOLOv5 is proposed to address the aforementioned problems.
An involution block is adopted between the backbone and neck to increase channel information of the feature map.
Our result shows that HIC-YOLOv5 has improved mAP@[.5:.95] by 6.42% and mAP@0.5 by 9.38% on VisDrone 2019-DET dataset.
arXiv Detail & Related papers (2023-09-28T12:40:36Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches an accuracy of 50.6% AP50:95 and 69.8% AP50 on the MS COCO2017 dataset and 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS>=30) on edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z)
- YOLOSA: Object detection based on 2D local feature superimposed self-attention [13.307581544820248]
We propose a novel self-attention module, called 2D local feature superimposed self-attention, for the feature concatenation stage of the neck network.
Average precisions of 49.0% (66.2 FPS), 46.1% (80.6 FPS), and 39.1% (100 FPS) were obtained for large, medium, and small-scale models built using our proposed improvements.
arXiv Detail & Related papers (2022-06-23T16:49:21Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has 87% fewer parameters and almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- Evaluation of YOLO Models with Sliced Inference for Small Object Detection [0.0]
This work aims to benchmark the YOLOv5 and YOLOX models for small object detection.
The effects of sliced fine-tuning and sliced inference combined produced substantial improvement for all models.
arXiv Detail & Related papers (2022-03-09T15:24:30Z)
- Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)
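Several entries above report mAP@0.5 and mAP@0.5:0.95 (also written AP50 and AP50:95). Under the COCO convention, the first counts a detection as correct when its IoU with a ground-truth box is at least 0.5, while the second averages AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below only illustrates the thresholding step for a single box pair; it is not a full mAP evaluator, and the function names and example boxes are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 >= x1 and y2 >= y1

# The ten IoU thresholds averaged by mAP@0.5:0.95: 0.50, 0.55, ..., 0.95
COCO_IOU_THRESHOLDS: List[float] = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def box_area(box: Box) -> float:
    return (box[2] - box[0]) * (box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def thresholds_passed(pred: Box, gt: Box) -> List[float]:
    """IoU thresholds at which this prediction would count as a match."""
    score = iou(pred, gt)
    return [t for t in COCO_IOU_THRESHOLDS if score >= t]


if __name__ == "__main__":
    gt = (0.0, 0.0, 10.0, 10.0)     # hypothetical ground-truth box
    pred = (2.0, 0.0, 12.0, 10.0)   # hypothetical prediction, shifted by 2 px
    print(f"IoU = {iou(pred, gt):.3f}")
    print("matches at thresholds:", thresholds_passed(pred, gt))
```

A two-pixel shift on a ten-pixel box already fails the stricter thresholds, which is one reason small objects tend to score lower on mAP@0.5:0.95 than on mAP@0.5.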