YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection
- URL: http://arxiv.org/abs/2410.11727v1
- Date: Tue, 15 Oct 2024 16:00:01 GMT
- Title: YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection
- Authors: Olalekan Akindele, Joshua Atolagbe
- Abstract summary: Existing detection methods for insulator defect identification from unmanned aerial vehicles struggle with complex background scenes and small objects.
This paper proposes a new attention-based foundation architecture, YOLO-ELA, to address this issue.
Experimental results on high-resolution UAV images show that our method achieved a state-of-the-art performance of 96.9% mAP0.5 and a real-time detection speed of 74.63 frames per second.
- Abstract: Existing detection methods for insulator defect identification from unmanned aerial vehicles (UAVs) struggle with complex background scenes and small objects, leading to suboptimal accuracy and a high number of false-positive detections. Using the concept of local attention modeling, this paper proposes a new attention-based foundation architecture, YOLO-ELA, to address this issue. Efficient Local Attention (ELA) blocks were added to the neck of the one-stage YOLOv8 architecture to shift the model's attention from background features towards features of insulators with defects. The SCYLLA Intersection-Over-Union (SIoU) criterion function was used to reduce detection loss, accelerate model convergence, and increase the model's sensitivity towards small insulator defects, yielding more true positives. Because the dataset was limited, data augmentation techniques were used to increase its diversity. In addition, we leveraged a transfer learning strategy to improve the model's performance. Experimental results on high-resolution UAV images show that our method achieved a state-of-the-art performance of 96.9% mAP0.5 and a real-time detection speed of 74.63 frames per second, outperforming the baseline model. This further demonstrates the effectiveness of attention-based convolutional neural networks (CNNs) in object detection tasks.
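The two headline numbers in the abstract can be unpacked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes boxes given as (x1, y1, x2, y2) corner coordinates, and the helper names `iou`, `is_true_positive`, and `frame_time_ms` are hypothetical. It shows the IoU test that the "0.5" in mAP0.5 conventionally refers to (a prediction counts as a match when its IoU with a ground-truth box is at least 0.5) and converts the reported 74.63 frames per second into a per-frame time budget.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold


def frame_time_ms(fps):
    """Average time available per frame, in milliseconds."""
    return 1000.0 / fps


# Made-up boxes for illustration; they are not from the paper's dataset.
ground_truth = (10, 10, 50, 50)
loose_prediction = (20, 20, 60, 60)   # overlaps, but IoU is below 0.5
tight_prediction = (12, 12, 52, 52)   # IoU is above 0.5

print(is_true_positive(loose_prediction, ground_truth))
print(is_true_positive(tight_prediction, ground_truth))
print(round(frame_time_ms(74.63), 1))
```

At 74.63 frames per second, the detector spends about 13.4 ms on each frame on average.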
Related papers
- CCi-YOLOv8n: Enhanced Fire Detection with CARAFE and Context-Guided Modules [0.0]
Fire incidents in urban and forested areas pose serious threats.
We present CCi-YOLOv8n, an enhanced YOLOv8 model with targeted improvements for detecting small fires and smoke.
arXiv Detail & Related papers (2024-11-17T09:31:04Z)
- Optimizing YOLO Architectures for Optimal Road Damage Detection and Classification: A Comparative Study from YOLOv7 to YOLOv10 [0.0]
This paper presents a comprehensive workflow for road damage detection using deep learning models.
To accommodate hardware limitations, large images are cropped, and lightweight models are utilized.
The proposed approach employs multiple model architectures, including a custom YOLOv7 model with Coordinate Attention layers and a Tiny YOLOv7 model.
arXiv Detail & Related papers (2024-10-10T22:55:12Z)
- Spatial Transformer Network YOLO Model for Agricultural Object Detection [0.3124884279860061]
We propose a new method that integrates spatial transformer networks (STNs) into YOLO to improve performance.
The proposed STN-YOLO aims to enhance the model's effectiveness by focusing on important areas of the image.
We apply the STN-YOLO on benchmark datasets for Agricultural object detection as well as a new dataset from a state-of-the-art plant phenotyping greenhouse facility.
arXiv Detail & Related papers (2024-07-31T14:53:41Z)
- YOLO9tr: A Lightweight Model for Pavement Damage Detection Utilizing a Generalized Efficient Layer Aggregation Network and Attention Mechanism [0.0]
This paper proposes YOLO9tr, a novel lightweight object detection model for pavement damage detection.
YOLO9tr is based on the YOLOv9 architecture, incorporating a partial attention block that enhances feature extraction and attention mechanisms.
The model achieves a high frame rate of up to 136 FPS, making it suitable for real-time applications such as video surveillance and automated inspection systems.
arXiv Detail & Related papers (2024-06-17T06:31:43Z)
- Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
- Diffusion-Based Particle-DETR for BEV Perception [94.88305708174796]
Bird-Eye-View (BEV) is one of the most widely used scene representations for visual perception in Autonomous Vehicles (AVs).
Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively detect small objects in the large coverage of the BEV.
Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV.
arXiv Detail & Related papers (2023-12-18T09:52:14Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- A Computer Vision Enabled damage detection model with improved YOLOv5 based on Transformer Prediction Head [0.0]
Current state-of-the-art deep learning (DL)-based damage detection models often lack superior feature extraction capability in complex and noisy environments.
DenseSPH-YOLOv5 is a real-time DL-based high-performance damage detection model where DenseNet blocks have been integrated with the backbone.
DenseSPH-YOLOv5 obtains a mean average precision (mAP) of 85.25%, an F1-score of 81.18%, and a precision (P) of 89.51%, outperforming current state-of-the-art models.
arXiv Detail & Related papers (2023-03-07T22:53:36Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connections via both bypass and concatenation.
YOLO-S has 87% fewer parameters and almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)