EPBC-YOLOv8: An efficient and accurate improved YOLOv8 underwater detector based on an attention mechanism
- URL: http://arxiv.org/abs/2502.05788v1
- Date: Sun, 09 Feb 2025 06:09:56 GMT
- Title: EPBC-YOLOv8: An efficient and accurate improved YOLOv8 underwater detector based on an attention mechanism
- Authors: Xing Jiang, Xiting Zhuang, Jisheng Chen, Jian Zhang
- Abstract summary: We enhance underwater target detection by integrating channel and spatial attention into YOLOv8's backbone. Our framework addresses underwater image degradation, achieving mAP at 0.5 scores of 76.7 percent and 79.0 percent on the URPC2019 and URPC2020 datasets, respectively. These scores are 2.3 percent and 0.7 percent higher than the original YOLOv8, showcasing enhanced precision in detecting marine organisms.
- Score: 4.081096260595706
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we enhance underwater target detection by integrating channel and spatial attention into YOLOv8's backbone, applying Pointwise Convolution in FasterNeXt for the FasterPW model, and leveraging Weighted Concat in a BiFPN-inspired WFPN structure for improved cross-scale connections and robustness. Utilizing CARAFE for refined feature reassembly, our framework addresses underwater image degradation, achieving mAP at 0.5 scores of 76.7 percent and 79.0 percent on URPC2019 and URPC2020 datasets, respectively. These scores are 2.3 percent and 0.7 percent higher than the original YOLOv8, showcasing enhanced precision in detecting marine organisms.
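The abstract mentions a Weighted Concat inside a BiFPN-inspired WFPN but does not give its formula. The sketch below is an illustration only: it assumes BiFPN-style fast normalized fusion weights (ReLU, then division by the sum plus a small epsilon) applied to each input before concatenation. The function names, the epsilon value, and the use of flat Python lists in place of feature maps are hypothetical choices, not the authors' implementation.

```python
from typing import List, Sequence


def normalized_fusion_weights(raw: Sequence[float], eps: float = 1e-4) -> List[float]:
    """Assumed BiFPN-style fast normalized fusion: clip to >= 0, then normalize."""
    clipped = [max(0.0, w) for w in raw]  # ReLU keeps every weight non-negative
    total = sum(clipped) + eps            # eps avoids division by zero
    return [w / total for w in clipped]


def weighted_concat(features: Sequence[Sequence[float]],
                    raw_weights: Sequence[float]) -> List[float]:
    """Scale each feature vector by its normalized weight, then concatenate."""
    if len(features) != len(raw_weights):
        raise ValueError("need exactly one weight per input feature")
    weights = normalized_fusion_weights(raw_weights)
    fused: List[float] = []
    for feat, w in zip(features, weights):
        fused.extend(w * x for x in feat)  # concatenation along the channel axis
    return fused


if __name__ == "__main__":
    # Two toy "feature maps" flattened to vectors, with equal raw weights.
    print(weighted_concat([[1.0, 2.0], [3.0]], [1.0, 1.0]))
```

In a real detector the weights would be learnable parameters and the inputs would be tensors of matching spatial size; the toy lists here only show how normalization and concatenation compose.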
Related papers
- Adaptive Enhancement and Dual-Pooling Sequential Attention for Lightweight Underwater Object Detection with YOLOv10 [0.0]
This manuscript introduces a streamlined yet robust framework for underwater object detection, grounded in the YOLOv10 architecture. The proposed method integrates a Multi-Stage Adaptive Enhancement module to improve image quality and a Dual-Pooling Sequential Attention mechanism to strengthen multi-scale feature representation.
arXiv Detail & Related papers (2026-03-04T07:39:57Z)
- Denoising-Enhanced YOLO for Robust SAR Ship Detection [9.818917054838964]
CPN-YOLO is a high-precision ship detection framework built upon YOLOv8. First, we introduce a learnable large-kernel denoising module for input pre-processing. Second, we design a feature extraction enhancement strategy based on the attention mechanism to strengthen multi-scale modeling.
arXiv Detail & Related papers (2026-02-27T09:00:19Z)
- MRS-YOLO Railroad Transmission Line Foreign Object Detection Based on Improved YOLO11 and Channel Pruning [2.6795746856835785]
We propose an improved algorithm MRS-YOLO based on YOLO11. The mAP50 and mAP50:95 of the MRS-YOLO algorithm are improved to 94.8% and 86.4%, respectively.
arXiv Detail & Related papers (2025-10-12T11:38:09Z)
- Real-Time Fish Detection in Indonesian Marine Ecosystems Using Lightweight YOLOv10-nano Architecture [0.0]
This study explores the implementation of YOLOv10-nano, a state-of-the-art deep learning model, for real-time marine fish detection in Indonesian waters. YOLOv10's architecture, featuring improvements like the CSPNet backbone, PAN for feature fusion, and Pyramid Spatial Attention Block, enables efficient and accurate object detection. Results show that YOLOv10-nano achieves a high detection accuracy with mAP50 of 0.966 and mAP50:95 of 0.606 while maintaining low computational demand.
arXiv Detail & Related papers (2025-09-22T07:02:48Z)
- UAV Individual Identification via Distilled RF Fingerprints-Based LLM in ISAC Networks [60.16924915676577]
Unmanned aerial vehicle (UAV) individual (ID) identification is a critical security surveillance strategy in low-altitude integrated sensing and communication (ISAC) networks. We propose a novel dynamic knowledge distillation (KD)-enabled wireless radio frequency fingerprint large language model (RFF-LLM) framework for UAV ID identification. Experiment results show that the proposed framework achieves 98.38% ID identification accuracy with merely 0.15 million parameters and 2.74 ms response time.
arXiv Detail & Related papers (2025-08-18T03:14:44Z)
- An Improved YOLOv8 Approach for Small Target Detection of Rice Spikelet Flowering in Field Environments [1.0288898584996287]
This study proposes a rice spikelet flowering recognition method based on an improved YOLOv8 object detection model. BiFPN replaces the original PANet structure to enhance feature fusion and improve multi-scale feature utilization. Given the lack of publicly available datasets for rice spikelet flowering in field conditions, a high-resolution RGB camera and data augmentation techniques are used.
arXiv Detail & Related papers (2025-07-28T04:01:29Z)
- Joint Multi-Target Detection-Tracking in Cognitive Massive MIMO Radar via POMCP [42.99053410696693]
This correspondence presents a power-aware cognitive radar framework for joint detection and tracking of multiple targets. Building on a previous single-target algorithm based on Partially Observable Monte Carlo Planning (POMCP), we extend it to the multi-target case by assigning each target an independent POMCP tree. The proposed framework for the cognitive radar improves detection probability for low-SNR targets and achieves more accurate tracking compared to approaches using uniform or waveforms.
arXiv Detail & Related papers (2025-07-23T13:43:29Z)
- YOLO-APD: Enhancing YOLOv8 for Robust Pedestrian Detection on Complex Road Geometries [0.0]
This paper introduces YOLO-APD, a novel deep learning architecture enhancing the YOLOv8 framework specifically for this challenge. YOLO-APD achieves state-of-the-art detection accuracy, reaching 77.7% mAP@0.5:0.95 and exceptional pedestrian recall exceeding 96%. It maintains real-time processing capabilities at 100 FPS, showcasing a superior balance between accuracy and efficiency.
arXiv Detail & Related papers (2025-07-07T18:03:40Z)
- YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception [44.76134548023668]
We propose YOLOv13, an accurate and lightweight object detector. We propose a Hypergraph-based Adaptive Correlation Enhancement (HyperACE) mechanism. We also propose a Full-Pipeline Aggregation-and-Distribution (FullPAD) paradigm.
arXiv Detail & Related papers (2025-06-21T15:15:03Z)
- You Sense Only Once Beneath: Ultra-Light Real-Time Underwater Object Detection [2.5249064981269296]
We propose an Ultra-Light Real-Time Underwater Object Detection framework, You Sense Only Once Beneath (YSOOB).
Specifically, we utilize a Multi-Spectrum Wavelet (MSWE) to perform frequency-domain encoding on the input image, minimizing the semantic loss caused by underwater optical color distortion.
We also eliminate model redundancy through a simple yet effective channel compression and reconstructed large kernel convolution (RLKC) to make the model lightweight.
arXiv Detail & Related papers (2025-04-22T08:26:35Z)
- TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion [54.46664104437454]
We propose TacoDepth, an efficient and accurate Radar-Camera depth estimation model with one-stage fusion.
Specifically, the graph-based Radar structure extractor and the pyramid-based Radar fusion module are designed.
Compared with the previous state-of-the-art approach, TacoDepth improves depth accuracy and processing speed by 12.8% and 91.8%.
arXiv Detail & Related papers (2025-04-16T05:25:04Z)
- YOLO-LLTS: Real-Time Low-Light Traffic Sign Detection via Prior-Guided Enhancement and Multi-Branch Feature Interaction [45.79993863157494]
YOLO-LLTS is an end-to-end real-time traffic sign detection algorithm specifically designed for low-light environments.
We introduce the High-Resolution Feature Map for Small Object Detection (HRFM-TOD) module to address indistinct small-object features in low-light scenarios.
Secondly, we develop the Multi-branch Feature Interaction Attention (MFIA) module, which facilitates deep feature interaction across multiple receptive fields.
arXiv Detail & Related papers (2025-03-18T04:28:05Z)
- A Light Perspective for 3D Object Detection [46.23578780480946]
This paper introduces a novel approach that incorporates cutting-edge Deep Learning techniques into the feature extraction process.
Our model, NextBEV, surpasses established feature extractors like ResNet50 and MobileNetV3.
By fusing these lightweight proposals, we have enhanced the accuracy of the VoxelNet-based model by 2.93% and improved the F1-score of the PointPillar-based model by approximately 20%.
arXiv Detail & Related papers (2025-03-10T10:03:23Z)
- YOLO-PRO: Enhancing Instance-Specific Object Detection with Full-Channel Global Self-Attention [38.97680747773625]
This paper addresses the inherent limitations of conventional bottleneck structures in object detection frameworks. It proposes two novel modules: the Instance-Specific Bottleneck with full-channel global self-attention (ISB) and the Instance-Specific Asymmetric Decoupled Head (ISADH). Experiments on the MS-COCO benchmark demonstrate that the coordinated deployment of ISB and ISADH in the YOLO-PRO framework achieves state-of-the-art performance across all computational scales.
arXiv Detail & Related papers (2025-03-04T07:17:02Z)
- A method for detecting dead fish on large water surfaces based on improved YOLOv10 [0.6874745415692134]
Dead fish can cause significant issues such as water quality deterioration, ecosystem damage, and disease transmission.
This paper proposes an end-to-end detection model built upon an enhanced YOLOv10 framework.
arXiv Detail & Related papers (2024-08-31T08:43:37Z)
- Fall Detection for Industrial Setups Using YOLOv8 Variants [0.0]
The YOLOv8m model, consisting of 25.9 million parameters and 79.1 GFLOPs, demonstrated a respectable balance between computational efficiency and detection performance.
Although the YOLOv8l and YOLOv8x models presented higher precision and recall, their higher computational demands and model size make them less suitable for resource-constrained environments.
arXiv Detail & Related papers (2024-08-08T17:24:54Z)
- Cycle-YOLO: A Efficient and Robust Framework for Pavement Damage Detection [13.221462950649467]
An enhanced pavement damage detection method with CycleGAN and improved YOLOv5 algorithm is presented.
Our algorithm achieved a precision of 0.872, recall of 0.854, and mean average precision@0.5 of 0.882 in detecting three main types of pavement damage: cracks, potholes, and patching.
arXiv Detail & Related papers (2024-05-28T07:27:42Z)
- DiffNAS: Bootstrapping Diffusion Models by Prompting for Better Architectures [63.12993314908957]
We propose a base model search approach, denoted "DiffNAS".
We leverage GPT-4 as a supernet to expedite the search, supplemented with a search memory to enhance the results.
Rigorous experimentation corroborates that our algorithm can augment the search efficiency by 2 times under GPT-based scenarios.
arXiv Detail & Related papers (2023-10-07T09:10:28Z)
- Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework [61.34862133870934]
We propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions.
Our scheme improves IOU by 21.96% across a wide array of corruptions and by 4.97% on the general benchmark.
arXiv Detail & Related papers (2023-09-03T06:35:07Z)
- AMSP-UOD: When Vortex Convolution and Stochastic Perturbation Meet Underwater Object Detection [40.532331552038485]
We present a novel Amplitude-Modulated Perturbation and Vortex Convolutional Network, AMSP-UOD.
AMSP-UOD addresses the impact of non-ideal imaging factors on detection accuracy in complex underwater environments.
Our method outperforms existing state-of-the-art methods in terms of accuracy and noise immunity.
arXiv Detail & Related papers (2023-08-23T05:03:45Z)
- Industrial Anomaly Detection and Localization Using Weakly-Supervised Residual Transformers [44.344548601242444]
We introduce a novel framework, Weakly-supervised RESidual Transformer (WeakREST), to achieve high anomaly detection accuracy. We reformulate the pixel-wise anomaly localization task into a block-wise classification problem. We develop a novel ResMixMatch algorithm, capable of handling the interplay between weak labels and residual-based representations.
arXiv Detail & Related papers (2023-06-06T08:19:30Z)
- DeepSeaNet: Improving Underwater Object Detection using EfficientDet [0.0]
This project involves implementing and evaluating various object detection models on an annotated underwater dataset.
The dataset comprises annotated image sequences of fish, crabs, starfish, and other aquatic animals captured in Limfjorden water with limited visibility.
I compare the results of YOLOv3 (31.10% mean Average Precision (mAP)), YOLOv4 (83.72% mAP), YOLOv5 (97.6%), YOLOv8 (98.20%), EfficientDet (98.56% mAP) and Detectron2 (95.20% mAP) on the same dataset.
arXiv Detail & Related papers (2023-05-26T13:41:35Z)
- Underwater target detection based on improved YOLOv7 [7.264267222876267]
This study proposes an improved YOLOv7 network (YOLOv7-AC) for underwater target detection.
The proposed network utilizes an ACmixBlock module to replace the 3x3 convolution block in the E-ELAN structure.
A ResNet-ACmix module is designed to avoid feature information loss and reduce computation.
arXiv Detail & Related papers (2023-02-14T09:50:52Z)
- Simple Training Strategies and Model Scaling for Object Detection [38.27709720726833]
We benchmark improvements on the vanilla ResNet-FPN backbone with RetinaNet and RCNN detectors.
The vanilla detectors are improved by 7.7% in accuracy while being 30% faster in speed.
Our largest Cascade RCNN-RS models achieve 52.9% AP with a ResNet152-FPN backbone and 53.6% with a SpineNet143L backbone.
arXiv Detail & Related papers (2021-06-30T18:41:47Z)
- TDAF: Top-Down Attention Framework for Vision Tasks [46.14128665926765]
We propose the Top-Down Attention Framework (TDAF) to capture top-down attentions.
Empirical evidence shows that our TDAF can capture effective stratified attention information and boost performance.
arXiv Detail & Related papers (2020-12-14T04:19:13Z)
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)