Adaptive Enhancement and Dual-Pooling Sequential Attention for Lightweight Underwater Object Detection with YOLOv10
- URL: http://arxiv.org/abs/2603.03807v1
- Date: Wed, 04 Mar 2026 07:39:57 GMT
- Title: Adaptive Enhancement and Dual-Pooling Sequential Attention for Lightweight Underwater Object Detection with YOLOv10
- Authors: Md. Mushibur Rahman, Umme Fawzia Rahim, Enam Ahmed Taufik,
- Abstract summary: This manuscript introduces a streamlined yet robust framework for underwater object detection, grounded in the YOLOv10 architecture. The proposed method integrates a Multi-Stage Adaptive Enhancement module to improve image quality and a Dual-Pooling Sequential Attention mechanism to strengthen multi-scale feature representation.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Underwater object detection constitutes a pivotal endeavor within the realms of marine surveillance and autonomous underwater systems; however, it presents significant challenges due to pronounced visual impairments arising from phenomena such as light absorption, scattering, and diminished contrast. In response to these formidable challenges, this manuscript introduces a streamlined yet robust framework for underwater object detection, grounded in the YOLOv10 architecture. The proposed method integrates a Multi-Stage Adaptive Enhancement module to improve image quality, a Dual-Pooling Sequential Attention (DPSA) mechanism embedded into the backbone to strengthen multi-scale feature representation, and a Focal Generalized IoU Objectness (FGIoU) loss to jointly improve localization accuracy and objectness prediction under class imbalance. Comprehensive experimental evaluations conducted on the RUOD and DUO benchmark datasets substantiate that the proposed DPSA_FGIoU_YOLOv10n attains exceptional performance, achieving mean Average Precision (mAP) scores of 88.9% and 88.0% at IoU threshold 0.5, respectively. In comparison to the baseline YOLOv10n, this represents enhancements of 6.7% for RUOD and 6.2% for DUO, all while preserving a compact model architecture comprising merely 2.8M parameters. These findings validate that the proposed framework establishes an efficacious equilibrium among accuracy, robustness, and real-time operational efficiency, making it suitable for deployment in resource-constrained underwater settings.
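The abstract names an FGIoU loss but does not reproduce its focal weighting. As background, the Generalized IoU (GIoU) term it builds on is standard and can be sketched in plain Python (an illustration, not the authors' code):

```python
def giou(box_a, box_b):
    """Generalized IoU for axis-aligned boxes given as [x1, y1, x2, y2].

    GIoU = IoU - |C minus (A union B)| / |C|, where C is the smallest box
    enclosing both A and B; unlike plain IoU it stays informative
    (negative) even when the boxes do not overlap.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area (zero if the boxes are disjoint)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Area of the smallest enclosing box C
    area_c = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (area_c - union) / area_c
```

A focal variant would presumably modulate the resulting 1 - GIoU loss with a factor that emphasizes hard examples, but the exact FGIoU formulation should be taken from the paper itself.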
Related papers
- Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage [65.51149575007149]
We present Fun-DDPS, a generative framework that combines function-space diffusion models with differentiable neural operator surrogates for both forward and inverse modeling. Fun-DDPS produces physically consistent realizations free from the high-frequency artifacts observed in joint-state baselines.
arXiv Detail & Related papers (2026-02-12T18:58:12Z) - YOLO-DS: Fine-Grained Feature Decoupling via Dual-Statistic Synergy Operator for Object Detection [55.58092342624062]
We propose YOLO-DS, a framework built around a novel Dual-Statistic Synergy Operator (DSO). YOLO-DS decouples object features by jointly modeling the channel-wise mean and the peak-to-mean difference. On the MS-COCO benchmark, YOLO-DS consistently outperforms YOLOv8 across five model scales.
arXiv Detail & Related papers (2026-01-26T05:50:32Z) - HAD: Hierarchical Asymmetric Distillation to Bridge Spatio-Temporal Gaps in Event-Based Object Tracking [80.07224739976911]
RGB cameras excel at capturing rich texture with high resolution, whereas event cameras offer exceptional temporal resolution and dynamic range.
arXiv Detail & Related papers (2025-10-22T13:15:13Z) - DoRAN: Stabilizing Weight-Decomposed Low-Rank Adaptation via Noise Injection and Auxiliary Networks [47.58150560549918]
Weight-Decomposed Low-Rank Adaptation (DoRA) has been shown to improve both the learning capacity and training stability of the vanilla Low-Rank Adaptation (LoRA) method. We propose DoRAN, a new variant of DoRA designed to further stabilize training and boost the sample efficiency of DoRA.
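For readers unfamiliar with the building blocks: LoRA adapts a frozen weight matrix with a low-rank additive update, and DoRA further re-parameterizes the adapted weight into per-column magnitude and direction. A minimal numpy sketch with illustrative shapes and initializations only, not the papers' implementations:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 8, 8, 2                       # output dim, input dim, LoRA rank
W = rng.standard_normal((d, k))         # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))                    # zero-init so W' = W at step 0
W_lora = W + B @ A                      # LoRA: low-rank additive update

# DoRA: decompose the adapted weight into magnitude m and unit direction V,
# then train m separately from the directional (LoRA-updated) part.
norms = np.linalg.norm(W_lora, axis=0)  # per-column L2 norm
V = W_lora / norms                      # unit-norm direction columns
m = norms.copy()                        # trainable magnitude vector
W_dora = m * V                          # recombine; equals W_lora before training
```

Before any training step the recombined weight matches the pretrained one exactly, which is the stability property LoRA's zero-initialized B is designed to give.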
arXiv Detail & Related papers (2025-10-05T19:27:48Z) - An Empirical Study on the Robustness of YOLO Models for Underwater Object Detection [5.084022830578536]
We present one of the first comprehensive evaluations of recent YOLO variants (YOLOv8-YOLOv12) across six simulated underwater environments. Our findings show that YOLOv12 delivers the strongest overall performance but is highly vulnerable to noise. Experiments revealed that image counts and instance frequency primarily drive detection performance, while object appearance exerts only a secondary influence.
arXiv Detail & Related papers (2025-09-22T10:55:21Z) - AquaFeat: A Features-Based Image Enhancement Model for Underwater Object Detection [0.0]
We propose AquaFeat, a novel, plug-and-play module that performs task-driven feature enhancement. Our approach integrates a multi-scale feature enhancement network trained end-to-end with the detector's loss function. When integrated with YOLOv8m on challenging underwater datasets, AquaFeat achieves state-of-the-art Precision (0.877) and Recall (0.624), along with competitive mAP scores (mAP@0.5 of 0.677 and mAP@[0.5:0.95] of 0.421).
arXiv Detail & Related papers (2025-08-17T12:22:18Z) - Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices. We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD). The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller).
arXiv Detail & Related papers (2025-07-14T16:19:00Z) - Improve Underwater Object Detection through YOLOv12 Architecture and Physics-informed Augmentation [0.20767168898581637]
Underwater object detection is crucial for autonomous navigation, environmental monitoring, and marine exploration. Current methods trade accuracy against computational efficiency and struggle to run in real time under low-visibility conditions. This study advances underwater detection through the integration of physics-informed augmentation techniques with the YOLOv12 architecture.
arXiv Detail & Related papers (2025-06-30T04:06:50Z) - EPBC-YOLOv8: An efficient and accurate improved YOLOv8 underwater detector based on an attention mechanism [4.081096260595706]
We enhance underwater target detection by integrating channel and spatial attention into YOLOv8's backbone. Our framework addresses underwater image degradation, achieving mAP@0.5 scores of 76.7% and 79.0% on two datasets. These scores are 2.3% and 0.7% higher than the original YOLOv8, showcasing enhanced precision in detecting marine organisms.
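As a rough illustration of what channel-plus-spatial attention of this kind looks like (in the spirit of CBAM-style modules; the exact EPBC design is not detailed here), with random weights standing in for learned parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, reduction=2, seed=0):
    """Gate channels of a (C, H, W) feature map using avg- and max-pooled
    descriptors passed through a shared two-layer MLP (weights here are
    random stand-ins for learned parameters)."""
    c = x.shape[0]
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # Linear-ReLU-Linear
    gate = sigmoid(mlp(x.mean(axis=(1, 2))) + mlp(x.max(axis=(1, 2))))
    return x * gate[:, None, None]

def spatial_attention(x):
    """Gate spatial positions using channel-wise avg and max maps; a real
    module would convolve the stacked maps, summed here for simplicity."""
    gate = sigmoid(x.mean(axis=0) + x.max(axis=0))
    return x * gate[None, :, :]

feat = np.random.default_rng(1).standard_normal((4, 8, 8))
out = spatial_attention(channel_attention(feat))  # channel first, then spatial
```

Because both gates are sigmoids in (0, 1), the module reweights features without changing the map's shape, which is what makes such blocks easy to drop into a YOLO backbone.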
arXiv Detail & Related papers (2025-02-09T06:09:56Z) - Source-Free Domain Adaptive Object Detection with Semantics Compensation [54.00183496587841]
We introduce Weak-to-strong Semantics Compensation (WSCo) for strong data augmentation. WSCo compensates for the class-relevant semantics that may be lost during strong augmentation on the fly. WSCo can be implemented as a generic plug-in, easily integrable with any existing SFOD pipelines.
arXiv Detail & Related papers (2024-10-07T23:32:06Z) - A Computer Vision Enabled damage detection model with improved YOLOv5 based on Transformer Prediction Head [0.0]
Current state-of-the-art deep learning (DL)-based damage detection models often lack superior feature extraction capability in complex and noisy environments.
DenseSPH-YOLOv5 is a real-time DL-based high-performance damage detection model where DenseNet blocks have been integrated with the backbone.
DenseSPH-YOLOv5 obtains a mean average precision (mAP) value of 85.25%, an F1-score of 81.18%, and a precision (P) value of 89.51%, outperforming current state-of-the-art models.
arXiv Detail & Related papers (2023-03-07T22:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.