SPMamba-YOLO: An Underwater Object Detection Network Based on Multi-Scale Feature Enhancement and Global Context Modeling
- URL: http://arxiv.org/abs/2602.22674v1
- Date: Thu, 26 Feb 2026 06:45:11 GMT
- Title: SPMamba-YOLO: An Underwater Object Detection Network Based on Multi-Scale Feature Enhancement and Global Context Modeling
- Authors: Guanghao Liao, Zhen Liu, Liyuan Cao, Yonghui Yang, Qi Li
- Abstract summary: We propose a novel underwater object detection network that integrates multi-scale feature enhancement with global context modeling. Experiments on the URPC2022 dataset demonstrate that the network outperforms the YOLOv8n baseline by more than 4.9% in mAP@0.5.
- Score: 12.390389688362506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Underwater object detection is a critical yet challenging research problem owing to severe light attenuation, color distortion, background clutter, and the small scale of underwater targets. To address these challenges, we propose SPMamba-YOLO, a novel underwater object detection network that integrates multi-scale feature enhancement with global context modeling. Specifically, a Spatial Pyramid Pooling Enhanced Layer Aggregation Network (SPPELAN) module is introduced to strengthen multi-scale feature aggregation and expand the receptive field, while a Pyramid Split Attention (PSA) mechanism enhances feature discrimination by emphasizing informative regions and suppressing background interference. In addition, a Mamba-based state space modeling module is incorporated to efficiently capture long-range dependencies and global contextual information, thereby improving detection robustness in complex underwater environments. Extensive experiments on the URPC2022 dataset demonstrate that SPMamba-YOLO outperforms the YOLOv8n baseline by more than 4.9% in mAP@0.5, particularly for small and densely distributed underwater objects, while maintaining a favorable balance between detection accuracy and computational cost.
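For readers unfamiliar with the mAP@0.5 metric quoted in the abstract: under the usual definition, a predicted box counts as a true positive when its intersection-over-union (IoU) with a not-yet-matched ground-truth box is at least 0.5. The sketch below illustrates only that matching step in plain Python. It is not the paper's evaluation code; the names `iou` and `match_at_iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 > x1, y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_at_iou(
    preds: List[Tuple[Box, float]], gts: List[Box], thr: float = 0.5
) -> List[bool]:
    """Flag each prediction as true positive (True) or false positive (False).

    Predictions are visited in descending confidence order, and each
    ground-truth box can be matched at most once (greedy matching).
    The returned flags follow that confidence-sorted order.
    """
    used = set()
    flags = []
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, thr
        for j, gt in enumerate(gts):
            if j in used:
                continue
            value = iou(box, gt)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            used.add(best_j)
            flags.append(True)
        else:
            flags.append(False)
    return flags
```

Average precision for one class is then computed from the precision-recall curve traced by these flags, and mAP@0.5 is the mean of that value over all classes.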
Related papers
- Small Object Detection in Complex Backgrounds with Multi-Scale Attention and Global Relation Modeling [8.24377869183113]
Small object detection under complex backgrounds is a challenging task due to severe feature degradation, weak semantic representation, and inaccurate localization. Existing detection frameworks are mainly designed for general objects. We propose a multi-level feature enhancement and global relation modeling framework tailored for small object detection.
arXiv Detail & Related papers (2026-03-04T06:57:46Z)
- An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research [0.0]
Traditional sea exploration faces significant challenges due to extreme conditions, limited visibility, and high costs. This paper presents an innovative AI-powered Autonomous Underwater Vehicle (AUV) system designed to overcome these limitations. The system integrates YOLOv12 Nano for real-time object detection, a Convolutional Neural Network (CNN) (ResNet50) for feature extraction, and K-Means++ clustering for grouping marine objects.
arXiv Detail & Related papers (2025-12-08T15:45:40Z)
- MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection [12.838872442435527]
Small object detection in UAV imagery is crucial for applications such as search-and-rescue, traffic monitoring, and environmental surveillance. Existing multi-scale fusion methods help, but add computational burden and blur fine details. We propose a unified fusion framework that tightly couples global context with local detail to boost detection performance.
arXiv Detail & Related papers (2025-06-15T02:54:25Z)
- VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field. The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets. We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE). Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z)
- ACMamba: Fast Unsupervised Anomaly Detection via An Asymmetrical Consensus State Space Model [51.83639270669481]
Unsupervised anomaly detection in hyperspectral images (HSI) aims to detect unknown targets from backgrounds. HSI studies are hindered by steep computational costs due to the high-dimensional property of HSI and dense sampling-based training paradigm. We propose an Asymmetrical Consensus State Space Model (ACMamba) to significantly reduce computational costs without compromising accuracy.
arXiv Detail & Related papers (2025-04-16T05:33:42Z)
- MSCA-Net: Multi-Scale Context Aggregation Network for Infrared Small Target Detection [0.1759252234439348]
This paper proposes a network architecture named MSCA-Net, which integrates three key components. MSEDA employs a multi-scale feature fusion attention mechanism to adaptively aggregate information across different scales. PCBAM captures the correlation between global and local features through a correlation matrix-based strategy. CAB enhances the representation of critical features by assigning greater weights to them, integrating both low-level and high-level information.
arXiv Detail & Related papers (2025-03-21T14:42:31Z)
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
Integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images [193.77450545067967]
We propose an end-to-end Dense Attention Fluid Network (DAFNet) for salient object detection in optical remote sensing images (RSIs).
A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships.
We construct a new and challenging optical RSI dataset for SOD that contains 2,000 images with pixel-wise saliency annotations.
arXiv Detail & Related papers (2020-11-26T06:14:10Z)
- Salient Object Detection Combining a Self-attention Module and a Feature Pyramid Network [10.81245352773775]
We propose a novel pyramid self-attention module (PSAM) and the adoption of an independent feature-complementing strategy.
In PSAM, self-attention layers are equipped after multi-scale pyramid features to capture richer high-level features and bring larger receptive fields to the model.
arXiv Detail & Related papers (2020-04-30T03:08:34Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)