Related papers: SPMamba-YOLO: An Underwater Object Detection Network Based on Multi-Scale Feature Enhancement and Global Context Modeling

SPMamba-YOLO: An Underwater Object Detection Network Based on Multi-Scale Feature Enhancement and Global Context Modeling

URL: http://arxiv.org/abs/2602.22674v1
Date: Thu, 26 Feb 2026 06:45:11 GMT
Title: SPMamba-YOLO: An Underwater Object Detection Network Based on Multi-Scale Feature Enhancement and Global Context Modeling
Authors: Guanghao Liao, Zhen Liu, Liyuan Cao, Yonghui Yang, Qi Li,
Abstract summary: We propose a novel underwater object detection network that integrates multi-scale feature enhancement with global context modeling.<n>Experiments on the URPC2022 dataset demonstrate that the network outperforms the YOLOv8n baseline by more than 4.9% in mAP@0.5.
Score: 12.390389688362506
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Underwater object detection is a critical yet challenging research problem owing to severe light attenuation, color distortion, background clutter, and the small scale of underwater targets. To address these challenges, we propose SPMamba-YOLO, a novel underwater object detection network that integrates multi-scale feature enhancement with global context modeling. Specifically, a Spatial Pyramid Pooling Enhanced Layer Aggregation Network (SPPELAN) module is introduced to strengthen multi-scale feature aggregation and expand the receptive field, while a Pyramid Split Attention (PSA) mechanism enhances feature discrimination by emphasizing informative regions and suppressing background interference. In addition, a Mamba-based state space modeling module is incorporated to efficiently capture long-range dependencies and global contextual information, thereby improving detection robustness in complex underwater environments. Extensive experiments on the URPC2022 dataset demonstrate that SPMamba-YOLO outperforms the YOLOv8n baseline by more than 4.9\% in mAP@0.5, particularly for small and densely distributed underwater objects, while maintaining a favorable balance between detection accuracy and computational cost.

Related papers

Small Object Detection in Complex Backgrounds with Multi-Scale Attention and Global Relation Modeling [8.24377869183113]
Small object detection under complex backgrounds is a challenging task due to severe feature degradation, weak semantic representation, and inaccurate localization.<n>Existing detection frameworks are mainly designed for general objects.<n>We propose a multi-level feature enhancement and global relation modeling framework tailored for small object detection.
arXiv Detail & Related papers (2026-03-04T06:57:46Z)
An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research [0.0]
Traditional sea exploration faces significant challenges due to extreme conditions, limited visibility, and high costs.<n>This paper presents an innovative AI-powered Autonomous Underwater Vehicle (AUV) system designed to overcome these limitations.<n>The system integrates YOLOv12 Nano for real-time object detection, a Convolutional Neural Network (CNN) (ResNet50) for feature extraction, and K-Means++ clustering for grouping marine objects.
arXiv Detail & Related papers (2025-12-08T15:45:40Z)
MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection [12.838872442435527]
Small object detection in UAV imagery is crucial for applications such as search-and-rescue, traffic monitoring, and environmental surveillance.<n>Existing multi-scale fusion methods help, but add computational burden and blur fine details.<n>We propose a unified fusion framework that tightly couples global context with local detail to boost detection performance.
arXiv Detail & Related papers (2025-06-15T02:54:25Z)
VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field.<n>The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets.<n>We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE)<n>Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z)
ACMamba: Fast Unsupervised Anomaly Detection via An Asymmetrical Consensus State Space Model [51.83639270669481]
Unsupervised anomaly detection in hyperspectral images (HSI) aims to detect unknown targets from backgrounds.<n>HSI studies are hindered by steep computational costs due to the high-dimensional property of HSI and dense sampling-based training paradigm.<n>We propose an Asymmetrical Consensus State Space Model (ACMamba) to significantly reduce computational costs without compromising accuracy.
arXiv Detail & Related papers (2025-04-16T05:33:42Z)
MSCA-Net:Multi-Scale Context Aggregation Network for Infrared Small Target Detection [0.1759252234439348]
This paper proposes a network architecture named MSCA-Net, which integrates three key components.<n>MSEDA employs a multi-scale feature fusion attention mechanism to adaptively aggregate information across different scales.<n>PCBAM captures the correlation between global and local features through a correlation matrix-based strategy.<n> CAB enhances the representation of critical features by assigning greater weights to them, integrating both low-level and high-level information.
arXiv Detail & Related papers (2025-03-21T14:42:31Z)
PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection. We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN) PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning. CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images [193.77450545067967]
We propose an end-to-end Dense Attention Fluid Network (DAFNet) for salient object detection in optical remote sensing images (RSIs) A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships. We construct a new and challenging optical RSI dataset for SOD that contains 2,000 images with pixel-wise saliency annotations.
arXiv Detail & Related papers (2020-11-26T06:14:10Z)
Salient Object Detection Combining a Self-attention Module and a Feature Pyramid Network [10.81245352773775]
We propose a novel pyramid self-attention module (PSAM) and the adoption of an independent feature-complementing strategy. In PSAM, self-attention layers are equipped after multi-scale pyramid features to capture richer high-level features and bring larger receptive fields to the model.
arXiv Detail & Related papers (2020-04-30T03:08:34Z)
Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features. We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.