SLENet: A Guidance-Enhanced Network for Underwater Camouflaged Object Detection
- URL: http://arxiv.org/abs/2509.03786v2
- Date: Fri, 05 Sep 2025 01:54:11 GMT
- Title: SLENet: A Guidance-Enhanced Network for Underwater Camouflaged Object Detection
- Authors: Xinxin Huang, Han Sun, Ningzhong Liu, Huiyu Zhou, Yinan Yao,
- Abstract summary: We introduce the UCOD task and present DeepCamo, a benchmark dataset designed for this domain.<n>We also propose Semantic localization and Enhancement Network (SLENet), a novel framework for UCOD.<n> Experiments on our DeepCamo dataset and three benchmark COD datasets confirm SLENet's superior performance over SOTA methods.
- Score: 22.78768403870293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Underwater Camouflaged Object Detection (UCOD) aims to identify objects that blend seamlessly into underwater environments. This task is critically important to marine ecology. However, it remains largely underexplored and accurate identification is severely hindered by optical distortions, water turbidity, and the complex traits of marine organisms. To address these challenges, we introduce the UCOD task and present DeepCamo, a benchmark dataset designed for this domain. We also propose Semantic Localization and Enhancement Network (SLENet), a novel framework for UCOD. We first benchmark state-of-the-art COD models on DeepCamo to reveal key issues, upon which SLENet is built. In particular, we incorporate Gamma-Asymmetric Enhancement (GAE) module and a Localization Guidance Branch (LGB) to enhance multi-scale feature representation while generating a location map enriched with global semantic information. This map guides the Multi-Scale Supervised Decoder (MSSD) to produce more accurate predictions. Experiments on our DeepCamo dataset and three benchmark COD datasets confirm SLENet's superior performance over SOTA methods, and underscore its high generality for the broader COD task.
Related papers
- High-Resolution Underwater Camouflaged Object Detection: GBU-UCOD Dataset and Topology-Aware and Frequency-Decoupled Networks [32.76569239634241]
We propose a novel framework that integrates topology-aware modeling with frequency-decoupled perception.<n>DeepTopo-Net achieves state-of-the-art performance, particularly in preserving morphological integrity of complex underwater patterns.
arXiv Detail & Related papers (2026-02-03T14:41:27Z) - DGA-Net: Enhancing SAM with Depth Prompting and Graph-Anchor Guidance for Camouflaged Object Detection [28.171222957310956]
We present DGA-Net, a framework that adapts the Segment Anything Model (SAM) via a novel depth prompting" paradigm.<n>Specifically, we propose a Cross-modal Graph Enhancement (CGE) module that synthesizes RGB semantics and geometric depth within a heterogeneous graph.<n>We also design an Anchor-Guided Refinement (AGR) module to counteract the inherent information decay in feature hierarchies.
arXiv Detail & Related papers (2026-01-06T09:04:23Z) - FOM-Nav: Frontier-Object Maps for Object Goal Navigation [65.76906445210112]
FOM-Nav is a framework that enhances exploration efficiency through Frontier-Object Maps and vision-language models.<n>To train FOM-Nav, we automatically construct large-scale navigation datasets from real-world scanned environments.<n> FOM-Nav achieves state-of-the-art performance on the MP3D and HM3D benchmarks, particularly in navigation efficiency metric SPL.
arXiv Detail & Related papers (2025-11-30T18:16:09Z) - Expose Camouflage in the Water: Underwater Camouflaged Instance Segmentation and Dataset [76.92197418745822]
camouflaged instance segmentation (CIS) faces greater challenges in accurately segmenting objects that blend closely with their surroundings.<n>Traditional camouflaged instance segmentation methods, trained on terrestrial-dominated datasets with limited underwater samples, may exhibit inadequate performance in underwater scenes.<n>We introduce the first underwater camouflaged instance segmentation dataset, UCIS4K, which comprises 3,953 images of camouflaged marine organisms with instance-level annotations.
arXiv Detail & Related papers (2025-10-20T14:34:51Z) - VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field.<n>The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets.<n>We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE)<n>Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z) - MID: A Comprehensive Shore-Based Dataset for Multi-Scale Dense Ship Occlusion and Interaction Scenarios [10.748210940033484]
The Maritime Ship Navigation Behavior dataset (MID) is designed to address challenges in ship detection within complex maritime environments.<n>MID contains 5,673 images with 135,884 finely annotated target instances, supporting both supervised and semi-supervised learning.<n>MID's images are sourced from high-definition video clips of real-world navigation across 43 water areas, with varied weather and lighting conditions.
arXiv Detail & Related papers (2024-12-08T09:34:23Z) - PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network [24.54269823691119]
We present an advanced study on more challenging high-resolution salient object detection (HRSOD) from both dataset and network framework perspectives.
To compensate for the lack of HRSOD dataset, we thoughtfully collect a large-scale high resolution salient object detection dataset, called UHRSD.
All the images are finely annotated in pixel-level, far exceeding previous low-resolution SOD datasets.
arXiv Detail & Related papers (2024-08-02T09:31:21Z) - Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain.<n>Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage.<n>Compared with the currently existing models, our proposed method achieves competitive performance in three popular benchmark datasets.
arXiv Detail & Related papers (2023-08-17T11:30:46Z) - DepthFormer: Exploiting Long-Range Correlation and Local Information for
Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation.
We propose to leverage the Transformer to model this global context with an effective attention mechanism.
Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z) - Fast Camouflaged Object Detection via Edge-based Reversible
Re-calibration Network [17.538512222905087]
This paper proposes a novel edge-based reversible re-calibration network called ERRNet.
Our model is characterized by two innovative designs, namely Selective Edge Aggregation (SEA) and Reversible Re-calibration Unit (RRU)
Experimental results show that ERRNet outperforms existing cutting-edge baselines on three COD datasets and five medical image segmentation datasets.
arXiv Detail & Related papers (2021-11-05T02:03:54Z) - Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed em Progressive Coordinate Transforms (PCT) to facilitate learning coordinate representations.
In this paper, we propose a novel and lightweight approach, dubbed em Progressive Coordinate Transforms (PCT) to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z) - Dense Attention Fluid Network for Salient Object Detection in Optical
Remote Sensing Images [193.77450545067967]
We propose an end-to-end Dense Attention Fluid Network (DAFNet) for salient object detection in optical remote sensing images (RSIs)
A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships.
We construct a new and challenging optical RSI dataset for SOD that contains 2,000 images with pixel-wise saliency annotations.
arXiv Detail & Related papers (2020-11-26T06:14:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.