Related papers: High-Resolution Underwater Camouflaged Object Detection: GBU-UCOD Dataset and Topology-Aware and Frequency-Decoupled Networks

URL: http://arxiv.org/abs/2602.03591v1
Date: Tue, 03 Feb 2026 14:41:27 GMT
Title: High-Resolution Underwater Camouflaged Object Detection: GBU-UCOD Dataset and Topology-Aware and Frequency-Decoupled Networks
Authors: Wenji Wu, Shuo Ye, Yiyu Liu, Jiguang He, Zhuo Wang, Zitong Yu
Abstract summary: We propose a novel framework that integrates topology-aware modeling with frequency-decoupled perception. DeepTopo-Net achieves state-of-the-art performance, particularly in preserving morphological integrity of complex underwater patterns.
Score: 32.76569239634241
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Underwater Camouflaged Object Detection (UCOD) is a challenging task due to the extreme visual similarity between targets and backgrounds across varying marine depths. Existing methods often struggle with topological fragmentation of slender creatures in the deep sea and the subtle feature extraction of transparent organisms. In this paper, we propose DeepTopo-Net, a novel framework that integrates topology-aware modeling with frequency-decoupled perception. To address physical degradation, we design the Water-Conditioned Adaptive Perceptor (WCAP), which employs Riemannian metric tensors to dynamically deform convolutional sampling fields. Furthermore, the Abyssal-Topology Refinement Module (ATRM) is developed to maintain the structural connectivity of spindly targets through skeletal priors. Specifically, we first introduce GBU-UCOD, the first high-resolution (2K) benchmark tailored for marine vertical zonation, filling the data gap for hadal and abyssal zones. Extensive experiments on MAS3K, RMAS, and our proposed GBU-UCOD datasets demonstrate that DeepTopo-Net achieves state-of-the-art performance, particularly in preserving the morphological integrity of complex underwater patterns. The datasets and codes will be released at https://github.com/Wuwenji18/GBU-UCOD.

Related papers

SPMamba-YOLO: An Underwater Object Detection Network Based on Multi-Scale Feature Enhancement and Global Context Modeling [12.390389688362506]
We propose a novel underwater object detection network that integrates multi-scale feature enhancement with global context modeling. Experiments on the URPC2022 dataset demonstrate that the network outperforms the YOLOv8n baseline by more than 4.9% in mAP@0.5.
arXiv Detail & Related papers (2026-02-26T06:45:11Z)
Dynamic Topology Awareness: Breaking the Granularity Rigidity in Vision-Language Navigation [22.876516699004814]
Vision-Language Navigation in Continuous Environments (VLN-CE) presents a core challenge: grounding high-level linguistic instructions into precise, safe, and long-horizon spatial actions. Explicit topological maps have proven to be a vital solution for providing robust spatial memory in such tasks. Existing topological planning methods suffer from a "Granularity Rigidity" problem. We propose DGNav, a framework for Dynamic Topological Navigation, introducing a context-aware mechanism to modulate map density and connectivity on-the-fly.
arXiv Detail & Related papers (2026-01-29T14:06:23Z)
UDPNet: Unleashing Depth-based Priors for Robust Image Dehazing [77.10640210751981]
UDPNet is a general framework that leverages depth-based priors from a large-scale pretrained depth estimation model DepthAnything V2. Our proposed solution establishes a new benchmark for depth-aware dehazing across various scenarios.
arXiv Detail & Related papers (2026-01-11T13:29:02Z)
Expose Camouflage in the Water: Underwater Camouflaged Instance Segmentation and Dataset [76.92197418745822]
Camouflaged instance segmentation (CIS) faces greater challenges in accurately segmenting objects that blend closely with their surroundings. Traditional camouflaged instance segmentation methods, trained on terrestrial-dominated datasets with limited underwater samples, may exhibit inadequate performance in underwater scenes. We introduce the first underwater camouflaged instance segmentation dataset, UCIS4K, which comprises 3,953 images of camouflaged marine organisms with instance-level annotations.
arXiv Detail & Related papers (2025-10-20T14:34:51Z)
APGNet: Adaptive Prior-Guided for Underwater Camouflaged Object Detection [22.097955383220143]
We propose an Adaptive Prior-Guided Network (APGNet) to detect camouflaged objects in underwater environments. APGNet integrates a Siamese architecture with a novel prior-guided mechanism to enhance robustness and detection accuracy. Our proposed method APGNet outperforms 15 state-of-the-art methods under widely used evaluation metrics.
arXiv Detail & Related papers (2025-10-14T01:51:44Z)
SLENet: A Guidance-Enhanced Network for Underwater Camouflaged Object Detection [22.78768403870293]
We introduce the UCOD task and present DeepCamo, a benchmark dataset designed for this domain. We also propose Semantic localization and Enhancement Network (SLENet), a novel framework for UCOD. Experiments on our DeepCamo dataset and three benchmark COD datasets confirm SLENet's superior performance over SOTA methods.
arXiv Detail & Related papers (2025-09-04T00:44:32Z)
Underwater Monocular Metric Depth Estimation: Real-World Benchmarks and Synthetic Fine-Tuning with Vision Foundation Models [0.0]
We present a benchmark of zero-shot and fine-tuned monocular metric depth estimation models on real-world underwater datasets. Our results show that large-scale models trained on terrestrial data (real or synthetic) are effective in in-air settings, but perform poorly underwater. This study presents a detailed evaluation and visualization of monocular metric depth estimation in underwater scenes.
arXiv Detail & Related papers (2025-07-02T21:06:39Z)
VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field. The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets. We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE). Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z)
Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain. Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage. Compared with the currently existing models, our proposed method achieves competitive performance in three popular benchmark datasets.
arXiv Detail & Related papers (2023-08-17T11:30:46Z)
DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation. We propose to leverage the Transformer to model this global context with an effective attention mechanism. Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z)
Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images [193.77450545067967]
We propose an end-to-end Dense Attention Fluid Network (DAFNet) for salient object detection in optical remote sensing images (RSIs). A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships. We construct a new and challenging optical RSI dataset for SOD that contains 2,000 images with pixel-wise saliency annotations.
arXiv Detail & Related papers (2020-11-26T06:14:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.