Related papers: Empowering DINO Representations for Underwater Instance Segmentation via Aligner and Prompter

URL: http://arxiv.org/abs/2511.08334v1
Date: Wed, 12 Nov 2025 01:53:57 GMT
Title: Empowering DINO Representations for Underwater Instance Segmentation via Aligner and Prompter
Authors: Zhiyang Chen, Chen Zhang, Hao Fang, Runmin Cong
Abstract summary: Underwater instance segmentation (UIS) is a pivotal technology in marine resource exploration and ecological protection. We introduce DiveSeg, a novel framework built upon two insightful components. DiveSeg achieves state-of-the-art performance on the popular UIIS and USIS10K datasets.
Score: 32.30901888033798
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Underwater instance segmentation (UIS), integrating pixel-level understanding and instance-level discrimination, is a pivotal technology in marine resource exploration and ecological protection. In recent years, large-scale pretrained visual foundation models, exemplified by DINO, have advanced rapidly and demonstrated remarkable performance on complex downstream tasks. In this paper, we demonstrate that DINO can serve as an effective feature learner for UIS, and we introduce DiveSeg, a novel framework built upon two insightful components: (1) The AquaStyle Aligner, designed to embed underwater color style features into the DINO fine-tuning process, facilitating better adaptation to the underwater domain. (2) The ObjectPrior Prompter, which incorporates binary segmentation-based prompts to deliver object-level priors, provides essential guidance for the instance segmentation task, which requires both object- and instance-level reasoning. We conduct thorough experiments on the popular UIIS and USIS10K datasets, and the results show that DiveSeg achieves state-of-the-art performance. Code: https://github.com/ettof/Diveseg.

Related papers

Exploring the Underwater World Segmentation without Extra Training [55.291219073365546]
We introduce AquaOV255, the first large-scale and fine-grained underwater segmentation dataset. We also present Earth2Ocean, a training-free OV segmentation framework.
arXiv Detail & Related papers (2025-11-11T07:22:56Z)
MARIS: Marine Open-Vocabulary Instance Segmentation with Geometric Enhancement and Semantic Alignment [56.88334234553316]
We introduce MARIS (Marine Open-Vocabulary Instance Segmentation), the first large-scale fine-grained benchmark for underwater Open-Vocabulary (OV) segmentation. Our framework consistently outperforms existing OV baselines in both in-domain and cross-domain settings.
arXiv Detail & Related papers (2025-10-17T07:50:58Z)
SparseUWSeg: Active Sparse Point-Label Augmentation for Underwater Semantic Segmentation [5.595626117136082]
We present SparseUWSeg, a novel framework for semantic segmentation. SparseUWSeg employs an active sampling strategy to guide annotators, maximizing the value of their point labels. Experiments on two diverse underwater datasets demonstrate the benefits of SparseUWSeg over state-of-the-art approaches.
arXiv Detail & Related papers (2025-10-11T10:56:48Z)
Advancing Marine Research: UWSAM Framework and UIIS10K Dataset for Precise Underwater Instance Segmentation [110.02397462607449]
We propose a large-scale underwater instance segmentation dataset, UIIS10K, which includes 10,048 images with pixel-level annotations for 10 categories. We then introduce UWSAM, an efficient model designed for automatic and accurate segmentation of underwater instances. We show that our model is effective, achieving significant performance improvements over state-of-the-art methods on multiple underwater instance datasets.
arXiv Detail & Related papers (2025-05-21T14:36:01Z)
FSSUWNet: Mitigating the Fragility of Pre-trained Models with Feature Enhancement for Few-Shot Semantic Segmentation in Underwater Images [4.981558556611925]
Few-Shot Semantic Segmentation (FSS) has recently progressed in data-scarce domains. We show that existing FSS methods often struggle to generalize to underwater environments. We propose FSSUWNet, a tailored FSS framework for underwater images with feature enhancement.
arXiv Detail & Related papers (2025-04-01T07:09:15Z)
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset [60.14089302022989]
Underwater vision tasks often suffer from low segmentation accuracy due to the complex underwater circumstances. We construct the first large-scale underwater salient instance segmentation dataset (USIS10K). We propose an Underwater Salient Instance architecture based on the Segment Anything Model (USIS-SAM) specifically for the underwater domain.
arXiv Detail & Related papers (2024-06-10T06:17:33Z)
A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement [18.936260846385444]
EnYOLO is an integrated real-time framework designed for simultaneous underwater image enhancement (UIE) and object detection (UOD). Our framework not only achieves state-of-the-art (SOTA) performance in both UIE and UOD tasks, but also shows superior adaptability when applied to different underwater scenarios.
arXiv Detail & Related papers (2024-03-28T01:00:08Z)
PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators [120.06891448820447]
How to obtain clear and visually pleasant images has become a common concern of people. The task of underwater image enhancement (UIE) has also emerged as the times require. In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN. Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-06-15T07:41:12Z)
Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues. No human annotations are involved in our framework during the whole training process. Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.