Taming SAM for Underwater Instance Segmentation and Beyond
- URL: http://arxiv.org/abs/2505.15581v2
- Date: Mon, 04 Aug 2025 03:39:01 GMT
- Title: Taming SAM for Underwater Instance Segmentation and Beyond
- Authors: Hua Li, Shijie Lian, Zhiyuan Li, Runmin Cong, Chongyi Li,
- Abstract summary: We propose a large-scale underwater instance segmentation dataset, UIIS10K, which includes 10,048 images with pixel-level annotations for 10 categories. We then introduce UWSAM, an efficient model designed for automatic and accurate segmentation of underwater instances. We show that our model is effective, achieving significant performance improvements over state-of-the-art methods on multiple underwater instance datasets.
- Score: 40.5289139779741
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: With recent breakthroughs in large-scale modeling, the Segment Anything Model (SAM) has demonstrated significant potential in a variety of visual applications. However, due to the lack of underwater domain expertise, SAM and its variants face performance limitations in end-to-end underwater instance segmentation tasks, while their higher computational requirements further hinder their application in underwater scenarios. To address this challenge, we propose a large-scale underwater instance segmentation dataset, UIIS10K, which includes 10,048 images with pixel-level annotations for 10 categories. Then, we introduce UWSAM, an efficient model designed for automatic and accurate segmentation of underwater instances. UWSAM efficiently distills knowledge from the SAM ViT-Huge image encoder into the smaller ViT-Small image encoder via the Mask GAT-based Underwater Knowledge Distillation (MG-UKD) method for effective visual representation learning. Furthermore, we design an End-to-end Underwater Prompt Generator (EUPG) for UWSAM, which automatically generates underwater prompts instead of explicitly providing foreground points or boxes as prompts, thus enabling the network to locate underwater instances accurately for efficient segmentation. Comprehensive experimental results show that our model is effective, achieving significant performance improvements over state-of-the-art methods on multiple underwater instance datasets. Datasets and codes are available at https://github.com/LiamLian0727/UIIS10K.
Related papers
- UIS-Mamba: Exploring Mamba for Underwater Instance Segmentation via Dynamic Tree Scan and Hidden State Weaken [57.812799861886305]
Mamba is an emerging state space model with inherently linear complexity and global receptive fields. We propose the first Mamba-based underwater instance segmentation model, UIS-Mamba, and design two innovative modules, Dynamic Tree Scan (DTS) and Hidden State Weaken (HSW). The DTS module maintains the continuity of the internal features of the instance objects by allowing the patches to dynamically offset and scale. The HSW module suppresses the interference of complex backgrounds and effectively focuses the information flow of state propagation on the instances themselves.
arXiv Detail & Related papers (2025-08-01T08:21:24Z)
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
- Evaluation of Segment Anything Model 2: The Role of SAM2 in the Underwater Environment [2.0554501265326794]
The Segment Anything Model (SAM) and its extensions have been applied to various underwater visualization tasks in marine science.
Recently, Meta has developed the Segment Anything Model 2 (SAM2), which significantly improves running speed and segmentation accuracy.
This report aims to explore the potential of SAM2 in marine science by evaluating it on the underwater instance segmentation benchmark datasets UIIS and USIS10K.
arXiv Detail & Related papers (2024-08-06T03:20:10Z)
- Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset [60.14089302022989]
Underwater vision tasks often suffer from low segmentation accuracy due to the complex underwater circumstances.
We construct the first large-scale underwater salient instance segmentation dataset (USIS10K).
We propose an Underwater Salient Instance Segmentation architecture based on the Segment Anything Model (USIS-SAM), designed specifically for the underwater domain.
arXiv Detail & Related papers (2024-06-10T06:17:33Z)
- MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method extracts richer marine information, from global contextual cues to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z)
- Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM [62.85895749882285]
Marine Animal Segmentation (MAS) involves segmenting animals within marine environments.
We propose a novel feature learning framework, named Dual-SAM, for high-performance MAS.
Our proposed method achieves state-of-the-art performance on five widely used MAS datasets.
arXiv Detail & Related papers (2024-04-07T15:34:40Z)
- AquaSAM: Underwater Image Foreground Segmentation [1.7482936568887284]
This work presents AquaSAM, the first attempt to extend the success of SAM to underwater images.
We develop a straightforward fine-tuning method to adapt SAM to general foreground underwater image segmentation.
We demonstrate that AquaSAM outperforms the default SAM model, especially on hard tasks such as coral reefs.
arXiv Detail & Related papers (2023-08-08T12:30:36Z)
- SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater Robots [16.242924916178282]
This paper presents a holistic approach to saliency-guided visual attention modeling (SVAM) for use by autonomous underwater robots.
Our proposed model, named SVAM-Net, integrates deep visual features at various scales and semantics for effective salient object detection (SOD) in natural underwater images.
arXiv Detail & Related papers (2020-11-12T08:17:21Z)