RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation
based on Visual Foundation Model
- URL: http://arxiv.org/abs/2306.16269v2
- Date: Wed, 29 Nov 2023 12:47:59 GMT
- Title: RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation
based on Visual Foundation Model
- Authors: Keyan Chen, Chenyang Liu, Hao Chen, Haotian Zhang, Wenyuan Li,
Zhengxia Zou, and Zhenwei Shi
- Abstract summary: We propose a method to learn the generation of appropriate prompts for Segment Anything Model (SAM)
This enables SAM to produce semantically discernible segmentation results for remote sensing images.
We also propose several ongoing derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter.
- Score: 29.42043345787285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging the extensive training data from SA-1B, the Segment Anything Model
(SAM) demonstrates remarkable generalization and zero-shot capabilities.
However, as a category-agnostic instance segmentation method, SAM heavily
relies on prior manual guidance, including points, boxes, and coarse-grained
masks. Furthermore, its performance in remote sensing image segmentation tasks
remains largely unexplored and unproven. In this paper, we aim to develop an
automated instance segmentation approach for remote sensing images, based on
the foundational SAM model and incorporating semantic category information.
Drawing inspiration from prompt learning, we propose a method to learn the
generation of appropriate prompts for SAM. This enables SAM to produce
semantically discernible segmentation results for remote sensing images, a
concept we have termed RSPrompter. We also propose several ongoing derivatives
for instance segmentation tasks, drawing on recent advancements within the SAM
community, and compare their performance with RSPrompter. Extensive
experimental results, derived from the WHU building, NWPU VHR-10, and SSDD
datasets, validate the effectiveness of our proposed method. The code for our
method is publicly available at kychen.me/RSPrompter.
Related papers
- Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation [4.6570959687411975]
The Segment Anything Model (SAM) demonstrates exceptional generalization capabilities.
SAM's lack of pretraining on massive remote sensing images and its interactive structure limit its automatic mask prediction capabilities.
A Multi- cognitive SAM-Based Instance Model (MC-SAM SEG) is introduced to employ SAM on remote sensing domain.
The proposed method named MC-SAM SEG extracts high-quality features by fine-tuning the SAM-Mona encoder along with a feature aggregator.
arXiv Detail & Related papers (2024-08-16T07:23:22Z) - AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts.
We propose a novel framework, termed AlignSAM, designed for automatic prompting for aligning SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z) - RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for
Remote Sensing Image Semantic Segmentation [10.37240769959699]
Segment Anything Model (SAM) provides a universal pre-training model for image segmentation tasks.
We propose RSAM-Seg, which stands for Remote Sensing SAM with Semantic, as a tailored modification of SAM for the remote sensing field.
Adapter-Scale, a set of supplementary scaling modules, are proposed in the multi-head attention blocks of the encoder part of SAM.
Experiments are conducted on four distinct remote sensing scenarios, encompassing cloud detection, field monitoring, building detection and road mapping tasks.
arXiv Detail & Related papers (2024-02-29T09:55:46Z) - Learning to Prompt Segment Anything Models [55.805816693815835]
Segment Anything Models (SAMs) have demonstrated great potential in learning to segment anything.
SAMs work with two types of prompts including spatial prompts (e.g., points) and semantic prompts (e.g., texts)
We propose spatial-semantic prompt learning (SSPrompt) that learns effective semantic and spatial prompts for better SAMs.
arXiv Detail & Related papers (2024-01-09T16:24:25Z) - Boosting Segment Anything Model Towards Open-Vocabulary Learning [69.42565443181017]
Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model.
Despite SAM finding applications and adaptations in various domains, its primary limitation lies in the inability to grasp object semantics.
We present Sambor to seamlessly integrate SAM with the open-vocabulary object detector in an end-to-end framework.
arXiv Detail & Related papers (2023-12-06T17:19:00Z) - Self-guided Few-shot Semantic Segmentation for Remote Sensing Imagery
Based on Large Vision Models [14.292149307183967]
This research introduces a structured framework designed for the automation of few-shot semantic segmentation.
It utilizes the SAM model and facilitates a more efficient generation of semantically discernible segmentation outcomes.
Central to our methodology is a novel automatic prompt learning approach, leveraging prior guided masks to produce coarse pixel-wise prompts for SAM.
arXiv Detail & Related papers (2023-11-22T07:07:55Z) - Semantic-SAM: Segment and Recognize Anything at Any Granularity [83.64686655044765]
We introduce Semantic-SAM, a universal image segmentation model to enable segment and recognize anything at any desired granularity.
We consolidate multiple datasets across three granularities and introduce decoupled classification for objects and parts.
For the multi-granularity capability, we propose a multi-choice learning scheme during training, enabling each click to generate masks at multiple levels.
arXiv Detail & Related papers (2023-07-10T17:59:40Z) - RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation [53.4319652364256]
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation.
Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight Cross-RValModal.
We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z) - The Segment Anything Model (SAM) for Remote Sensing Applications: From
Zero to One Shot [6.500451285898152]
This study aims to advance the application of the Segment Anything Model (SAM) in remote sensing image analysis.
SAM is known for its exceptional generalization capabilities and zero-shot learning.
Despite the limitations encountered with lower spatial resolution images, SAM exhibits promising adaptability to remote sensing data analysis.
arXiv Detail & Related papers (2023-06-29T01:49:33Z) - Personalize Segment Anything Model with One Shot [52.54453744941516]
We propose a training-free Personalization approach for Segment Anything Model (SAM)
Given only a single image with a reference mask, PerSAM first localizes the target concept by a location prior.
PerSAM segments it within other images or videos via three techniques: target-guided attention, target-semantic prompting, and cascaded post-refinement.
arXiv Detail & Related papers (2023-05-04T17:59:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.