RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation
based on Visual Foundation Model
- URL: http://arxiv.org/abs/2306.16269v2
- Date: Wed, 29 Nov 2023 12:47:59 GMT
- Title: RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation
based on Visual Foundation Model
- Authors: Keyan Chen, Chenyang Liu, Hao Chen, Haotian Zhang, Wenyuan Li,
Zhengxia Zou, and Zhenwei Shi
- Abstract summary: We propose a method to learn the generation of appropriate prompts for the Segment Anything Model (SAM).
This enables SAM to produce semantically discernible segmentation results for remote sensing images.
We also propose several ongoing derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter.
- Score: 29.42043345787285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging the extensive training data from SA-1B, the Segment Anything Model
(SAM) demonstrates remarkable generalization and zero-shot capabilities.
However, as a category-agnostic instance segmentation method, SAM heavily
relies on prior manual guidance, including points, boxes, and coarse-grained
masks. Furthermore, its performance in remote sensing image segmentation tasks
remains largely unexplored and unproven. In this paper, we aim to develop an
automated instance segmentation approach for remote sensing images, based on
the foundational SAM model and incorporating semantic category information.
Drawing inspiration from prompt learning, we propose a method to learn the
generation of appropriate prompts for SAM. This enables SAM to produce
semantically discernible segmentation results for remote sensing images, a
concept we have termed RSPrompter. We also propose several ongoing derivatives
for instance segmentation tasks, drawing on recent advancements within the SAM
community, and compare their performance with RSPrompter. Extensive
experimental results, derived from the WHU building, NWPU VHR-10, and SSDD
datasets, validate the effectiveness of our proposed method. The code for our
method is publicly available at kychen.me/RSPrompter.
Related papers
- AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts.
We propose a novel framework, termed AlignSAM, designed for automatic prompting for aligning SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z)
- PosSAM: Panoptic Open-vocabulary Segment Anything [58.72494640363136]
PosSAM is an open-vocabulary panoptic segmentation model that unifies the strengths of the Segment Anything Model (SAM) with the vision-language CLIP model in an end-to-end framework.
We introduce a Mask-Aware Selective Ensembling (MASE) algorithm that adaptively enhances the quality of generated masks and boosts the performance of open-vocabulary classification during inference for each image.
arXiv Detail & Related papers (2024-03-14T17:55:03Z)
- RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for
Remote Sensing Image Semantic Segmentation [10.37240769959699]
Segment Anything Model (SAM) provides a universal pre-training model for image segmentation tasks.
We propose RSAM-Seg, which stands for Remote Sensing SAM with Semantic, as a tailored modification of SAM for the remote sensing field.
Adapter-Scale, a set of supplementary scaling modules, is proposed for the multi-head attention blocks of SAM's encoder.
Experiments are conducted on four distinct remote sensing scenarios, encompassing cloud detection, field monitoring, building detection and road mapping tasks.
arXiv Detail & Related papers (2024-02-29T09:55:46Z)
- Learning to Prompt Segment Anything Models [55.805816693815835]
Segment Anything Models (SAMs) have demonstrated great potential in learning to segment anything.
SAMs work with two types of prompts: spatial prompts (e.g., points) and semantic prompts (e.g., texts).
We propose spatial-semantic prompt learning (SSPrompt) that learns effective semantic and spatial prompts for better SAMs.
arXiv Detail & Related papers (2024-01-09T16:24:25Z)
- Boosting Segment Anything Model Towards Open-Vocabulary Learning [69.42565443181017]
Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model.
Despite SAM finding applications and adaptations in various domains, its primary limitation lies in the inability to grasp object semantics.
We present Sambor to seamlessly integrate SAM with the open-vocabulary object detector in an end-to-end framework.
arXiv Detail & Related papers (2023-12-06T17:19:00Z)
- Self-guided Few-shot Semantic Segmentation for Remote Sensing Imagery
Based on Large Vision Models [14.292149307183967]
This research introduces a structured framework designed for the automation of few-shot semantic segmentation.
It utilizes the SAM model and facilitates a more efficient generation of semantically discernible segmentation outcomes.
Central to our methodology is a novel automatic prompt learning approach, leveraging prior guided masks to produce coarse pixel-wise prompts for SAM.
arXiv Detail & Related papers (2023-11-22T07:07:55Z)
- Semantic-SAM: Segment and Recognize Anything at Any Granularity [83.64686655044765]
We introduce Semantic-SAM, a universal image segmentation model that can segment and recognize anything at any desired granularity.
We consolidate multiple datasets across three granularities and introduce decoupled classification for objects and parts.
For the multi-granularity capability, we propose a multi-choice learning scheme during training, enabling each click to generate masks at multiple levels.
arXiv Detail & Related papers (2023-07-10T17:59:40Z)
- The Segment Anything Model (SAM) for Remote Sensing Applications: From
Zero to One Shot [6.500451285898152]
This study aims to advance the application of the Segment Anything Model (SAM) in remote sensing image analysis.
SAM is known for its exceptional generalization capabilities and zero-shot learning.
Despite the limitations encountered with lower spatial resolution images, SAM exhibits promising adaptability to remote sensing data analysis.
arXiv Detail & Related papers (2023-06-29T01:49:33Z)
- Personalize Segment Anything Model with One Shot [52.54453744941516]
We propose a training-free personalization approach for the Segment Anything Model (SAM), termed PerSAM.
Given only a single image with a reference mask, PerSAM first localizes the target concept by a location prior.
PerSAM segments it within other images or videos via three techniques: target-guided attention, target-semantic prompting, and cascaded post-refinement.
arXiv Detail & Related papers (2023-05-04T17:59:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.