RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for
Remote Sensing Image Semantic Segmentation
- URL: http://arxiv.org/abs/2402.19004v1
- Date: Thu, 29 Feb 2024 09:55:46 GMT
- Title: RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for
Remote Sensing Image Semantic Segmentation
- Authors: Jie Zhang, Xubing Yang, Rui Jiang, Wei Shao and Li Zhang
- Abstract summary: Segment Anything Model (SAM) provides a universal pre-training model for image segmentation tasks.
We propose RSAM-Seg, which stands for Remote Sensing SAM with Semantic Segmentation, as a tailored modification of SAM for the remote sensing field.
Adapter-Scale, a set of supplementary scaling modules, is proposed in the multi-head attention blocks of the encoder part of SAM.
Experiments are conducted on four distinct remote sensing scenarios, encompassing cloud detection, field monitoring, building detection and road mapping tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The development of high-resolution remote sensing satellites has provided
great convenience for research work related to remote sensing. Segmentation and
extraction of specific targets are essential tasks when facing the vast and
complex remote sensing images. Recently, the introduction of Segment Anything
Model (SAM) provides a universal pre-training model for image segmentation
tasks. Because the direct application of SAM to remote sensing image
segmentation tasks does not yield satisfactory results, we propose RSAM-Seg,
which stands for Remote Sensing SAM with Semantic Segmentation, as a tailored
modification of SAM for the remote sensing field that eliminates the need for
manual intervention to provide prompts. Adapter-Scale, a set of supplementary
scaling modules, is proposed in the multi-head attention blocks of the encoder
part of SAM. Furthermore, Adapter-Feature modules are inserted between the Vision Transformer
(ViT) blocks. These modules aim to incorporate high-frequency image information
and image embedding features to generate image-informed prompts. Experiments
are conducted on four distinct remote sensing scenarios, encompassing cloud
detection, field monitoring, building detection and road mapping tasks. The
experimental results not only showcase the improvement over the original SAM
and U-Net across cloud, buildings, fields and roads scenarios, but also
highlight the capacity of RSAM-Seg to discern absent areas within the ground
truth of certain datasets, affirming its potential as an auxiliary annotation
method. In addition, the performance in few-shot scenarios is commendable,
underscoring its potential in dealing with limited datasets.
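The abstract describes two kinds of lightweight modules grafted onto SAM's encoder: Adapter-Scale inside the attention blocks and Adapter-Feature between ViT blocks. The paper's exact design is not given here, so the following is only a minimal numpy sketch of the general bottleneck-adapter pattern such modules typically follow (project down, apply a nonlinearity, project up, add back with a scaling factor); all names, shapes, and the scale value are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def adapter_scale(x, w_down, w_up, scale=0.5):
    """Bottleneck adapter sketch: down-project, ReLU, up-project,
    then add the scaled update back onto the input (residual)."""
    h = np.maximum(x @ w_down, 0.0)   # down-projection + ReLU
    return x + scale * (h @ w_up)     # scaled residual update

# Illustrative dimensions: 16 tokens, 64-d embeddings, 8-d bottleneck.
d_model, d_bottleneck, n_tokens = 64, 8, 16
x = rng.standard_normal((n_tokens, d_model))
w_down = rng.standard_normal((d_model, d_bottleneck)) * 0.02
w_up = rng.standard_normal((d_bottleneck, d_model)) * 0.02

y = adapter_scale(x, w_down, w_up)
print(y.shape)  # the adapter preserves the (tokens, embedding) shape
```

Because the update is residual, zero-initialized projections leave the frozen backbone's features untouched at the start of fine-tuning, which is the usual reason this pattern is chosen for parameter-efficient adaptation.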
Related papers
- IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection [55.554484379021524]
Infrared Small Target Detection (IRSTD) task falls short in achieving satisfying performance due to a notable domain gap between natural and infrared images.
We propose the IRSAM model for IRSTD, which improves SAM's encoder-decoder architecture to learn better feature representation of infrared small objects.
arXiv Detail & Related papers (2024-07-10T10:17:57Z) - MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method extracts richer marine information, from global contextual cues to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z) - RS-Mamba for Large Remote Sensing Image Dense Prediction [58.12667617617306]
We propose the Remote Sensing Mamba (RSM) for dense prediction tasks in large VHR remote sensing images.
RSM is specifically designed to capture the global context of remote sensing images with linear complexity.
Our model achieves better efficiency and accuracy than transformer-based models on large remote sensing images.
arXiv Detail & Related papers (2024-04-03T12:06:01Z) - Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks [47.646824158039664]
We introduce Grounded SAM, which combines Grounding DINO, an open-set object detector, with the Segment Anything Model (SAM).
This integration enables the detection and segmentation of any regions based on arbitrary text inputs.
arXiv Detail & Related papers (2024-01-25T13:12:09Z) - ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment
Anything to SAR Domain for Semantic Segmentation [6.229326337093342]
Segment Anything Model (SAM) excels in various segmentation scenarios relying on semantic information and generalization ability.
The ClassWiseSAM-Adapter (CWSAM) is designed to adapt the high-performing SAM for landcover classification on space-borne Synthetic Aperture Radar (SAR) images.
CWSAM showcases enhanced performance with fewer computing resources.
arXiv Detail & Related papers (2024-01-04T15:54:45Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - Adapting Segment Anything Model for Change Detection in HR Remote
Sensing Images [18.371087310792287]
This work aims to utilize the strong visual recognition capabilities of Vision Foundation Models (VFMs) to improve the change detection of high-resolution Remote Sensing Images (RSIs).
We employ the visual encoder of FastSAM, an efficient variant of the SAM, to extract visual representations in RS scenes.
To utilize the semantic representations that are inherent to SAM features, we introduce a task-agnostic semantic learning branch to model the semantic latent in bi-temporal RSIs.
The resulting method, SAMCD, obtains superior accuracy compared to the SOTA methods and exhibits a sample-efficient learning ability that is comparable to semi-supervised methods.
arXiv Detail & Related papers (2023-09-04T08:23:31Z) - The Segment Anything Model (SAM) for Remote Sensing Applications: From
Zero to One Shot [6.500451285898152]
This study aims to advance the application of the Segment Anything Model (SAM) in remote sensing image analysis.
SAM is known for its exceptional generalization capabilities and zero-shot learning.
Despite the limitations encountered with lower spatial resolution images, SAM exhibits promising adaptability to remote sensing data analysis.
arXiv Detail & Related papers (2023-06-29T01:49:33Z) - RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation
based on Visual Foundation Model [29.42043345787285]
We propose a method to learn the generation of appropriate prompts for the Segment Anything Model (SAM).
This enables SAM to produce semantically discernible segmentation results for remote sensing images.
We also propose several ongoing derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter.
arXiv Detail & Related papers (2023-06-28T14:51:34Z) - SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment
Anything Model [85.85899655118087]
We develop an efficient pipeline for generating a large-scale RS segmentation dataset, dubbed SAMRS.
SAMRS totally possesses 105,090 images and 1,668,241 instances, surpassing existing high-resolution RS segmentation datasets in size by several orders of magnitude.
arXiv Detail & Related papers (2023-05-03T10:58:07Z) - SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in
Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and
More [13.047310918166762]
We propose SAM-Adapter, which incorporates domain-specific information or visual prompts into the segmentation network by using simple yet effective adapters.
We can even outperform task-specific network models and achieve state-of-the-art performance in the task we tested: camouflaged object detection.
arXiv Detail & Related papers (2023-04-18T17:38:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.