Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation
- URL: http://arxiv.org/abs/2408.08576v1
- Date: Fri, 16 Aug 2024 07:23:22 GMT
- Title: Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation
- Authors: Linghao Zheng, Xinyang Pu, Feng Xu
- Abstract summary: The Segment Anything Model (SAM) demonstrates exceptional generalization capabilities.
SAM's lack of pretraining on massive remote sensing images and its interactive structure limit its automatic mask prediction capabilities.
A Multi-Cognitive SAM-Based Instance Segmentation Model (MC-SAM SEG) is introduced to apply SAM to the remote sensing domain.
The proposed method named MC-SAM SEG extracts high-quality features by fine-tuning the SAM-Mona encoder along with a feature aggregator.
- Score: 4.6570959687411975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Segment Anything Model (SAM), a foundational model designed for promptable segmentation tasks, demonstrates exceptional generalization capabilities, making it highly promising for natural scene image segmentation. However, SAM's lack of pretraining on massive remote sensing images and its interactive structure limit its automatic mask prediction capabilities. In this paper, a Multi-Cognitive SAM-Based Instance Segmentation Model (MC-SAM SEG) is introduced to apply SAM to the remote sensing domain. A SAM-Mona encoder incorporating the Multi-cognitive Visual Adapter (Mona) is employed to facilitate SAM's transfer learning in remote sensing applications. The proposed method, MC-SAM SEG, extracts high-quality features by fine-tuning the SAM-Mona encoder along with a feature aggregator. Subsequently, a pixel decoder and a transformer decoder are designed for prompt-free mask generation and instance classification. Comprehensive experiments are conducted on the HRSID and WHU datasets for instance segmentation on Synthetic Aperture Radar (SAR) images and optical remote sensing images, respectively. The evaluation results indicate that the proposed method surpasses other deep learning algorithms and verify its effectiveness and generalization.
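The pipeline described in the abstract (a SAM image encoder augmented with Mona adapters, a feature aggregator, and pixel/transformer decoders for prompt-free prediction) follows a general adapter-tuning pattern. The PyTorch sketch below illustrates only that pattern; the MonaAdapter internals, class names, and dimensions are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch: inserting trainable Mona-style adapters into a frozen
# SAM-like ViT encoder. The adapter design below is an assumption used
# for illustration, not the paper's released code.
import torch
import torch.nn as nn


class MonaAdapter(nn.Module):
    """Bottleneck adapter with multi-scale depthwise convolutions (assumed design)."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.down = nn.Linear(dim, bottleneck)
        # "Multi-cognitive" branches: depthwise convs with different receptive fields.
        self.branches = nn.ModuleList(
            [nn.Conv2d(bottleneck, bottleneck, k, padding=k // 2, groups=bottleneck)
             for k in (3, 5, 7)]
        )
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, H, W, C), the token layout used inside SAM's ViT blocks.
        h = self.down(self.norm(x))
        h = h.permute(0, 3, 1, 2)                      # (B, C', H, W) for convs
        h = sum(branch(h) for branch in self.branches) / len(self.branches)
        h = h.permute(0, 2, 3, 1)                      # back to (B, H, W, C')
        return x + self.up(torch.relu(h))              # residual connection


class AdaptedBlock(nn.Module):
    """Wraps one frozen encoder block and appends a trainable adapter."""

    def __init__(self, block: nn.Module, dim: int):
        super().__init__()
        self.block = block
        for p in self.block.parameters():
            p.requires_grad = False                    # keep pretrained SAM weights frozen
        self.adapter = MonaAdapter(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.block(x))


if __name__ == "__main__":
    # Stand-in for one SAM ViT block (the real blocks come from a SAM checkpoint).
    dim = 768
    dummy_block = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim))
    adapted = AdaptedBlock(dummy_block, dim)
    features = adapted(torch.randn(1, 64, 64, dim))    # (B, H, W, C) patch tokens
    print(features.shape)                              # torch.Size([1, 64, 64, 768])
```

In MC-SAM SEG, features from such an adapted encoder would then pass through the feature aggregator and the pixel and transformer decoders to produce masks and instance classes without any point or box prompts, per the abstract above.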
Related papers
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
- Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection [58.241593208031816]
Segment Anything Model (SAM) has been proposed as a visual foundation model with strong segmentation and generalization capabilities.
We propose a Multi-scale and Detail-enhanced SAM (MDSAM) for Salient Object Detection (SOD).
Experimental results demonstrate the superior performance of our model on multiple SOD datasets.
arXiv Detail & Related papers (2024-08-08T09:09:37Z)
- MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method enables the extraction of richer marine information, from global contextual cues to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z)
- SAMCT: Segment Any CT Allowing Labor-Free Task-Indicator Prompts [28.171383990186904]
We construct a large CT dataset consisting of 1.1M CT images and 5M masks from public datasets.
We propose a powerful foundation model SAMCT allowing labor-free prompts.
Based on SAM, SAMCT is further equipped with a CNN image encoder, a cross-branch interaction module, and a task-indicator prompt encoder.
arXiv Detail & Related papers (2024-03-20T02:39:15Z)
- RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation [10.37240769959699]
Segment Anything Model (SAM) provides a universal pre-training model for image segmentation tasks.
We propose RSAM-Seg, which stands for Remote Sensing SAM with Semantic, as a tailored modification of SAM for the remote sensing field.
Adapter-Scale, a set of supplementary scaling modules, is proposed within the multi-head attention blocks of SAM's encoder (the freeze-and-adapt recipe common to these adapter methods is sketched after this list).
Experiments are conducted on four distinct remote sensing scenarios, encompassing cloud detection, field monitoring, building detection and road mapping tasks.
arXiv Detail & Related papers (2024-02-29T09:55:46Z)
- VRP-SAM: SAM with Visual Reference Prompt [73.05676082695459]
We propose a novel Visual Reference Prompt (VRP) encoder that empowers the Segment Anything Model (SAM) to utilize annotated reference images as prompts for segmentation.
In essence, VRP-SAM can utilize annotated reference images to comprehend specific objects and segment them in the target image.
arXiv Detail & Related papers (2024-02-27T17:58:09Z)
- ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation [6.229326337093342]
Segment Anything Model (SAM) excels in various segmentation scenarios relying on semantic information and generalization ability.
The ClassWise-SAM-Adapter (CWSAM) is designed to adapt the high-performing SAM for land cover classification on space-borne Synthetic Aperture Radar (SAR) images.
CWSAM showcases enhanced performance with fewer computing resources.
arXiv Detail & Related papers (2024-01-04T15:54:45Z)
- RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation [53.4319652364256]
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation.
Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight cross-modal module.
We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z)
- The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot [6.500451285898152]
This study aims to advance the application of the Segment Anything Model (SAM) in remote sensing image analysis.
SAM is known for its exceptional generalization capabilities and zero-shot learning.
Despite the limitations encountered with lower spatial resolution images, SAM exhibits promising adaptability to remote sensing data analysis.
arXiv Detail & Related papers (2023-06-29T01:49:33Z)
- RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model [29.42043345787285]
We propose a method to learn the generation of appropriate prompts for the Segment Anything Model (SAM).
This enables SAM to produce semantically discernible segmentation results for remote sensing images.
We also propose several ongoing derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter.
arXiv Detail & Related papers (2023-06-28T14:51:34Z)
- AutoSAM: Adapting SAM to Medical Images by Overloading the Prompt Encoder [101.28268762305916]
In this work, we replace SAM's prompt encoder with an encoder that operates on the same input image.
We obtain state-of-the-art results on multiple medical image and video benchmarks.
To inspect the knowledge within it and to provide a lightweight segmentation solution, we also learn to decode it into a mask using a shallow deconvolution network.
arXiv Detail & Related papers (2023-06-10T07:27:00Z)
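Several of the related papers above (e.g., RSAM-Seg's Adapter-Scale modules, CWSAM, and RefSAM's parameter-efficient tuning strategy) share the same training recipe: the pretrained SAM weights stay frozen and only small inserted modules are updated. The generic sketch below illustrates that recipe under stated assumptions; the "adapter" naming convention and helper names are placeholders, not any paper's released code.

```python
# Generic parameter-efficient fine-tuning setup: freeze the backbone and
# train only the inserted adapter modules. Names are illustrative.
import torch
import torch.nn as nn


def mark_adapters_trainable(model: nn.Module, keyword: str = "adapter") -> None:
    """Freeze every parameter except those whose name contains `keyword`."""
    for name, param in model.named_parameters():
        param.requires_grad = keyword in name
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable: {trainable:,} / {total:,} ({100.0 * trainable / total:.2f}%)")


def build_optimizer(model: nn.Module, lr: float = 1e-4) -> torch.optim.Optimizer:
    """Hand only the trainable (adapter) parameters to the optimizer."""
    params = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.AdamW(params, lr=lr, weight_decay=0.01)
```

Under this setup, gradient and optimizer-state memory scale with the adapter parameters rather than with the full SAM backbone, which is why these adapter-based approaches report reduced computing resource requirements.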