PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images
- URL: http://arxiv.org/abs/2409.13401v2
- Date: Sun, 12 Jan 2025 15:10:26 GMT
- Title: PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images
- Authors: Nanqing Liu, Xun Xu, Yongyi Su, Haojie Zhang, Heng-Chao Li
- Abstract summary: We propose a novel Pointly-supervised Segment Anything Model named PointSAM.
We conduct experiments on RSI datasets, including WHU, HRSID, and NWPU VHR-10.
The results show that our method significantly outperforms direct testing with SAM, SAM2, and other comparison methods.
- Score: 16.662173255725463
- Abstract: Segment Anything Model (SAM) is an advanced foundational model for image segmentation, which is gradually being applied to remote sensing images (RSIs). Due to the domain gap between RSIs and natural images, traditional methods typically use SAM as a source pre-trained model and fine-tune it with fully supervised masks. Unlike these methods, our work focuses on fine-tuning SAM using more convenient and challenging point annotations. Leveraging SAM's zero-shot capabilities, we adopt a self-training framework that iteratively generates pseudo-labels for training. However, if the pseudo-labels contain noisy labels, there is a risk of error accumulation. To address this issue, we extract target prototypes from the target dataset and use the Hungarian algorithm to match them with prediction prototypes, preventing the model from learning in the wrong direction. Additionally, due to the complex backgrounds and dense distribution of objects in RSI, using point prompts may result in multiple objects being recognized as one. To solve this problem, we propose a negative prompt calibration method based on the non-overlapping nature of instance masks. In brief, we use the prompts of overlapping masks as corresponding negative signals, resulting in refined masks. Combining the above methods, we propose a novel Pointly-supervised Segment Anything Model named PointSAM. We conduct experiments on RSI datasets, including WHU, HRSID, and NWPU VHR-10, and the results show that our method significantly outperforms direct testing with SAM, SAM2, and other comparison methods. Furthermore, we introduce PointSAM as a point-to-box converter and achieve encouraging results, suggesting that this method can be extended to other point-supervised tasks. The code is available at https://github.com/Lans1ng/PointSAM.
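The abstract describes two concrete mechanisms: matching target prototypes to prediction prototypes with the Hungarian algorithm so that noisy pseudo-labels do not accumulate, and turning the prompts of overlapping masks into negative prompts for one another. Below is a minimal sketch of both ideas, assuming cosine-distance prototype matching and a simple pixel-overlap test; the function names, feature shapes, and threshold are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the two mechanisms described in the PointSAM abstract.
# Not the released code; shapes, names, and thresholds are assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_prototypes(target_protos: np.ndarray, pred_protos: np.ndarray):
    """One-to-one matching between target and prediction prototypes.

    target_protos: (N, D) prototypes extracted from the target dataset.
    pred_protos:   (M, D) prototypes computed from current predictions.
    Returns index pairs (i, j) minimising total cosine distance.
    """
    t = target_protos / np.linalg.norm(target_protos, axis=1, keepdims=True)
    p = pred_protos / np.linalg.norm(pred_protos, axis=1, keepdims=True)
    cost = 1.0 - t @ p.T                      # cosine-distance matrix (N, M)
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    return list(zip(rows, cols))


def calibrate_prompts(masks: list[np.ndarray],
                      points: list[tuple[int, int]],
                      overlap_thresh: int = 0):
    """Negative prompt calibration: since instance masks should not overlap,
    the point prompt of any mask that overlaps instance i is treated as a
    negative prompt for instance i.
    """
    negatives = [[] for _ in masks]
    for i, mi in enumerate(masks):
        for j, mj in enumerate(masks):
            if i != j and (mi & mj).sum() > overlap_thresh:
                negatives[i].append(points[j])  # j's prompt is negative for i
    return negatives
```

In this sketch, re-prompting SAM for instance i with its original positive point plus `negatives[i]` would produce the refined, non-overlapping mask the abstract refers to.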
Related papers
- SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement [40.37217744643069]
We propose a universal and efficient approach by adapting SAM to the mask refinement task.
Specifically, we introduce a multi-prompt excavation strategy to mine diverse input prompts for SAM.
We extend our method to SAMRefiner++ by introducing an additional IoU adaptation step to further boost the performance of the generic SAMRefiner on the target dataset.
arXiv Detail & Related papers (2025-02-10T18:33:15Z) - Auto-Prompting SAM for Weakly Supervised Landslide Extraction [17.515220489213743]
We propose a simple yet effective method that auto-prompts the Segment Anything Model (SAM).
Instead of depending on high-quality class activation maps (CAMs) for pseudo-labeling or fine-tuning SAM, our method directly yields fine-grained segmentation masks from SAM inference through prompt engineering.
Experimental results on high-resolution aerial and satellite datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2025-01-23T07:08:48Z) - Bridge the Points: Graph-based Few-shot Segment Anything Semantically [79.1519244940518]
Recent advancements in pre-training techniques have enhanced the capabilities of vision foundation models.
Recent studies extend SAM to few-shot semantic segmentation (FSS).
We propose a simple yet effective approach based on graph analysis.
arXiv Detail & Related papers (2024-10-09T15:02:28Z) - Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z) - AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts.
We propose a novel framework, termed AlignSAM, designed for automatic prompting that aligns SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z) - WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images [8.179859593451285]
We present WSI-SAM, enhancing Segment Anything Model (SAM) with precise object segmentation capabilities for histopathology images.
To fully exploit pretrained knowledge while minimizing training overhead, we keep SAM frozen, introducing only minimal extra parameters.
Our model outperforms SAM by 4.1 and 2.5 percentage points on the ductal carcinoma in situ (DCIS) segmentation task and the breast cancer metastasis segmentation task, respectively.
arXiv Detail & Related papers (2024-03-14T10:30:43Z) - Stable Segment Anything Model [79.9005670886038]
The Segment Anything Model (SAM) achieves remarkable promptable segmentation given high-quality prompts.
This paper presents the first comprehensive analysis on SAM's segmentation stability across a diverse spectrum of prompt qualities.
Our solution, termed Stable-SAM, offers several advantages: 1) it improves SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality.
arXiv Detail & Related papers (2023-11-27T12:51:42Z) - SAM-Deblur: Let Segment Anything Boost Image Deblurring [21.964258084389243]
We propose SAM-Deblur, a framework that integrates prior knowledge from the Segment Anything Model (SAM) into the deblurring task.
Experimental results on the RealBlur-J, ReLoBlur, and REDS datasets reveal that incorporating our methods improves GoPro-trained NAFNet's PSNR by 0.05, 0.96, and 7.03, respectively.
arXiv Detail & Related papers (2023-09-05T14:33:56Z) - RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model [29.42043345787285]
We propose a method to learn the generation of appropriate prompts for the Segment Anything Model (SAM).
This enables SAM to produce semantically discernible segmentation results for remote sensing images.
We also propose several ongoing derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter.
arXiv Detail & Related papers (2023-06-28T14:51:34Z) - Personalize Segment Anything Model with One Shot [52.54453744941516]
We propose a training-free personalization approach for the Segment Anything Model (SAM), termed PerSAM.
Given only a single image with a reference mask, PerSAM first localizes the target concept by a location prior.
PerSAM segments it within other images or videos via three techniques: target-guided attention, target-semantic prompting, and cascaded post-refinement.
arXiv Detail & Related papers (2023-05-04T17:59:36Z) - Semi-Supervised Domain Adaptation with Prototypical Alignment and Consistency Learning [86.6929930921905]
This paper studies how much a few labeled target samples can further help address domain shifts.
To explore the full potential of landmarks, we incorporate a prototypical alignment (PA) module which calculates a target prototype for each class from the landmarks.
Specifically, we severely perturb the labeled images, making PA non-trivial to achieve and thus promoting model generalizability.
arXiv Detail & Related papers (2021-04-19T08:46:08Z)