Boosting Few-Shot Semantic Segmentation Via Segment Anything Model
- URL: http://arxiv.org/abs/2401.09826v2
- Date: Sat, 20 Jan 2024 07:56:19 GMT
- Title: Boosting Few-Shot Semantic Segmentation Via Segment Anything Model
- Authors: Chen-Bin Feng, Qi Lai, Kangdao Liu, Houcheng Su, Chi-Man Vong
- Abstract summary: In semantic segmentation, accurate prediction masks are crucial for downstream tasks such as medical image analysis and image editing.
Due to the lack of annotated data, few-shot semantic segmentation (FSS) performs poorly in predicting masks with precise contours.
We propose FSS-SAM to boost FSS methods by addressing the issue of inaccurate contours.
- Score: 8.773067974503123
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In semantic segmentation, accurate prediction masks are crucial for
downstream tasks such as medical image analysis and image editing. Due to the
lack of annotated data, few-shot semantic segmentation (FSS) performs poorly in
predicting masks with precise contours. Recently, we have noticed that the
large foundation model, the Segment Anything Model (SAM), performs well in processing
detailed features. Inspired by SAM, we propose FSS-SAM to boost FSS methods by
addressing the issue of inaccurate contours. FSS-SAM is training-free. It
works as a post-processing tool for any FSS method and can improve the
accuracy of predicted masks. Specifically, we use predicted masks from FSS
methods to generate prompts and then use SAM to predict new masks. To avoid
predicting wrong masks with SAM, we propose a prediction result selection (PRS)
algorithm. The algorithm can remarkably decrease wrong predictions. Experiment
results on public datasets show that our method is superior to base FSS methods
in both quantitative and qualitative aspects.
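The pipeline described in the abstract (coarse FSS mask, prompts derived from it, a SAM re-prediction, and a selection step that rejects bad refinements) can be sketched as below. This is an illustrative assumption, not the paper's exact method: the function names (`mask_to_prompts`, `select_mask`) are hypothetical, and the IoU-threshold fallback is only a plausible stand-in for the PRS algorithm.

```python
import numpy as np

def mask_to_prompts(mask: np.ndarray):
    """Derive SAM-style prompts from a coarse binary FSS mask:
    a tight XYXY bounding box and a foreground-centroid point.
    The exact prompt strategy of FSS-SAM is an assumption here."""
    ys, xs = np.nonzero(mask)
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    point = (int(xs.mean()), int(ys.mean()))  # (x, y) centroid
    return box, point

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two binary masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union else 0.0

def select_mask(fss_mask: np.ndarray, sam_mask: np.ndarray,
                thr: float = 0.5) -> np.ndarray:
    """PRS-style selection (simplified): keep SAM's refined mask only if
    it stays consistent (IoU >= thr) with the original FSS prediction;
    otherwise fall back to the FSS mask to avoid a wrong SAM prediction."""
    return sam_mask if iou(fss_mask, sam_mask) >= thr else fss_mask
```

In a real setup, `box` and `point` would be passed to a SAM predictor to obtain `sam_mask`; the selection step then guards against SAM refinements that drift too far from the FSS prediction.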
Related papers
- Success or Failure? Analyzing Segmentation Refinement with Few-Shot Segmentation [9.854182420661754]
We propose JFS, a method to identify the success of segmentation refinement by leveraging a few-shot segmentation (FSS) model.
JFS is evaluated on the best and worst cases from SEPL to validate its effectiveness.
arXiv Detail & Related papers (2024-07-05T14:04:25Z)
- From Generalization to Precision: Exploring SAM for Tool Segmentation in Surgical Environments [7.01085327371458]
We argue that the Segment Anything Model drastically over-segments images with high corruption levels, resulting in degraded performance.
We employ the ground-truth tool mask to analyze the results of SAM when the best single mask is selected as prediction.
We analyze the Endovis18 and Endovis17 instrument segmentation datasets using synthetic corruptions of various strengths and an In-House dataset featuring counterfactually created real-world corruptions.
arXiv Detail & Related papers (2024-02-28T01:33:49Z)
- Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer [158.2634766682187]
Deep neural networks often suffer from poor generalization due to complex, unstructured loss landscapes.
Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the change in loss under a weight perturbation.
In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves perturbation by a binary mask.
arXiv Detail & Related papers (2023-06-30T09:33:41Z)
- How to Efficiently Adapt Large Segmentation Model (SAM) to Medical Images [15.181219203629643]
Segment Anything (SAM) exhibits impressive capabilities in zero-shot segmentation for natural images.
However, when applied to medical images, SAM suffers from noticeable performance drop.
In this work, we propose to freeze the SAM encoder and fine-tune a lightweight task-specific prediction head.
arXiv Detail & Related papers (2023-06-23T18:34:30Z)
- Segment Anything in High Quality [116.39405160133315]
We propose HQ-SAM, equipping SAM with the ability to accurately segment any object, while maintaining SAM's original promptable design, efficiency, and zero-shot generalizability.
Our careful design reuses and preserves the pre-trained model weights of SAM, while only introducing minimal additional parameters and computation.
We show the efficacy of HQ-SAM in a suite of 10 diverse segmentation datasets across different downstream tasks, where 8 out of them are evaluated in a zero-shot transfer protocol.
arXiv Detail & Related papers (2023-06-02T14:23:59Z)
- Personalize Segment Anything Model with One Shot [52.54453744941516]
We propose a training-free personalization approach for the Segment Anything Model (SAM).
Given only a single image with a reference mask, PerSAM first localizes the target concept via a location prior.
PerSAM segments it within other images or videos via three techniques: target-guided attention, target-semantic prompting, and cascaded post-refinement.
arXiv Detail & Related papers (2023-05-04T17:59:36Z) - Improving Sharpness-Aware Minimization with Fisher Mask for Better
Generalization on Language Models [93.85178920914721]
Fine-tuning large pretrained language models on a limited training corpus usually suffers from poor generalization.
We propose a novel optimization procedure, namely FSAM, which introduces a Fisher mask to improve the efficiency and performance of SAM.
We show that FSAM consistently outperforms the vanilla SAM by 0.67 to 1.98 in average score among four different pretrained models.
arXiv Detail & Related papers (2022-10-11T14:53:58Z)
- HMFS: Hybrid Masking for Few-Shot Segmentation [27.49000348046462]
We develop a simple, effective, and efficient approach to enhance feature masking (FM).
We compensate for the loss of fine-grained spatial details in the FM technique by investigating and leveraging a complementary basic input masking method.
Experimental results on three publicly available benchmarks reveal that HMFS outperforms the current state-of-the-art methods by visible margins.
arXiv Detail & Related papers (2022-03-24T03:07:20Z)
- Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model [57.77981008219654]
The Masked Language Model (MLM) framework has been widely adopted for self-supervised language pre-training.
We propose a fully-explored masking strategy, where a text sequence is divided into a certain number of non-overlapping segments.
arXiv Detail & Related papers (2020-10-12T21:28:14Z)
- SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation [149.242230059447]
We propose a fast single-stage instance segmentation method called SipMask.
It preserves instance-specific spatial information by separating mask prediction of an instance to different sub-regions of a detected bounding-box.
In terms of real-time capabilities, SipMask outperforms YOLACT with an absolute gain of 3.0% (mask AP) under similar settings.
arXiv Detail & Related papers (2020-07-29T12:21:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.