Personalize Segment Anything Model with One Shot
- URL: http://arxiv.org/abs/2305.03048v2
- Date: Wed, 4 Oct 2023 01:15:21 GMT
- Title: Personalize Segment Anything Model with One Shot
- Authors: Renrui Zhang, Zhengkai Jiang, Ziyu Guo, Shilin Yan, Junting Pan,
Xianzheng Ma, Hao Dong, Peng Gao, Hongsheng Li
- Abstract summary: We propose a training-free Personalization approach for Segment Anything Model (SAM)
Given only a single image with a reference mask, PerSAM first localizes the target concept by a location prior.
PerSAM segments it within other images or videos via three techniques: target-guided attention, target-semantic prompting, and cascaded post-refinement.
- Score: 52.54453744941516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Driven by large-data pre-training, Segment Anything Model (SAM) has been
demonstrated as a powerful and promptable framework, revolutionizing the
segmentation models. Despite the generality, customizing SAM for specific
visual concepts without man-powered prompting is under explored, e.g.,
automatically segmenting your pet dog in different images. In this paper, we
propose a training-free Personalization approach for SAM, termed as PerSAM.
Given only a single image with a reference mask, PerSAM first localizes the
target concept by a location prior, and segments it within other images or
videos via three techniques: target-guided attention, target-semantic
prompting, and cascaded post-refinement. In this way, we effectively adapt SAM
for private use without any training. To further alleviate the mask ambiguity,
we present an efficient one-shot fine-tuning variant, PerSAM-F. Freezing the
entire SAM, we introduce two learnable weights for multi-scale masks, only
training 2 parameters within 10 seconds for improved performance. To
demonstrate our efficacy, we construct a new segmentation dataset, PerSeg, for
personalized evaluation, and test our methods on video object segmentation with
competitive performance. Besides, our approach can also enhance DreamBooth to
personalize Stable Diffusion for text-to-image generation, which discards the
background disturbance for better target appearance learning. Code is released
at https://github.com/ZrrSkywalker/Personalize-SAM
Related papers
- From SAM to SAM 2: Exploring Improvements in Meta's Segment Anything Model [0.5639904484784127]
The Segment Anything Model (SAM) was introduced to the computer vision community by Meta in April 2023.
SAM excels in zero-shot performance, segmenting unseen objects without additional training, stimulated by a large dataset of over one billion image masks.
SAM 2 expands this functionality to video, leveraging memory from preceding and subsequent frames to generate accurate segmentation across entire videos.
arXiv Detail & Related papers (2024-08-12T17:17:35Z) - MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method enables to extract richer marine information from global contextual cues to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z) - PosSAM: Panoptic Open-vocabulary Segment Anything [58.72494640363136]
PosSAM is an open-vocabulary panoptic segmentation model that unifies the strengths of the Segment Anything Model (SAM) with the vision-native CLIP model in an end-to-end framework.
We introduce a Mask-Aware Selective Ensembling (MASE) algorithm that adaptively enhances the quality of generated masks and boosts the performance of open-vocabulary classification during inference for each image.
arXiv Detail & Related papers (2024-03-14T17:55:03Z) - WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images [8.179859593451285]
We present WSI-SAM, enhancing Segment Anything Model (SAM) with precise object segmentation capabilities for histopathology images.
To fully exploit pretrained knowledge while minimizing training overhead, we keep SAM frozen, introducing only minimal extra parameters.
Our model outperforms SAM by 4.1 and 2.5 percent points on a ductal carcinoma in situ (DCIS) segmentation tasks and breast cancer metastasis segmentation task.
arXiv Detail & Related papers (2024-03-14T10:30:43Z) - BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning
of SAM [37.1263294647351]
We introduce BLO-SAM, which finetunes the Segment Anything Model (SAM) based on bi-level optimization (BLO)
BLO-SAM reduces the risk of overfitting by training the model's weight parameters and the prompt embedding on two separate subsets of the training dataset.
Results demonstrate BLO-SAM's superior performance over various state-of-the-art image semantic segmentation methods.
arXiv Detail & Related papers (2024-02-26T06:36:32Z) - BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model [65.92173280096588]
We address the challenge of image resolution variation for the Segment Anything Model (SAM)
SAM, known for its zero-shot generalizability, exhibits a performance degradation when faced with datasets with varying image sizes.
We present a bias-mode attention mask that allows each token to prioritize neighboring information.
arXiv Detail & Related papers (2024-01-04T15:34:44Z) - TinySAM: Pushing the Envelope for Efficient Segment Anything Model [76.21007576954035]
We propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining the strong zero-shot performance.
We first propose a full-stage knowledge distillation method with hard prompt sampling and hard mask weighting strategy to distill a lightweight student model.
We also adapt the post-training quantization to the promptable segmentation task and further reduce the computational cost.
arXiv Detail & Related papers (2023-12-21T12:26:11Z) - How to Efficiently Adapt Large Segmentation Model(SAM) to Medical Images [15.181219203629643]
Segment Anything (SAM) exhibits impressive capabilities in zero-shot segmentation for natural images.
However, when applied to medical images, SAM suffers from noticeable performance drop.
In this work, we propose to freeze SAM encoder and finetune a lightweight task-specific prediction head.
arXiv Detail & Related papers (2023-06-23T18:34:30Z) - Segment Anything in High Quality [116.39405160133315]
We propose HQ-SAM, equipping SAM with the ability to accurately segment any object, while maintaining SAM's original promptable design, efficiency, and zero-shot generalizability.
Our careful design reuses and preserves the pre-trained model weights of SAM, while only introducing minimal additional parameters and computation.
We show the efficacy of HQ-SAM in a suite of 10 diverse segmentation datasets across different downstream tasks, where 8 out of them are evaluated in a zero-shot transfer protocol.
arXiv Detail & Related papers (2023-06-02T14:23:59Z) - Customized Segment Anything Model for Medical Image Segmentation [10.933449793055313]
We build upon the large-scale image segmentation model, Segment Anything Model (SAM), to explore the new research paradigm of customizing large-scale models for medical image segmentation.
SAMed applies the low-rank-based (LoRA) finetuning strategy to the SAM image encoder and finetunes it together with the prompt encoder and the mask decoder on labeled medical image segmentation datasets.
Our trained SAMed model achieves semantic segmentation on medical images, which is on par with the state-of-the-art methods.
arXiv Detail & Related papers (2023-04-26T19:05:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.