Prompt-Tuning SAM: From Generalist to Specialist with only 2048 Parameters and 16 Training Images
- URL: http://arxiv.org/abs/2504.16739v1
- Date: Wed, 23 Apr 2025 14:10:02 GMT
- Title: Prompt-Tuning SAM: From Generalist to Specialist with only 2048 Parameters and 16 Training Images
- Authors: Tristan Piater, Björn Barz, Alexander Freytag
- Abstract summary: The PTSAM method uses prompt-tuning, a parameter-efficient fine-tuning technique, to adapt SAM for a specific task. Our results show that prompt-tuning only SAM's mask decoder already leads to performance on par with state-of-the-art techniques.
- Score: 48.76247995109632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Segment Anything Model (SAM) is widely used for segmenting a diverse range of objects in natural images from simple user prompts like points or bounding boxes. However, SAM's performance decreases substantially when applied to non-natural domains like microscopic imaging. Furthermore, due to SAM's interactive design, it requires a precise prompt for each image and object, which is infeasible in many automated biomedical applications. Previous solutions adapt SAM by training millions of parameters via fine-tuning large parts of the model or of adapter layers. In contrast, we show that as few as 2,048 additional parameters are sufficient for turning SAM into a use-case specialist for a certain downstream task. Our novel PTSAM (prompt-tuned SAM) method uses prompt-tuning, a parameter-efficient fine-tuning technique, to adapt SAM for a specific task. We validate the performance of our approach on multiple microscopy datasets and one medical dataset. Our results show that prompt-tuning only SAM's mask decoder already leads to performance on par with state-of-the-art techniques while requiring roughly 2,000x fewer trainable parameters. For addressing domain gaps, we find that additionally prompt-tuning SAM's image encoder is beneficial, further improving segmentation accuracy by up to 18% over state-of-the-art results. Since PTSAM can be reliably trained with as few as 16 annotated images, we find it particularly helpful for applications with limited training data and domain shifts.
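For illustration, here is a minimal sketch of the prompt-tuning mechanism the abstract describes: a small set of learnable tokens is prepended to the token sequence of a frozen transformer block, and only those tokens are optimized. The shapes are assumptions chosen to match the quoted budget (8 tokens x 256 dimensions = 2,048 parameters); the class and variable names are hypothetical and not taken from the PTSAM code.
```python
# Minimal sketch of prompt-tuning, the technique PTSAM builds on.
# Shapes are illustrative: 8 learnable tokens x 256 dims = 2,048 trainable
# parameters, matching the count quoted in the abstract.
import torch
import torch.nn as nn

class PromptTunedBlock(nn.Module):
    """Wraps a frozen transformer block, prepending learnable prompt tokens."""

    def __init__(self, frozen_block: nn.Module, num_prompts: int = 8, dim: int = 256):
        super().__init__()
        self.block = frozen_block
        for p in self.block.parameters():
            p.requires_grad = False          # keep the pretrained weights untouched
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, dim)
        b = tokens.shape[0]
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)
        tokens = torch.cat([prompts, tokens], dim=1)
        tokens = self.block(tokens)
        return tokens[:, self.prompts.shape[0]:]  # strip the prompt tokens again

# The optimizer would receive only the prompt parameters, e.g. (hypothetical):
# opt = torch.optim.AdamW([m.prompts for m in prompt_tuned_blocks], lr=1e-3)
```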
Related papers
- Generalized SAM: Efficient Fine-Tuning of SAM for Variable Input Image Sizes [3.8506666685467343]
We propose a novel efficient fine-tuning method that allows the input image size of Segment Anything Model (SAM) to be variable.
Generalized SAM (GSAM) is the first to apply random cropping during training with SAM, thereby significantly reducing the computational cost of training.
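As a hedged illustration of the random-cropping idea, the sketch below draws the same random window from an image and its mask during training; the crop size and function name are assumptions, not GSAM's actual implementation.
```python
# Hedged sketch of random cropping during training, the core idea GSAM
# applies to SAM; assumes the image is at least `size` pixels on each side.
import torch

def random_crop(image: torch.Tensor, mask: torch.Tensor, size: int = 256):
    """Crop the same random window from an image (C, H, W) and its mask (H, W)."""
    _, h, w = image.shape
    top = torch.randint(0, h - size + 1, (1,)).item()
    left = torch.randint(0, w - size + 1, (1,)).item()
    return (image[:, top:top + size, left:left + size],
            mask[top:top + size, left:left + size])
```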
arXiv Detail & Related papers (2024-08-22T13:58:08Z)
- S-SAM: SVD-based Fine-Tuning of Segment Anything Model for Medical Image Segmentation [25.12190845061075]
We propose an adaptation technique, called S-SAM, that trains only 0.4% of SAM's parameters while using just the label names as prompts for producing precise masks.
We call this modified version S-SAM and evaluate it on five different modalities including endoscopic images, x-ray, ultrasound, CT, and histology images.
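A minimal sketch of the general SVD-based tuning idea, assuming the common recipe of freezing a weight matrix's singular vectors and training only its singular values; the layer choice and names are illustrative, not S-SAM's code.
```python
# Hedged sketch of SVD-based fine-tuning in the spirit of S-SAM: freeze the
# singular vectors of a pretrained weight matrix and train only its singular
# values. Which layers to wrap is an assumption for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVDTunedLinear(nn.Module):
    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        u, s, vh = torch.linalg.svd(pretrained.weight.detach(), full_matrices=False)
        self.register_buffer("u", u)       # frozen left singular vectors
        self.register_buffer("vh", vh)     # frozen right singular vectors
        self.s = nn.Parameter(s)           # trainable singular values only
        self.bias = pretrained.bias        # reuse the pretrained bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.u @ torch.diag(self.s) @ self.vh
        return F.linear(x, weight, self.bias)
```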
arXiv Detail & Related papers (2024-08-12T18:53:03Z)
- WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images [8.179859593451285]
We present WSI-SAM, enhancing Segment Anything Model (SAM) with precise object segmentation capabilities for histopathology images.
To fully exploit pretrained knowledge while minimizing training overhead, we keep SAM frozen, introducing only minimal extra parameters.
Our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task, respectively.
arXiv Detail & Related papers (2024-03-14T10:30:43Z)
- Stable Segment Anything Model [79.9005670886038]
The Segment Anything Model (SAM) achieves remarkable promptable segmentation given high-quality prompts.
This paper presents the first comprehensive analysis of SAM's segmentation stability across a diverse spectrum of prompt qualities.
Our solution, termed Stable-SAM, offers several advantages: 1) improved segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality.
arXiv Detail & Related papers (2023-11-27T12:51:42Z)
- AdaptiveSAM: Towards Efficient Tuning of SAM for Surgical Scene Segmentation [49.59991322513561]
We propose an adaptive modification of Segment-Anything (SAM) that can adjust to new datasets quickly and efficiently.
AdaptiveSAM uses free-form text as a prompt and can segment the object of interest given just its label name.
Our experiments show that AdaptiveSAM outperforms current state-of-the-art methods on various medical imaging datasets.
arXiv Detail & Related papers (2023-08-07T17:12:54Z)
- When SAM Meets Sonar Images [6.902760999492406]
The Segment Anything Model (SAM) has revolutionized the way segmentation is performed.
SAM's performance may decline when applied to tasks involving domains that differ from natural images.
By employing fine-tuning techniques, SAM exhibits promising capabilities in specific domains, such as medicine and planetary science.
arXiv Detail & Related papers (2023-06-25T03:15:14Z)
- How to Efficiently Adapt Large Segmentation Model(SAM) to Medical Images [15.181219203629643]
Segment Anything (SAM) exhibits impressive capabilities in zero-shot segmentation for natural images.
However, when applied to medical images, SAM suffers from noticeable performance drop.
In this work, we propose to freeze the SAM encoder and fine-tune a lightweight task-specific prediction head.
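A minimal sketch of this freeze-encoder, light-head recipe; the head architecture, feature shape, and names are assumptions for illustration, not the paper's exact design.
```python
# Hedged sketch of the freeze-encoder / lightweight-head recipe described above.
import torch
import torch.nn as nn

class FrozenEncoderSegmenter(nn.Module):
    def __init__(self, encoder: nn.Module, embed_dim: int = 256, num_classes: int = 1):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False                 # the SAM encoder stays frozen
        self.head = nn.Sequential(                  # only this small head is trained
            nn.Conv2d(embed_dim, 64, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(64, num_classes, 1),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = self.encoder(image)   # assumed shape: (B, embed_dim, H', W')
        return self.head(feats)
```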
arXiv Detail & Related papers (2023-06-23T18:34:30Z)
- AutoSAM: Adapting SAM to Medical Images by Overloading the Prompt Encoder [101.28268762305916]
In this work, we replace SAM's prompt encoder with an encoder that operates on the same input image.
We obtain state-of-the-art results on multiple medical image and video benchmarks.
To inspect the knowledge it encodes and to provide a lightweight segmentation solution, we also learn to decode it into a mask with a shallow deconvolution network.
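A hedged sketch of what a shallow deconvolution decoder of this kind could look like; the channel sizes and depth are assumptions, not AutoSAM's configuration.
```python
# Hedged sketch of a shallow deconvolution decoder: two upsampling stages
# followed by a 1x1 convolution that produces per-pixel mask logits.
import torch.nn as nn

deconv_decoder = nn.Sequential(
    nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2),
    nn.GELU(),
    nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2),
    nn.GELU(),
    nn.Conv2d(64, 1, kernel_size=1),   # per-pixel mask logit
)
```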
arXiv Detail & Related papers (2023-06-10T07:27:00Z)
- Personalize Segment Anything Model with One Shot [52.54453744941516]
We propose a training-free personalization approach for the Segment Anything Model (SAM).
Given only a single image with a reference mask, PerSAM first localizes the target concept using a location prior.
PerSAM segments it within other images or videos via three techniques: target-guided attention, target-semantic prompting, and cascaded post-refinement.
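As a hedged illustration of a training-free location prior of this kind: average the reference object's features under its mask, score every target-image location by cosine similarity, and take the best match as a point prompt. Function names and shapes are assumptions, not PerSAM's code.
```python
# Hedged sketch of a training-free location prior in the spirit of PerSAM.
import torch
import torch.nn.functional as F

def location_prior(ref_feats, ref_mask, tgt_feats):
    """ref_feats/tgt_feats: (C, H, W) encoder features; ref_mask: (H, W) in {0, 1}."""
    c = ref_feats.shape[0]
    # Average the reference features inside the mask -> one target embedding.
    target_embed = (ref_feats * ref_mask).sum(dim=(1, 2)) / ref_mask.sum()  # (C,)
    # Cosine similarity between the embedding and every target location.
    sim = F.cosine_similarity(tgt_feats.view(c, -1),
                              target_embed.unsqueeze(1), dim=0)             # (H*W,)
    idx = sim.argmax().item()
    h, w = tgt_feats.shape[1:]
    return divmod(idx, w)   # (row, col) of the most similar location
```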
arXiv Detail & Related papers (2023-05-04T17:59:36Z)
- Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation [51.770805270588625]
The Segment Anything Model (SAM) has recently gained popularity in the field of image segmentation.
Recent studies and individual experiments have shown that SAM underperforms in medical image segmentation.
We propose the Medical SAM Adapter (Med-SA), which incorporates domain-specific medical knowledge into the segmentation model.
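A minimal sketch of the adapter mechanism in general (a bottleneck MLP with a residual connection inserted into frozen blocks); the dimensions and zero initialization are common practice and assumptions here, not necessarily Med-SA's exact design.
```python
# Hedged sketch of an adapter layer, the general mechanism Med-SA builds on.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck MLP with a residual connection, inserted into frozen blocks."""

    def __init__(self, dim: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)   # start as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))
```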
arXiv Detail & Related papers (2023-04-25T07:34:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.