RP-SAM2: Refining Point Prompts for Stable Surgical Instrument Segmentation
- URL: http://arxiv.org/abs/2504.07117v1
- Date: Tue, 25 Mar 2025 18:59:23 GMT
- Title: RP-SAM2: Refining Point Prompts for Stable Surgical Instrument Segmentation
- Authors: Nuren Zhaksylyk, Ibrahim Almakky, Jay Paranjape, S. Swaroop Vedula, Shameema Sikder, Vishal M. Patel, Mohammad Yaqub,
- Abstract summary: We introduce RP-SAM2, which incorporates a novel shift block and a compound loss function to stabilize point prompts. Our approach reduces annotator reliance on precise point positioning while maintaining robust segmentation capabilities. Experiments on the Cataract1k dataset demonstrate that RP-SAM2 improves segmentation accuracy, with a 2% mDSC gain, a 21.36% reduction in mHD95, and decreased variance across random single-point prompt results compared to SAM2.
- Score: 21.258106247629296
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurate surgical instrument segmentation is essential in cataract surgery for tasks such as skill assessment and workflow optimization. However, limited annotated data makes it difficult to develop fully automatic models. Prompt-based methods like SAM2 offer flexibility yet remain highly sensitive to the point prompt placement, often leading to inconsistent segmentations. We address this issue by introducing RP-SAM2, which incorporates a novel shift block and a compound loss function to stabilize point prompts. Our approach reduces annotator reliance on precise point positioning while maintaining robust segmentation capabilities. Experiments on the Cataract1k dataset demonstrate that RP-SAM2 improves segmentation accuracy, with a 2% mDSC gain, a 21.36% reduction in mHD95, and decreased variance across random single-point prompt results compared to SAM2. Additionally, on the CaDIS dataset, pseudo masks generated by RP-SAM2 for fine-tuning SAM2's mask decoder outperformed those generated by SAM2. These results highlight RP-SAM2 as a practical, stable and reliable solution for semi-automatic instrument segmentation in data-constrained medical settings. The code is available at https://github.com/BioMedIA-MBZUAI/RP-SAM2.
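The stability protocol the abstract implies (variance of Dice across random single-point prompts) can be sketched roughly as follows. This is a minimal illustration, not the paper's evaluation code; the `segment` callable is a hypothetical stand-in for any point-promptable model such as SAM2 or RP-SAM2:

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def prompt_stability(segment, gt_mask: np.ndarray, n_trials: int = 20, seed: int = 0):
    """Sample random single-point prompts inside the GT mask and
    report mean and variance of the resulting Dice scores."""
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(gt_mask)
    scores = []
    for _ in range(n_trials):
        i = rng.integers(len(ys))
        pred = segment((ys[i], xs[i]))   # model call: point prompt -> binary mask
        scores.append(dice(pred, gt_mask))
    return float(np.mean(scores)), float(np.var(scores))

# Toy check with a dummy "segmenter" that always returns the GT mask:
gt = np.zeros((8, 8), dtype=bool); gt[2:6, 2:6] = True
mean_dsc, var_dsc = prompt_stability(lambda pt: gt, gt)
```

A prompt-robust model is one whose variance term stays small as the sampled point moves around inside the instrument mask.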
Related papers
- VesSAM: Efficient Multi-Prompting for Segmenting Complex Vessel [68.24765319399286]
We present VesSAM, a powerful and efficient framework tailored for 2D vessel segmentation. VesSAM integrates (1) a convolutional adapter to enhance local texture features, (2) a multi-prompt encoder that fuses anatomical prompts, and (3) a lightweight mask decoder to reduce jagged artifacts. VesSAM consistently outperforms state-of-the-art PEFT-based SAM variants by over 10% Dice and 13% IoU.
arXiv Detail & Related papers (2025-11-02T15:47:05Z) - EMA-SAM: Exponential Moving-average for SAM-based PTMC Segmentation [1.7674345486888503]
EMA-SAM is a lightweight extension of SAM-2 that incorporates a confidence-weighted exponential moving average pointer into the memory bank. On our PTMC-RFA dataset (124 minutes, 13 patients), EMA-SAM improves maxDice from 0.82 to 0.86 and maxIoU from 0.72 to 0.76, while reducing false positives by 29%.
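The abstract describes the confidence-weighted EMA pointer only at a high level; one plausible form is sketched below. The update rule and the `beta` smoothing hyperparameter are assumptions for illustration, not EMA-SAM's actual implementation:

```python
import numpy as np

def confidence_weighted_ema(pointer, feature, confidence, beta=0.5):
    """One plausible confidence-weighted EMA update: frames the model
    is more confident about pull the memory pointer more strongly."""
    w = beta * confidence                  # effective update weight in [0, beta]
    return (1.0 - w) * pointer + w * feature

ptr = np.array([1.0, 1.0])
feat = np.array([0.0, 2.0])
kept = confidence_weighted_ema(ptr, feat, confidence=0.0)             # low confidence: pointer kept
moved = confidence_weighted_ema(ptr, feat, confidence=1.0, beta=1.0)  # full trust: pointer replaced
```

The appeal of such a rule is that low-confidence (e.g. occluded or blurry) frames barely move the stored representation, which is what makes the memory bank robust over long videos.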
arXiv Detail & Related papers (2025-10-21T01:30:27Z) - SAM2LoRA: Composite Loss-Guided, Parameter-Efficient Finetuning of SAM2 for Retinal Fundus Segmentation [0.5191474416940847]
We propose a parameter-efficient fine-tuning strategy that adapts the Segment Anything Model 2 (SAM2) for fundus image segmentation. SAM2LoRA integrates a low-rank adapter into both the image encoder and mask decoder, requiring fewer than 5% of the original trainable parameters. It achieves Dice scores of up to 0.86 and 0.93 for blood vessel and optic disc segmentation, respectively, and AUC values of up to 0.98 and 0.99.
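The low-rank adapter idea behind SAM2LoRA can be sketched generically: a frozen weight matrix plus a trainable rank-r update. This is a textbook LoRA sketch, not SAM2LoRA's code; the rank and scaling values are illustrative:

```python
import numpy as np

class LoRALinear:
    """Generic LoRA sketch: frozen weight W plus a trainable low-rank
    update (alpha/r) * B @ A. B is zero-initialized, so at the start of
    fine-tuning the layer reproduces the frozen model exactly."""
    def __init__(self, W, r=4, alpha=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                       # frozen, shape (out, in)
        self.A = rng.normal(0.0, 0.01, (r, W.shape[1]))  # trainable down-projection
        self.B = np.zeros((W.shape[0], r))               # trainable up-projection
        self.scale = alpha / r

    def __call__(self, x):
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

W = np.arange(12, dtype=float).reshape(3, 4)
layer = LoRALinear(W)
x = np.ones((2, 4))
out = layer(x)
```

The trainable parameter count is r·(in + out) per layer versus in·out for full fine-tuning, which is how adapter methods get under the "fewer than 5%" budget the abstract reports.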
arXiv Detail & Related papers (2025-10-11T17:07:44Z) - Accelerating Volumetric Medical Image Annotation via Short-Long Memory SAM 2 [10.279314732888079]
Short-Long Memory SAM 2 (SLM-SAM 2) is a novel architecture that integrates distinct short-term and long-term memory banks to improve segmentation accuracy. We evaluate SLM-SAM 2 on three public datasets covering organs, bones, and muscles across MRI and CT modalities.
arXiv Detail & Related papers (2025-05-03T16:16:24Z) - BiSeg-SAM: Weakly-Supervised Post-Processing Framework for Boosting Binary Segmentation in Segment Anything Models [6.74659948545092]
BiSeg-SAM is a weakly supervised prompting and boundary refinement network for the segmentation of polyps and skin lesions. Our method demonstrates significant superiority over state-of-the-art (SOTA) methods when tested on five polyp datasets and one skin cancer dataset.
arXiv Detail & Related papers (2025-04-02T08:04:37Z) - RFMedSAM 2: Automatic Prompt Refinement for Enhanced Volumetric Medical Image Segmentation with SAM 2 [15.50695315680438]
Segment Anything Model 2 (SAM 2), a prompt-driven foundation model extending SAM to both image and video domains, has shown superior zero-shot performance compared to its predecessor. However, similar to SAM, SAM 2 is limited by its output of binary masks, inability to infer semantic labels, and dependence on precise prompts for the target object area. We explore the upper performance limit of SAM 2 using custom fine-tuning adapters, achieving a Dice Similarity Coefficient (DSC) of 92.30% on the BTCV dataset.
arXiv Detail & Related papers (2025-02-04T22:03:23Z) - SAMPa: Sharpness-aware Minimization Parallelized [51.668052890249726]
Sharpness-aware minimization (SAM) has been shown to improve the generalization of neural networks.
Each SAM update requires sequentially computing two gradients, effectively doubling the per-iteration cost.
We propose a simple modification of SAM, termed SAMPa, which allows us to fully parallelize the two gradient computations.
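The "two sequential gradients" that SAMPa parallelizes come from the vanilla SAM update: an ascent step to a perturbed point, then a descent step using the gradient there. A minimal sketch on a toy quadratic loss (illustrative hyperparameters, not SAMPa itself):

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One vanilla SAM step: two *sequential* gradient evaluations.
    SAMPa's contribution is making these two computations parallel."""
    g1 = grad_fn(w)                                 # gradient at current weights
    eps = rho * g1 / (np.linalg.norm(g1) + 1e-12)   # ascent to the locally 'sharp' point
    g2 = grad_fn(w + eps)                           # gradient at perturbed weights
    return w - lr * g2                              # descend with the perturbed gradient

# Toy quadratic L(w) = ||w||^2 / 2, so grad_fn is the identity.
w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w, lambda v: v)
```

Because `g2` cannot be computed until `eps` (and hence `g1`) is known, the two backward passes cannot overlap in this form, which is exactly the per-iteration cost the abstract refers to.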
arXiv Detail & Related papers (2024-10-14T16:21:23Z) - AM-SAM: Automated Prompting and Mask Calibration for Segment Anything Model [28.343378406337077]
We propose an automated prompting and mask calibration method called AM-SAM.
Our approach automatically generates prompts for an input image, eliminating the need for human involvement with a good performance in early training epochs.
Our experimental results demonstrate that AM-SAM achieves significantly accurate segmentation, matching or exceeding the effectiveness of human-generated and default prompts.
arXiv Detail & Related papers (2024-10-13T03:47:20Z) - ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z) - Stable Segment Anything Model [79.9005670886038]
The Segment Anything Model (SAM) achieves remarkable promptable segmentation given high-quality prompts.
This paper presents the first comprehensive analysis on SAM's segmentation stability across a diverse spectrum of prompt qualities.
Our solution, termed Stable-SAM, offers several advantages: 1) improved SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality.
arXiv Detail & Related papers (2023-11-27T12:51:42Z) - SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
arXiv Detail & Related papers (2023-08-17T02:51:01Z) - Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer [158.2634766682187]
Deep neural networks often suffer from poor generalization due to complex, non-convex loss landscapes.
Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the change in loss when a perturbation is added to the weights.
In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves perturbation by a binary mask.
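The abstract states only that the perturbation is realized through a binary mask; one plausible reading is sketched below, with the mask restricting which coordinates get perturbed. This is an illustration of the idea, not the SSAM implementation:

```python
import numpy as np

def ssam_step(w, grad_fn, mask, lr=0.1, rho=0.05):
    """Sparse-SAM-style step: perturb only the coordinates selected by
    a binary mask, leaving the rest at their unperturbed values."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_pert = grad_fn(w + mask * eps)   # sparse perturbation of the weights
    return w - lr * g_pert

# Toy quadratic L(w) = ||w||^2 / 2, grad_fn is the identity.
w = np.array([1.0, 0.0])
sgd_like = ssam_step(w, lambda v: v, mask=np.zeros(2))  # all-zero mask: plain SGD step
```

With an all-zero mask the update degenerates to ordinary SGD, and with an all-ones mask it recovers dense SAM; the interesting regime is the sparse mask in between, where most coordinates skip the perturbation.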
arXiv Detail & Related papers (2023-06-30T09:33:41Z) - Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach [132.37966970098645]
One of the popular solutions is Sharpness-Aware Minimization (SAM), which minimizes the change in loss when a perturbation is added to the weights.
In this paper, we propose an efficient and effective training scheme coined Sparse SAM (SSAM), which sparsifies the perturbation with a binary mask, reducing the overhead of the common dense perturbation.
In addition, we theoretically prove that SSAM can converge at the same rate as SAM, i.e., $O(\log T/\sqrt{T})$.
arXiv Detail & Related papers (2022-10-11T06:30:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.