Comparing SAM 2 and SAM 3 for Zero-Shot Segmentation of 3D Medical Data
- URL: http://arxiv.org/abs/2511.21926v1
- Date: Wed, 26 Nov 2025 21:36:58 GMT
- Title: Comparing SAM 2 and SAM 3 for Zero-Shot Segmentation of 3D Medical Data
- Authors: Satrajit Chakrabarty, Ravi Soni,
- Abstract summary: Foundation models for promptable segmentation, including SAM, SAM 2, and the recently released SAM 3, have renewed interest in zero-shot segmentation of medical imaging.<n>We present the first controlled comparison of SAM 2 and SAM 3 for zero-shot segmentation of 3D medical volumes and videos under purely visual prompting.<n>We benchmark both models on 16 public datasets covering 54 anatomical structures, pathologies, and surgical instruments.
- Score: 0.3867363075280543
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Foundation models for promptable segmentation, including SAM, SAM 2, and the recently released SAM 3, have renewed interest in zero-shot segmentation of medical imaging. Although these models perform strongly on natural images, their behavior on medical data remains insufficiently characterized. While SAM 2 is widely used for annotation in 3D medical workflows, SAM 3 introduces a new perception backbone, detector-tracker pipeline, and concept-level prompting that may alter its behavior under spatial prompts. We present the first controlled comparison of SAM 2 and SAM 3 for zero-shot segmentation of 3D medical volumes and videos under purely visual prompting, with concept mechanisms disabled. We assess whether SAM 3 can serve as an out-of-the-box replacement for SAM 2 without customization. We benchmark both models on 16 public datasets (CT, MRI, 3D and cine ultrasound, endoscopy) covering 54 anatomical structures, pathologies, and surgical instruments. Prompts are restricted to the first frame and use four modes: single-click, multi-click, bounding box, and dense mask. This design standardizes preprocessing, prompt placement, propagation rules, and metric computation to disentangle prompt interpretation from propagation. Prompt-frame analysis shows that SAM 3 provides substantially stronger initialization than SAM 2 for click prompting across most structures. In full-volume analysis, SAM 3 retains this advantage for complex, vascular, and soft-tissue anatomies, emerging as the more versatile general-purpose segmenter. While SAM 2 remains competitive for compact, rigid organs under strong spatial guidance, it frequently fails on challenging targets where SAM 3 succeeds. Overall, our results suggest that SAM 3 is the superior default choice for most medical segmentation tasks, particularly those involving sparse user interaction or complex anatomical topology.
Related papers
- Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation [23.51581338204315]
We present Medical SAM3, a foundation model for universal prompt-driven medical image segmentation.<n>By fine-tuning SAM3's model parameters on 33 datasets spanning 10 medical imaging modalities, Medical SAM3 acquires robust domain-specific representations.<n>Our results establish Medical SAM3 as a universal, text-guided segmentation foundation model for medical imaging.
arXiv Detail & Related papers (2026-01-15T22:18:14Z) - More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery [23.89471097213949]
SAM 3 supports zero-shot segmentation across a wide range of prompts, including point, bounding box, and language-based prompts.<n> SAM 3D shows clear improvements over SAM and SAM 2 in both image and video segmentation under spatial prompts.<n>While language prompts show potential, their performance in the surgical domain is currently suboptimal.
arXiv Detail & Related papers (2025-12-08T14:43:40Z) - VesSAM: Efficient Multi-Prompting for Segmenting Complex Vessel [68.24765319399286]
We present VesSAM, a powerful and efficient framework tailored for 2D vessel segmentation.<n>VesSAM integrates (1) a convolutional adapter to enhance local texture features, (2) a multi-prompt encoder that fuses anatomical prompts, and (3) a lightweight mask decoder to reduce jagged artifacts.<n>VesSAM consistently outperforms state-of-the-art PEFT-based SAM variants by over 10% Dice and 13% IoU.
arXiv Detail & Related papers (2025-11-02T15:47:05Z) - WeakMedSAM: Weakly-Supervised Medical Image Segmentation via SAM with Sub-Class Exploration and Prompt Affinity Mining [31.81408955413914]
We investigate a weakly-supervised SAM-based segmentation model, namely WeakMedSAM, to reduce the labeling cost.<n>Specifically, our proposed WeakMedSAM contains two modules: 1) to mitigate severe co-occurrence in medical images, and 2) to improve the quality of the class activation maps.<n>Our method can be applied to any SAM-like backbone, and we conduct experiments with SAMUS and EfficientSAM.
arXiv Detail & Related papers (2025-03-06T05:28:44Z) - Learnable Prompting SAM-induced Knowledge Distillation for Semi-supervised Medical Image Segmentation [47.789013598970925]
We propose a learnable prompting SAM-induced Knowledge distillation framework (KnowSAM) for semi-supervised medical image segmentation.<n>Our model outperforms the state-of-the-art semi-supervised segmentation approaches.
arXiv Detail & Related papers (2024-12-18T11:19:23Z) - DB-SAM: Delving into High Quality Universal Medical Image Segmentation [100.63434169944853]
We propose a dual-branch adapted SAM framework, named DB-SAM, to bridge the gap between natural and 2D/3D medical data.
Our proposed DB-SAM achieves an absolute gain of 8.8%, compared to a recent medical SAM adapter in the literature.
arXiv Detail & Related papers (2024-10-05T14:36:43Z) - Is SAM 2 Better than SAM in Medical Image Segmentation? [0.6144680854063939]
The Segment Anything Model (SAM) has demonstrated impressive performance in zero-shot promptable segmentation on natural images.
The recently released Segment Anything Model 2 (SAM 2) claims to outperform SAM on images and extends the model's capabilities to video segmentation.
We conducted extensive studies using multiple datasets to compare the performance of SAM and SAM 2.
arXiv Detail & Related papers (2024-08-08T04:34:29Z) - Interactive 3D Medical Image Segmentation with SAM 2 [17.523874868612577]
We explore the zero-shot capabilities of SAM 2, the next-generation Meta SAM model trained on videos, for 3D medical image segmentation.<n>By treating sequential 2D slices of 3D images as video frames, SAM 2 can fully automatically propagate annotations from a single frame to the entire 3D volume.
arXiv Detail & Related papers (2024-08-05T16:58:56Z) - SAM-Med3D: Towards General-purpose Segmentation Models for Volumetric Medical Images [35.83393121891959]
We introduce SAM-Med3D for general-purpose segmentation on volumetric medical images.
SAM-Med3D can accurately segment diverse anatomical structures and lesions across various modalities.
Our approach demonstrates that substantial medical resources can be utilized to develop a general-purpose medical AI.
arXiv Detail & Related papers (2023-10-23T17:57:36Z) - MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image
Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named as MA-SAM.
Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z) - SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
arXiv Detail & Related papers (2023-08-17T02:51:01Z) - Medical SAM Adapter: Adapting Segment Anything Model for Medical Image
Segmentation [51.770805270588625]
The Segment Anything Model (SAM) has recently gained popularity in the field of image segmentation.
Recent studies and individual experiments have shown that SAM underperforms in medical image segmentation.
We propose the Medical SAM Adapter (Med-SA), which incorporates domain-specific medical knowledge into the segmentation model.
arXiv Detail & Related papers (2023-04-25T07:34:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.