Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation
- URL: http://arxiv.org/abs/2601.10880v1
- Date: Thu, 15 Jan 2026 22:18:14 GMT
- Title: Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation
- Authors: Chongcong Jiang, Tianxingjian Ding, Chuhan Song, Jiachen Tu, Ziyang Yan, Yihua Shao, Zhenyi Wang, Yuzhang Shang, Tianyu Han, Yu Tian,
- Abstract summary: We present Medical SAM3, a foundation model for universal prompt-driven medical image segmentation.<n>By fine-tuning SAM3's model parameters on 33 datasets spanning 10 medical imaging modalities, Medical SAM3 acquires robust domain-specific representations.<n>Our results establish Medical SAM3 as a universal, text-guided segmentation foundation model for medical imaging.
- Score: 23.51581338204315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Promptable segmentation foundation models such as SAM3 have demonstrated strong generalization capabilities through interactive and concept-based prompting. However, their direct applicability to medical image segmentation remains limited by severe domain shifts, the absence of privileged spatial prompts, and the need to reason over complex anatomical and volumetric structures. Here we present Medical SAM3, a foundation model for universal prompt-driven medical image segmentation, obtained by fully fine-tuning SAM3 on large-scale, heterogeneous 2D and 3D medical imaging datasets with paired segmentation masks and text prompts. Through a systematic analysis of vanilla SAM3, we observe that its performance degrades substantially on medical data, with its apparent competitiveness largely relying on strong geometric priors such as ground-truth-derived bounding boxes. These findings motivate full model adaptation beyond prompt engineering alone. By fine-tuning SAM3's model parameters on 33 datasets spanning 10 medical imaging modalities, Medical SAM3 acquires robust domain-specific representations while preserving prompt-driven flexibility. Extensive experiments across organs, imaging modalities, and dimensionalities demonstrate consistent and significant performance gains, particularly in challenging scenarios characterized by semantic ambiguity, complex morphology, and long-range 3D context. Our results establish Medical SAM3 as a universal, text-guided segmentation foundation model for medical imaging and highlight the importance of holistic model adaptation for achieving robust prompt-driven segmentation under severe domain shift. Code and model will be made available at https://github.com/AIM-Research-Lab/Medical-SAM3.
Related papers
- VesSAM: Efficient Multi-Prompting for Segmenting Complex Vessel [68.24765319399286]
We present VesSAM, a powerful and efficient framework tailored for 2D vessel segmentation.<n>VesSAM integrates (1) a convolutional adapter to enhance local texture features, (2) a multi-prompt encoder that fuses anatomical prompts, and (3) a lightweight mask decoder to reduce jagged artifacts.<n>VesSAM consistently outperforms state-of-the-art PEFT-based SAM variants by over 10% Dice and 13% IoU.
arXiv Detail & Related papers (2025-11-02T15:47:05Z) - Semantic Segmentation of iPS Cells: Case Study on Model Complexity in Biomedical Imaging [0.0]
We show that a carefully configured DeepLabv3 model can achieve high performance in segmenting induced pluripotent stem (iPS) cell colonies.<n>We also offer an open-source implementation that includes strategies for small datasets and domain-specific encoding.
arXiv Detail & Related papers (2025-07-29T09:05:01Z) - TAGS: 3D Tumor-Adaptive Guidance for SAM [20.390209826784087]
We propose an adaptation framework called TAGS: Tumor Adaptive Guidance for SAM.<n>It unlocks 2D FMs for 3D medical tasks through multi-prompt fusion.<n>Our model surpasses the state-of-the-art medical image segmentation models.
arXiv Detail & Related papers (2025-05-21T04:02:17Z) - Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation [8.78725593323412]
Few-shot Adaptation of Training-frEe SAM (FATE-SAM) is a novel method designed to adapt the advanced Segment Anything Model 2 (SAM2) for 3D medical image segmentation.<n>FATE-SAM reassembles pre-trained modules of SAM2 to enable few-shot adaptation, leveraging a small number of support examples.<n>We evaluate FATE-SAM on multiple medical imaging datasets and compare it with supervised learning methods, zero-shot SAM approaches, and fine-tuned medical SAM methods.
arXiv Detail & Related papers (2025-01-15T20:44:21Z) - MRGen: Segmentation Data Engine for Underrepresented MRI Modalities [59.61465292965639]
Training medical image segmentation models for rare yet clinically important imaging modalities is challenging due to the scarcity of annotated data.<n>This paper investigates leveraging generative models to synthesize data, for training segmentation models for underrepresented modalities.<n>We present MRGen, a data engine for controllable medical image synthesis conditioned on text prompts and segmentation masks.
arXiv Detail & Related papers (2024-12-04T16:34:22Z) - SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation [36.95030121663565]
Supervised Finetuning (SFT) serves as an effective way to adapt foundation models for task-specific downstream tasks.
We propose SAM-Med3D-MoE, a novel framework that seamlessly integrates task-specific finetuned models with the foundational model.
Our experiments demonstrate the efficacy of SAM-Med3D-MoE, with an average Dice performance increase from 53 to 56.4 on 15 specific classes.
arXiv Detail & Related papers (2024-07-06T03:03:45Z) - Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation [48.107348956719775]
We introduce Mask-Enhanced SAM (M-SAM), an innovative architecture tailored for 3D tumor lesion segmentation.
We propose a novel Mask-Enhanced Adapter (MEA) within M-SAM that enriches the semantic information of medical images with positional data from coarse segmentation masks.
Our M-SAM achieves high segmentation accuracy and also exhibits robust generalization.
arXiv Detail & Related papers (2024-03-09T13:37:02Z) - SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion
Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z) - SAM-Med3D: Towards General-purpose Segmentation Models for Volumetric Medical Images [35.83393121891959]
We introduce SAM-Med3D for general-purpose segmentation on volumetric medical images.
SAM-Med3D can accurately segment diverse anatomical structures and lesions across various modalities.
Our approach demonstrates that substantial medical resources can be utilized to develop a general-purpose medical AI.
arXiv Detail & Related papers (2023-10-23T17:57:36Z) - MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image
Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named as MA-SAM.
Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z) - AutoProSAM: Automated Prompting SAM for 3D Multi-Organ Segmentation [11.149807995830255]
Segment Anything Model (SAM) is one of the pioneering prompt-based foundation models for image segmentation.
Recent studies have indicated that SAM, originally designed for 2D natural images, performs suboptimally on 3D medical image segmentation tasks.
We present a novel technique termed AutoProSAM to overcome these challenges.
arXiv Detail & Related papers (2023-08-28T23:23:53Z) - 3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation [52.699139151447945]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation.
Our model can outperform domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, colon cancer segmentation, and achieve similar performance for liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.