Diffusion-empowered AutoPrompt MedSAM
- URL: http://arxiv.org/abs/2502.06817v1
- Date: Wed, 05 Feb 2025 03:08:17 GMT
- Title: Diffusion-empowered AutoPrompt MedSAM
- Authors: Peng Huang, Shu Hu, Bo Peng, Jiashu Zhang, Hongtu Zhu, Xi Wu, Xin Wang,
- Abstract summary: We propose AutoMedSAM, an end-to-end framework derived from SAM to enhance usability and segmentation performance.
AutoMedSAM retains MedSAM's image encoder and mask decoder structure while introducing a novel diffusion-based class prompt encoder.
We show that AutoMedSAM achieves superior performance while broadening its applicability to both clinical settings and non-expert users.
- Score: 23.450625203821716
- License:
- Abstract: MedSAM, a medical foundation model derived from the SAM architecture, has demonstrated notable success across diverse medical domains. However, its clinical application faces two major challenges: the dependency on labor-intensive manual prompt generation, which imposes a significant burden on clinicians, and the absence of semantic labeling in the generated segmentation masks for organs or lesions, limiting its practicality for non-expert users. To address these limitations, we propose AutoMedSAM, an end-to-end framework derived from SAM, designed to enhance usability and segmentation performance. AutoMedSAM retains MedSAM's image encoder and mask decoder structure while introducing a novel diffusion-based class prompt encoder. The diffusion-based encoder employs a dual-decoder structure to collaboratively generate prompt embeddings guided by sparse and dense prompt definitions. These embeddings enhance the model's ability to understand and process clinical imagery autonomously. With this encoder, AutoMedSAM leverages class prompts to embed semantic information into the model's predictions, transforming MedSAM's semi-automated pipeline into a fully automated workflow. Furthermore, AutoMedSAM employs an uncertainty-aware joint optimization strategy during training to effectively inherit MedSAM's pre-trained knowledge while improving generalization by integrating multiple loss functions. Experimental results across diverse datasets demonstrate that AutoMedSAM achieves superior performance while broadening its applicability to both clinical settings and non-expert users. Code is available at https://github.com/HP-ML/AutoPromptMedSAM.git.
Related papers
- Learnable Prompting SAM-induced Knowledge Distillation for Semi-supervised Medical Image Segmentation [47.789013598970925]
We propose a learnable prompting SAM-induced Knowledge distillation framework (KnowSAM) for semi-supervised medical image segmentation.
Our model outperforms the state-of-the-art semi-supervised segmentation approaches.
arXiv Detail & Related papers (2024-12-18T11:19:23Z) - SEG-SAM: Semantic-Guided SAM for Unified Medical Image Segmentation [13.037264314135033]
We propose the SEmantic-Guided SAM (SEG-SAM), a unified medical segmentation model that incorporates semantic medical knowledge.
First, to avoid the potential conflict between binary and semantic predictions, we introduce a semantic-aware decoder independent of SAM's original decoder.
We solicit key characteristics of medical categories from large language models and incorporate them into SEG-SAM through a text-to-vision semantic module.
In the end, we introduce the cross-mask spatial alignment strategy to encourage greater overlap between the predicted masks from SEG-SAM's two decoders.
arXiv Detail & Related papers (2024-12-17T08:29:13Z) - ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation [18.388979166848962]
Segment Anything Model (SAM) has demonstrated its potential in both settings.
We propose an efficient self-prompting SAM for universal domain-generalized medical image segmentation, named ESP-MedSAM.
ESP-MedSAM outperforms state-of-the-arts in diverse medical imaging segmentation tasks.
arXiv Detail & Related papers (2024-07-19T09:32:30Z) - ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z) - Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding [15.401507589312702]
This paper introduces H-SAM, a prompt-free adaptation of the Segment Anything Model (SAM) for efficient fine-tuning of medical images.
In the initial stage, H-SAM employs SAM's original decoder to generate a prior probabilistic mask, guiding a more intricate decoding process.
Our H-SAM demonstrates a 4.78% improvement in average Dice compared to existing prompt-free SAM variants.
arXiv Detail & Related papers (2024-03-27T05:55:16Z) - SAMCT: Segment Any CT Allowing Labor-Free Task-Indicator Prompts [28.171383990186904]
We construct a large CT dataset consisting of 1.1M CT images and 5M masks from public datasets.
We propose a powerful foundation model SAMCT allowing labor-free prompts.
Based on SAM, SAMCT is further equipped with a CNN image encoder, a cross-branch interaction module, and a task-indicator prompt encoder.
arXiv Detail & Related papers (2024-03-20T02:39:15Z) - UN-SAM: Universal Prompt-Free Segmentation for Generalized Nuclei Images [47.59627416801523]
In digital pathology, precise nuclei segmentation is pivotal yet challenged by the diversity of tissue types, staining protocols, and imaging conditions.
We propose the Universal prompt-free SAM framework for Nuclei segmentation (UN-SAM)
UN-SAM with exceptional performance surpasses state-of-the-arts in nuclei instance and semantic segmentation, especially the generalization capability in zero-shot scenarios.
arXiv Detail & Related papers (2024-02-26T15:35:18Z) - SurgicalPart-SAM: Part-to-Whole Collaborative Prompting for Surgical Instrument Segmentation [66.21356751558011]
The Segment Anything Model (SAM) exhibits promise in generic object segmentation and offers potential for various applications.
Existing methods have applied SAM to surgical instrument segmentation (SIS) by tuning SAM-based frameworks with surgical data.
We propose SurgicalPart-SAM (SP-SAM), a novel SAM efficient-tuning approach that explicitly integrates instrument structure knowledge with SAM's generic knowledge.
arXiv Detail & Related papers (2023-12-22T07:17:51Z) - SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
arXiv Detail & Related papers (2023-08-17T02:51:01Z) - AutoSAM: Adapting SAM to Medical Images by Overloading the Prompt
Encoder [101.28268762305916]
In this work, we replace Segment Anything Model with an encoder that operates on the same input image.
We obtain state-of-the-art results on multiple medical images and video benchmarks.
For inspecting the knowledge within it, and providing a lightweight segmentation solution, we also learn to decode it into a mask by a shallow deconvolution network.
arXiv Detail & Related papers (2023-06-10T07:27:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.