SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation
- URL: http://arxiv.org/abs/2407.04938v1
- Date: Sat, 6 Jul 2024 03:03:45 GMT
- Title: SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation
- Authors: Guoan Wang, Jin Ye, Junlong Cheng, Tianbin Li, Zhaolin Chen, Jianfei Cai, Junjun He, Bohan Zhuang
- Abstract summary: Supervised Finetuning (SFT) serves as an effective way to adapt foundation models for task-specific downstream tasks.
We propose SAM-Med3D-MoE, a novel framework that seamlessly integrates task-specific finetuned models with the foundational model.
Our experiments demonstrate the efficacy of SAM-Med3D-MoE, with an average Dice performance increase from 53 to 56.4 on 15 specific classes.
- Score: 36.95030121663565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Volumetric medical image segmentation is pivotal in enhancing disease diagnosis, treatment planning, and advancing medical research. While existing volumetric foundation models for medical image segmentation, such as SAM-Med3D and SegVol, have shown remarkable performance on general organs and tumors, their ability to segment certain categories in clinical downstream tasks remains limited. Supervised Finetuning (SFT) serves as an effective way to adapt such foundation models for task-specific downstream tasks, but at the cost of degrading the general knowledge previously stored in the original foundation model. To address this, we propose SAM-Med3D-MoE, a novel framework that seamlessly integrates task-specific finetuned models with the foundational model, creating a unified model at minimal additional training expense for an extra gating network. This gating network, in conjunction with a selection strategy, allows the unified model to achieve performance comparable to that of the original models on their respective tasks, both general and specialized, without updating any of their parameters. Our comprehensive experiments demonstrate the efficacy of SAM-Med3D-MoE, with an average Dice performance increase from 53 to 56.4 on 15 specific classes. The gains are especially large on the spinal cord, esophagus, and right hip: 29.6, 8.5, and 11.2, respectively. Additionally, it achieves 48.9 Dice on the challenging SPPIN2023 Challenge, significantly surpassing the general expert's performance of 32.3. We anticipate that SAM-Med3D-MoE can serve as a new framework for adapting the foundation model to specific areas in medical image analysis. Codes and datasets will be publicly available.
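The abstract does not say how the gating network or the selection strategy is implemented. As a rough illustration only, the sketch below assumes a gate that emits one logit per expert and a confidence threshold below which the unified model falls back to the general expert. The function names, the `threshold` parameter, and the expert labels are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw gate logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_expert(
    gate_logits: Sequence[float],
    expert_names: Sequence[str],
    general_name: str = "general",  # hypothetical label for the frozen foundation model
    threshold: float = 0.5,         # hypothetical confidence cut-off
) -> str:
    """Pick the expert the gate is most confident in.

    Assumption: if the top probability is below `threshold`, route to the
    general expert so that general-purpose behaviour is preserved.
    """
    probs = softmax(gate_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return general_name
    return expert_names[best]


experts = ["general", "spinal_cord", "right_hip"]
confident_choice = select_expert([0.0, 3.0, 0.0], experts)
uncertain_choice = select_expert([0.10, 0.20, 0.15], experts)
```

In this sketch only the gate would need training; the experts themselves stay frozen, which matches the abstract's statement that no parameters of the original models are updated.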
Related papers
- Segment Any Medical Model Extended [39.80956010574076]
We introduce SAMM Extended (SAMME), a platform that integrates new SAM variant models, adopts faster communication protocols, accommodates new interactive modes, and allows for fine-tuning of subcomponents of the models.
These features can expand the potential of foundation models like SAM, and the results can be translated to applications such as image-guided therapy, mixed reality interaction, robotic navigation, and data augmentation.
arXiv Detail & Related papers (2024-03-26T21:37:25Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
The paper explores training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts [62.55349777609194]
We aim to build a model that can Segment Anything in radiology scans, driven by Text prompts, termed SAT.
We build the largest and most comprehensive segmentation dataset for training, by collecting over 22K 3D medical image scans.
We have trained SAT-Nano (110M parameters) and SAT-Pro (447M parameters) demonstrating comparable performance to 72 specialist nnU-Nets trained on each dataset/subsets.
arXiv Detail & Related papers (2023-12-28T18:16:00Z)
- MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named MA-SAM.
Our method is rooted in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z)
- Cheap Lunch for Medical Image Segmentation by Fine-tuning SAM on Few Exemplars [19.725817146049707]
The Segment Anything Model (SAM) has demonstrated remarkable capabilities of scaled-up segmentation models.
However, the adoption of foundational models in the medical domain presents a challenge due to the difficulty and expense of labeling sufficient data.
This paper introduces an efficient and practical approach for fine-tuning SAM using a limited number of exemplars.
arXiv Detail & Related papers (2023-08-27T15:21:25Z)
- MedLSAM: Localize and Segment Anything Model for 3D CT Images [14.290321536041816]
We develop a Localize Anything Model for 3D Medical Images (MedLAM).
MedLAM is capable of directly localizing any anatomical structure using just a few template scans.
It has the potential to be seamlessly integrated with future 3D SAM models.
arXiv Detail & Related papers (2023-06-26T15:09:02Z)
- 3DSAM-adapter: Holistic Adaptation of SAM from 2D to 3D for Promptable Medical Image Segmentation [56.50064853710202]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation.
Our model can outperform domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, and colon cancer segmentation, respectively, and achieves similar performance on liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- SAM on Medical Images: A Comprehensive Study on Three Prompt Modes [12.42280534113305]
The Segment Anything Model (SAM) made an eye-catching debut recently and inspired many researchers to explore its potential and limitations in terms of zero-shot generalization capability.
In this paper, we evaluate whether SAM has the potential to become the foundation model for medical image segmentation tasks.
We also explore what kind of prompt can lead to the best zero-shot performance with different modalities.
arXiv Detail & Related papers (2023-04-28T18:18:07Z)
- Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation [5.547422331445511]
We report quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks.
Our study indicates the versatility of generalist vision foundation models on medical imaging.
arXiv Detail & Related papers (2023-04-25T08:07:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.