Related papers: SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation

SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation

URL: http://arxiv.org/abs/2407.04938v1
Date: Sat, 6 Jul 2024 03:03:45 GMT
Title: SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation
Authors: Guoan Wang, Jin Ye, Junlong Cheng, Tianbin Li, Zhaolin Chen, Jianfei Cai, Junjun He, Bohan Zhuang,
Abstract summary: Supervised Finetuning (SFT) serves as an effective way to adapt foundation models for task-specific downstream tasks. We propose SAM-Med3D-MoE, a novel framework that seamlessly integrates task-specific finetuned models with the foundational model. Our experiments demonstrate the efficacy of SAM-Med3D-MoE, with an average Dice performance increase from 53 to 56.4 on 15 specific classes.
Score: 36.95030121663565
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Volumetric medical image segmentation is pivotal in enhancing disease diagnosis, treatment planning, and advancing medical research. While existing volumetric foundation models for medical image segmentation, such as SAM-Med3D and SegVol, have shown remarkable performance on general organs and tumors, their ability to segment certain categories in clinical downstream tasks remains limited. Supervised Finetuning (SFT) serves as an effective way to adapt such foundation models for task-specific downstream tasks but at the cost of degrading the general knowledge previously stored in the original foundation model.To address this, we propose SAM-Med3D-MoE, a novel framework that seamlessly integrates task-specific finetuned models with the foundational model, creating a unified model at minimal additional training expense for an extra gating network. This gating network, in conjunction with a selection strategy, allows the unified model to achieve comparable performance of the original models in their respective tasks both general and specialized without updating any parameters of them.Our comprehensive experiments demonstrate the efficacy of SAM-Med3D-MoE, with an average Dice performance increase from 53 to 56.4 on 15 specific classes. It especially gets remarkable gains of 29.6, 8.5, 11.2 on the spinal cord, esophagus, and right hip, respectively. Additionally, it achieves 48.9 Dice on the challenging SPPIN2023 Challenge, significantly surpassing the general expert's performance of 32.3. We anticipate that SAM-Med3D-MoE can serve as a new framework for adapting the foundation model to specific areas in medical image analysis. Codes and datasets will be publicly available.

Related papers

MedGemma Technical Report [75.88152277443179]
We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B.<n>MedGemma demonstrates advanced medical understanding and reasoning on images and text.<n>We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP.
arXiv Detail & Related papers (2025-07-07T17:01:44Z)
TAGS: 3D Tumor-Adaptive Guidance for SAM [4.073510647434655]
We propose an adaptation framework called TAGS: Tumor Adaptive Guidance for SAM.<n>It unlocks 2D FMs for 3D medical tasks through multi-prompt fusion.<n>Our model surpasses the state-of-the-art medical image segmentation models.
arXiv Detail & Related papers (2025-05-21T04:02:17Z)
Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation [8.78725593323412]
Few-shot Adaptation of Training-frEe SAM (FATE-SAM) is a novel method designed to adapt the advanced Segment Anything Model 2 (SAM2) for 3D medical image segmentation. FATE-SAM reassembles pre-trained modules of SAM2 to enable few-shot adaptation, leveraging a small number of support examples. We evaluate FATE-SAM on multiple medical imaging datasets and compare it with supervised learning methods, zero-shot SAM approaches, and fine-tuned medical SAM methods.
arXiv Detail & Related papers (2025-01-15T20:44:21Z)
Repurposing Foundation Model for Generalizable Medical Time Series Classification [16.21546283978257]
FORMED is a foundation classification model that leverages a pre-trained backbone. It can adapt seamlessly to unseen MedTS datasets, regardless of the number of channels, sample lengths, or medical tasks. Our results highlight FORMED as a versatile and scalable model for a wide range of MedTS classification tasks, positioning it as a strong foundation model for future research in MedTS analysis.
arXiv Detail & Related papers (2024-10-03T23:50:04Z)
Towards Evaluating and Building Versatile Large Language Models for Medicine [57.49547766838095]
We present MedS-Bench, a benchmark designed to evaluate the performance of large language models (LLMs) in clinical contexts. MedS-Bench spans 11 high-level clinical tasks, including clinical report summarization, treatment recommendations, diagnosis, named entity recognition, and medical concept explanation. MedS-Ins comprises 58 medically oriented language corpora, totaling 13.5 million samples across 122 tasks.
arXiv Detail & Related papers (2024-08-22T17:01:34Z)
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology. For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
SAM-Med3D: Towards General-purpose Segmentation Models for Volumetric Medical Images [35.83393121891959]
We introduce SAM-Med3D for general-purpose segmentation on volumetric medical images. SAM-Med3D can accurately segment diverse anatomical structures and lesions across various modalities. Our approach demonstrates that substantial medical resources can be utilized to develop a general-purpose medical AI.
arXiv Detail & Related papers (2023-10-23T17:57:36Z)
MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named as MA-SAM. Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments. By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z)
Cheap Lunch for Medical Image Segmentation by Fine-tuning SAM on Few Exemplars [19.725817146049707]
The Segment Anything Model (SAM) has demonstrated remarkable capabilities of scaled-up segmentation models. However, the adoption of foundational models in the medical domain presents a challenge due to the difficulty and expense of labeling sufficient data. This paper introduces an efficient and practical approach for fine-tuning SAM using a limited number of exemplars.
arXiv Detail & Related papers (2023-08-27T15:21:25Z)
3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation [52.699139151447945]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation. Our model can outperform domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, colon cancer segmentation, and achieve similar performance for liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z)
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets. We have collected approximately 1.3 million medical images from 55 publicly available datasets. LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
SAM on Medical Images: A Comprehensive Study on Three Prompt Modes [12.42280534113305]
The Segment Anything Model (SAM) made an eye-catching debut recently and inspired many researchers to explore its potential and limitation in terms of zero-shot generalization capability. In this paper, we evaluate whether SAM has the potential to become the foundation model for medical image segmentation tasks. We also explore what kind of prompt can lead to the best zero-shot performance with different modalities.
arXiv Detail & Related papers (2023-04-28T18:18:07Z)
Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation [5.547422331445511]
We report quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks. Our study indicates the versatility of generalist vision foundation models on medical imaging.
arXiv Detail & Related papers (2023-04-25T08:07:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.