3DSAM-adapter: Holistic Adaptation of SAM from 2D to 3D for Promptable
Medical Image Segmentation
- URL: http://arxiv.org/abs/2306.13465v1
- Date: Fri, 23 Jun 2023 12:09:52 GMT
- Title: 3DSAM-adapter: Holistic Adaptation of SAM from 2D to 3D for Promptable
Medical Image Segmentation
- Authors: Shizhan Gong, Yuan Zhong, Wenao Ma, Jinpeng Li, Zhao Wang, Jingyang
Zhang, Pheng-Ann Heng, Qi Dou
- Abstract summary: We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation.
Our model outperforms state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, and colon cancer segmentation respectively, and achieves similar performance for liver tumor segmentation.
- Score: 56.50064853710202
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although the segment anything model (SAM) has achieved impressive
results on general-purpose semantic segmentation, with strong generalization to
everyday images, its performance on medical image segmentation is less precise
and less stable, especially on tumor segmentation tasks involving objects of
small size, irregular shape, and low contrast.
Notably, the original SAM architecture is designed for 2D natural images and
therefore cannot effectively extract 3D spatial information from volumetric
medical data. In this paper, we propose a novel
adaptation method for transferring SAM from 2D to 3D for promptable medical
image segmentation. Through a holistically designed scheme for architecture
modification, we transfer the SAM to support volumetric inputs while retaining
the majority of its pre-trained parameters for reuse. The fine-tuning process
is conducted in a parameter-efficient manner, wherein most of the pre-trained
parameters remain frozen, and only a few lightweight spatial adapters are
introduced and tuned. Despite the domain gap between natural and medical data
and the disparity between 2D and 3D spatial arrangements, a transformer trained
on natural images can effectively capture the spatial patterns of volumetric
medical images with only lightweight adaptations. We conduct experiments on
four open-source tumor segmentation datasets. With a single click prompt, our
model outperforms state-of-the-art medical image segmentation models on 3 out
of 4 tasks, by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, and
colon cancer segmentation respectively, and achieves comparable performance on
liver tumor segmentation. We also compare our adaptation method with existing
popular adapters and observe significant performance improvements on most
datasets.
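To make the parameter-efficient scheme described above concrete, here is a minimal PyTorch sketch of a frozen pre-trained 2D transformer block wrapped with a small trainable spatial adapter that mixes information along the third dimension. The module names, bottleneck design, and depthwise 3D convolution are illustrative assumptions, not the paper's actual code.

```python
import torch
import torch.nn as nn

class SpatialAdapter3D(nn.Module):
    """Lightweight bottleneck adapter: linear down-projection, a depthwise 3D
    convolution that mixes information across slices, and an up-projection,
    applied residually (an assumed design, not the paper's exact module)."""
    def __init__(self, dim: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.depth_conv = nn.Conv3d(bottleneck, bottleneck, kernel_size=3,
                                    padding=1, groups=bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor, grid: tuple) -> torch.Tensor:
        # x: (B, D*H*W, C) token sequence; grid: the (D, H, W) volume shape
        b, n, c = x.shape
        d, h, w = grid
        y = self.act(self.down(x))                      # (B, N, bottleneck)
        y = y.transpose(1, 2).reshape(b, -1, d, h, w)   # back to volume layout
        y = self.act(self.depth_conv(y))                # mix across the 3rd dim
        y = y.reshape(b, -1, n).transpose(1, 2)         # back to tokens
        return x + self.up(y)                           # residual update

class AdaptedBlock(nn.Module):
    """A pre-trained 2D transformer block kept frozen, followed by the
    trainable spatial adapter."""
    def __init__(self, pretrained_block: nn.Module, dim: int):
        super().__init__()
        self.block = pretrained_block
        for p in self.block.parameters():
            p.requires_grad = False    # reuse SAM weights without updating them
        self.adapter = SpatialAdapter3D(dim)

    def forward(self, x: torch.Tensor, grid: tuple) -> torch.Tensor:
        return self.adapter(self.block(x), grid)
```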
Related papers
- Novel adaptation of video segmentation to 3D MRI: efficient zero-shot knee segmentation with SAM2 [1.6237741047782823]
We introduce a method for zero-shot, single-prompt segmentation of 3D knee MRI by adapting Segment Anything Model 2.
By treating slices from 3D medical volumes as individual video frames, we leverage SAM2's advanced capabilities to generate motion- and spatially-aware predictions.
We demonstrate that SAM2 can efficiently perform segmentation tasks in a zero-shot manner with no additional training or fine-tuning.
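A minimal sketch of this slices-as-frames idea, assuming the public segment-anything-2 (sam2) package: export axial slices as image frames, prompt a single frame, and let the video predictor propagate the mask through the "video". The sam2 function names and signatures below follow the public repository as I understand it; verify them against your installed version.

```python
import os
import numpy as np
from PIL import Image

def volume_to_frames(volume: np.ndarray, frames_dir: str) -> None:
    """Write each slice of a (D, H, W) volume as an 8-bit RGB JPEG frame."""
    os.makedirs(frames_dir, exist_ok=True)
    lo, hi = float(volume.min()), float(volume.max())
    for i, sl in enumerate(volume):
        img = ((sl - lo) / (hi - lo + 1e-8) * 255).astype(np.uint8)
        Image.fromarray(img).convert("RGB").save(
            os.path.join(frames_dir, f"{i:05d}.jpg"))

# Assumed sam2 API below -- verify names/signatures against your installation.
from sam2.build_sam import build_sam2_video_predictor

predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "sam2_checkpoint.pt")
state = predictor.init_state(video_path="frames/")       # slices act as frames
predictor.add_new_points_or_box(                         # single click prompt
    inference_state=state, frame_idx=40, obj_id=1,
    points=np.array([[128, 96]], dtype=np.float32),      # (x, y) in pixels
    labels=np.array([1], dtype=np.int32))                # 1 = foreground click
masks = {}
for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
    masks[frame_idx] = (mask_logits[0] > 0).cpu().numpy()  # per-slice mask
# Stacking masks[0..D-1] reassembles the 3D segmentation volume.
```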
arXiv Detail & Related papers (2024-08-08T21:39:15Z)
- Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans [23.573958232965104]
Segment anything model (SAM) demonstrates strong generalization ability on natural image segmentation.
For segmenting 3D radiological CT or MRI scans, a 2D SAM model has to separately handle hundreds of 2D slices.
We propose a comprehensive and scalable 3D SAM model for whole-body CT segmentation, named CT-SAM3D.
arXiv Detail & Related papers (2024-03-22T09:40:52Z)
- Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation [48.107348956719775]
We introduce Mask-Enhanced SAM (M-SAM), an innovative architecture tailored for 3D tumor lesion segmentation.
We propose a novel Mask-Enhanced Adapter (MEA) within M-SAM that enriches the semantic information of medical images with positional data from coarse segmentation masks.
Our M-SAM achieves high segmentation accuracy and also exhibits robust generalization.
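As a rough illustration of the mask-enhancement idea (not M-SAM's actual Mask-Enhanced Adapter), one way to inject positional cues from a coarse mask into image features is a residual fusion like the following PyTorch sketch; every layer choice here is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskEnhancedFusion(nn.Module):
    """Fuse a coarse segmentation mask into 3D image features (illustrative)."""
    def __init__(self, dim: int):
        super().__init__()
        self.mask_embed = nn.Conv3d(1, dim, kernel_size=1)         # lift mask
        self.fuse = nn.Conv3d(dim, dim, kernel_size=3, padding=1)

    def forward(self, feats: torch.Tensor, coarse: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, D, H, W); coarse: (B, 1, d, h, w) at any resolution
        m = F.interpolate(coarse, size=feats.shape[2:],
                          mode="trilinear", align_corners=False)
        return feats + self.fuse(feats + self.mask_embed(m))       # residual
```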
arXiv Detail & Related papers (2024-03-09T13:37:02Z)
- SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z)
- ProMISe: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models [13.08275555017179]
We propose ProMISe, a prompt-driven 3D medical image segmentation model using only a single point prompt.
We evaluate our model on two public datasets for colon and pancreas tumor segmentation.
arXiv Detail & Related papers (2023-10-30T16:49:03Z)
- MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework named MA-SAM.
Our method is rooted in a parameter-efficient fine-tuning strategy that updates only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract 3D information from input data.
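A minimal sketch of what such a parameter-efficient setup typically looks like in PyTorch: freeze the pre-trained backbone and leave only the injected adapter parameters trainable. The "adapter" naming convention and the encoder argument are placeholder assumptions, not MA-SAM's code.

```python
import torch
import torch.nn as nn

def prepare_for_peft(image_encoder: nn.Module) -> list:
    """Freeze the pre-trained backbone; train only adapter parameters."""
    trainable = []
    for name, param in image_encoder.named_parameters():
        if "adapter" in name:            # the small injected weight increments
            param.requires_grad = True
            trainable.append(param)
        else:
            param.requires_grad = False  # pre-trained SAM weights stay frozen
    return trainable

# e.g. optimizer = torch.optim.AdamW(prepare_for_peft(encoder), lr=1e-4)
```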
arXiv Detail & Related papers (2023-09-16T02:41:53Z)
- SAM3D: Segment Anything Model in Volumetric Medical Images [11.764867415789901]
We introduce SAM3D, an innovative adaptation tailored for 3D volumetric medical image analysis.
Unlike current SAM-based methods that segment volumetric data by converting the volume into separate 2D slices for individual analysis, our SAM3D model processes the entire 3D volume in a unified approach.
arXiv Detail & Related papers (2023-09-07T06:05:28Z)
- View-Disentangled Transformer for Brain Lesion Detection [50.4918615815066]
We propose a novel view-disentangled transformer to enhance the extraction of MRI features for more accurate tumour detection.
First, the proposed transformer harvests long-range correlation among different positions in a 3D brain scan.
Second, the transformer models a stack of slice features as multiple 2D views and enhances these features view-by-view.
Third, we deploy the proposed transformer module in a transformer backbone, which can effectively detect the 2D regions surrounding brain lesions.
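A hedged PyTorch sketch of the view-by-view enhancement idea: fold each anatomical axis of a 3D feature volume into the batch dimension and apply a shared 2D block per view. This only illustrates the summary above; the layer choices are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ViewWiseEnhancer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.block2d = nn.Sequential(               # shared per-slice enhancer
            nn.Conv2d(dim, dim, 3, padding=1), nn.GELU(),
            nn.Conv2d(dim, dim, 3, padding=1))

    def _enhance(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, S, H, W) with S the current slicing axis
        b, c, s, h, w = x.shape
        y = x.permute(0, 2, 1, 3, 4).reshape(b * s, c, h, w)
        y = self.block2d(y).reshape(b, s, c, h, w).permute(0, 2, 1, 3, 4)
        return x + y                                # residual per-view update

    def forward(self, vol: torch.Tensor) -> torch.Tensor:
        # vol: (B, C, D, H, W); treat each axis in turn as the slicing axis
        vol = self._enhance(vol)                                             # axial
        vol = self._enhance(vol.permute(0, 1, 3, 2, 4)).permute(0, 1, 3, 2, 4)  # coronal
        vol = self._enhance(vol.permute(0, 1, 4, 3, 2)).permute(0, 1, 4, 3, 2)  # sagittal
        return vol
```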
arXiv Detail & Related papers (2022-09-20T11:58:23Z)
- Automatic size and pose homogenization with spatial transformer network to improve and accelerate pediatric segmentation [51.916106055115755]
We propose a new CNN architecture that is pose- and scale-invariant thanks to the use of a Spatial Transformer Network (STN).
Our architecture is composed of three sequential modules that are estimated together during training.
We test the proposed method on kidney and renal tumor segmentation in abdominal pediatric CT scans.
arXiv Detail & Related papers (2021-07-06T14:50:03Z)
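Since the STN is a standard, well-documented module, here is a minimal self-contained PyTorch sketch of the mechanism behind the pose-and-scale homogenization above: a small localization network regresses an affine transform that re-samples the input before segmentation. Layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STN2D(nn.Module):
    def __init__(self, in_ch: int = 1):
        super().__init__()
        self.loc = nn.Sequential(                  # localization network
            nn.Conv2d(in_ch, 8, 7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, 5), nn.MaxPool2d(2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(10 * 4 * 4, 32), nn.ReLU(),
            nn.Linear(32, 6))                      # 6 params of a 2x3 affine
        # initialize the final layer to predict the identity transform
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        theta = self.loc(x).view(-1, 2, 3)         # predicted affine matrix
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)  # re-sampled input
```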