ProMISe: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models
- URL: http://arxiv.org/abs/2310.19721v3
- Date: Mon, 13 Nov 2023 21:28:24 GMT
- Title: ProMISe: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models
- Authors: Hao Li, Han Liu, Dewei Hu, Jiacheng Wang, Ipek Oguz
- Abstract summary: We propose ProMISe, a prompt-driven 3D medical image segmentation model using only a single point prompt.
We evaluate our model on two public datasets for colon and pancreas tumor segmentations.
- Score: 13.08275555017179
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To address prevalent issues in medical imaging, such as data acquisition
challenges and label availability, transfer learning from natural to medical
image domains serves as a viable strategy to produce reliable segmentation
results. However, several existing barriers between domains need to be broken
down, including addressing contrast discrepancies, managing anatomical
variability, and adapting 2D pretrained models for 3D segmentation tasks. In
this paper, we propose ProMISe, a prompt-driven 3D medical image segmentation
model using only a single point prompt to leverage knowledge from a pretrained
2D image foundation model. In particular, we use the pretrained vision
transformer from the Segment Anything Model (SAM) and integrate lightweight
adapters to extract depth-related (3D) spatial context without updating the
pretrained weights. For robust results, a hybrid network with complementary
encoders is designed, and a boundary-aware loss is proposed to achieve precise
boundaries. We evaluate our model on two public datasets for colon and pancreas
tumor segmentations, respectively. Compared to the state-of-the-art
segmentation methods with and without prompt engineering, our proposed method
achieves superior performance. The code is publicly available at
https://github.com/MedICL-VU/ProMISe.
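To make the abstract's two mechanisms concrete, here is a minimal PyTorch sketch of (1) a lightweight depth adapter attached to a frozen 2D transformer block and (2) a boundary-aware loss that up-weights voxels near the label boundary. All names and hyper-parameters are illustrative assumptions, not the authors' implementation; the repository linked above contains the actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DepthAdapter(nn.Module):
    """Bottleneck adapter that mixes information along the depth axis.

    Hypothetical sketch: inserted after a frozen 2D transformer block so a
    2D backbone (e.g., SAM's ViT) can pick up 3D context; only the adapter
    parameters are trained.
    """

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.depth_conv = nn.Conv3d(
            bottleneck, bottleneck, kernel_size=(3, 1, 1),
            padding=(1, 0, 0), groups=bottleneck)  # depth-wise, depth-only mixing
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)              # adapter starts as identity
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, D, H, W, C) token grid for a volume with D slices
        h = self.down(x).permute(0, 4, 1, 2, 3)     # (B, bottleneck, D, H, W)
        h = F.gelu(self.depth_conv(h))
        h = h.permute(0, 2, 3, 4, 1)                # back to (B, D, H, W, bottleneck)
        return x + self.up(h)                       # residual around the adapter


def boundary_weighted_bce(logits: torch.Tensor, target: torch.Tensor,
                          band: int = 5, w: float = 4.0) -> torch.Tensor:
    """BCE in which voxels near the ground-truth boundary get extra weight.

    The boundary band is dilation(target) - erosion(target) computed with
    max-pooling; `band` and `w` are illustrative hyper-parameters.
    logits, target: (B, 1, D, H, W); target is a float {0, 1} mask.
    """
    dil = F.max_pool3d(target, band, stride=1, padding=band // 2)
    ero = -F.max_pool3d(-target, band, stride=1, padding=band // 2)
    weight = 1.0 + w * (dil - ero)                  # > 1 only inside the band
    return F.binary_cross_entropy_with_logits(logits, target, weight=weight)
```

In this kind of setup the pretrained encoder would be frozen (requires_grad set to False on all backbone parameters), so only the adapters and the decoder receive gradients, which is what keeps the fine-tuning lightweight.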
Related papers
- Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views [10.944692719150071]
We propose a novel 3D brain segmentation approach using complementary 2D diffusion models.
Our goal is to achieve reliable segmentation quality without requiring complete labels for each individual subject.
arXiv Detail & Related papers (2024-07-17T06:14:53Z)
- Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes [2.8498944632323755]
We propose an end-to-end hybrid architecture for medical image segmentation.
We use Hamiltonian Variational Autoencoders (HVAE) and a discriminative regularization to improve the quality of generated images.
Our architecture operates on a slice-by-slice basis to segment 3D volumes, capitalizing on the richly augmented dataset.
arXiv Detail & Related papers (2024-06-17T15:42:08Z)
- Self Pre-training with Topology- and Spatiality-aware Masked Autoencoders for 3D Medical Image Segmentation [16.753957522664713]
Masked Autoencoders (MAEs) have been shown to be effective in pre-training Vision Transformers (ViTs) for natural and medical image analysis problems.
Existing MAE pre-training methods, which were specifically developed with the ViT architecture, lack the ability to capture geometric shape and spatial information.
We propose a novel extension of known MAEs for self pre-training (i.e., models pre-trained on the same target dataset) for 3D medical image segmentation; a generic token-masking sketch appears after this list.
arXiv Detail & Related papers (2024-06-15T06:15:17Z)
- MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named MA-SAM.
Our method is rooted in a parameter-efficient fine-tuning strategy that updates only a small set of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z)
- Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z)
- 3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation [52.699139151447945]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation.
Our model outperforms domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, and colon cancer segmentation respectively, and achieves similar performance for liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z)
- Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation [14.873473285148853]
We introduce a unified framework consisting of two architectures, dubbed UNetFormer, with a 3D Swin Transformer-based encoder and Convolutional Neural Network (CNN)- and transformer-based decoders.
In the proposed model, the encoder is linked to the decoder via skip connections at five different resolutions with deep supervision.
We present a methodology for self-supervised pre-training of the encoder backbone via learning to predict randomly masked tokens.
arXiv Detail & Related papers (2022-04-01T17:38:39Z)
- Unsupervised Domain Adaptation with Contrastive Learning for OCT Segmentation [49.59567529191423]
We propose a novel semi-supervised learning framework for segmentation of volumetric images from new unlabeled domains.
We jointly use supervised and contrastive learning, also introducing a contrastive pairing scheme that leverages the similarity between nearby slices in 3D; a sketch of this pairing idea appears after this list.
arXiv Detail & Related papers (2022-03-07T19:02:26Z)
- Bidirectional RNN-based Few Shot Learning for 3D Medical Image Segmentation [11.873435088539459]
We propose a 3D few-shot segmentation framework for accurate organ segmentation using limited annotated training samples of the target organ.
A U-Net-like network is designed to predict segmentation by learning the relationship between 2D slices of the support data and a query image.
We evaluate our proposed model using three 3D CT datasets with annotations of different organs.
arXiv Detail & Related papers (2020-11-19T01:44:55Z)
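As noted in the masked-autoencoder entry above, the common core of MAE-style self pre-training is random token masking followed by reconstruction of the hidden tokens. The sketch below shows only that generic masking step; it is a standard recipe under assumed shapes, not the pipeline of either paper.

```python
import torch


def random_token_mask(tokens: torch.Tensor, mask_ratio: float = 0.75):
    """Split patch tokens into a visible subset and a boolean mask.

    tokens: (B, N, C) patch embeddings of a 3D volume.
    Returns the visible tokens (encoder input) and a mask in which True
    marks the hidden tokens scored by the reconstruction loss.
    """
    B, N, _ = tokens.shape
    n_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N, device=tokens.device)   # random score per token
    keep_idx = noise.argsort(dim=1)[:, :n_keep]      # lowest scores are kept
    visible = torch.gather(
        tokens, 1, keep_idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1)))
    mask = torch.ones(B, N, dtype=torch.bool, device=tokens.device)
    mask.scatter_(1, keep_idx, False)                # False = visible token
    return visible, mask
```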
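The OCT entry above pairs nearby slices as contrastive positives. A minimal InfoNCE-style sketch of that pairing idea follows; the function name, the distance threshold, and the temperature are hypothetical choices, not the paper's implementation.

```python
import torch
import torch.nn.functional as F


def nearby_slice_info_nce(z: torch.Tensor, slice_idx: torch.Tensor,
                          radius: int = 2, tau: float = 0.1) -> torch.Tensor:
    """InfoNCE over slice embeddings z: (N, C) with depth positions slice_idx: (N,).

    Slices whose indices differ by at most `radius` are treated as positives.
    """
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                                   # (N, N) similarities
    dist = (slice_idx[:, None] - slice_idx[None, :]).abs()
    pos = (dist <= radius) & (dist > 0)                     # nearby, not self
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    log_prob = sim.masked_fill(self_mask, float('-inf')).log_softmax(dim=1)
    # mean log-likelihood of each anchor's positive set (0 if it has none)
    loss = -log_prob.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1)
    return loss.mean()
```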