Multi-Prompt Fine-Tuning of Foundation Models for Enhanced Medical Image
Segmentation
- URL: http://arxiv.org/abs/2310.02381v1
- Date: Tue, 3 Oct 2023 19:05:00 GMT
- Title: Multi-Prompt Fine-Tuning of Foundation Models for Enhanced Medical Image
Segmentation
- Authors: Xiangru Li, Yifei Zhang, Liang Zhao
- Abstract summary: The Segment Anything Model (SAM) is a powerful foundation model that introduced revolutionary advancements in natural image segmentation.
In this study, we introduce a novel fine-tuning framework that leverages SAM's ability to bundle and process multiple prompts per image.
- Score: 10.946806607643689
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The Segment Anything Model (SAM) is a powerful foundation model that
introduced revolutionary advancements in natural image segmentation. However,
its performance remains sub-optimal when delineating the intricate structure of
biomedical images, where multiple organs and tissues intertwine in a single
image. In this study, we introduce a novel fine-tuning framework that leverages
SAM's ability to bundle and process multiple prompts per image and seeks to
improve SAM's performance in medical images. We first curated a medical image
dataset that consists of CT scans of lesions in various organs, each with two
annotations for organs and lesions respectively. Then, we fine-tuned SAM's mask
decoder within our framework by batching both bounding boxes generated from
ground truth masks as reference. The batched prompt strategy we introduced not
only addresses the inherent complexity and ambiguity often found in medical
images but also substantially enhances performance metrics when applied to a
wide range of segmentation tasks.
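The prompt-generation step described in the abstract (deriving bounding boxes from ground-truth masks and bundling the organ and lesion boxes for one image) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `mask_to_box` and `batch_box_prompts`, the nested-list mask representation, and the toy masks are all assumptions made for the example. The boxes use an inclusive (x_min, y_min, x_max, y_max) layout, chosen here because SAM's box prompts are commonly given in XYXY order.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]           # binary mask as rows of 0/1 (hypothetical format)
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive pixels


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tight bounding box of the nonzero pixels, or None if the mask is empty."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return (min(xs), min(ys), max(xs), max(ys))


def batch_box_prompts(organ_mask: Mask, lesion_mask: Mask) -> List[Box]:
    """Bundle the organ box and the lesion box into one prompt batch for a single image."""
    boxes = [mask_to_box(organ_mask), mask_to_box(lesion_mask)]
    # Skip annotations that are empty so the batch only holds valid boxes.
    return [box for box in boxes if box is not None]


# Toy 5x6 masks: an organ region with a small lesion inside it.
organ = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
lesion = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

prompts = batch_box_prompts(organ, lesion)
print(prompts)  # [(1, 1, 4, 3), (2, 2, 3, 2)]
```

In a real fine-tuning loop, a batch like `prompts` would be converted to a tensor and passed to the prompt encoder together with the image embedding, with each box paired to its own ground-truth mask for the loss. Those steps depend on the training code and are not shown here.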
Related papers
- DB-SAM: Delving into High Quality Universal Medical Image Segmentation [100.63434169944853]
We propose a dual-branch adapted SAM framework, named DB-SAM, to bridge the gap between natural and 2D/3D medical data.
Our proposed DB-SAM achieves an absolute gain of 8.8%, compared to a recent medical SAM adapter in the literature.
arXiv Detail & Related papers (2024-10-05T14:36:43Z)
- MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation [2.2585213273821716]
We introduce MedCLIP-SAMv2, a novel framework that integrates the CLIP and SAM models to perform segmentation on clinical scans.
Our approach includes fine-tuning the BiomedCLIP model with a new Decoupled Hard Negative Noise Contrastive Estimation (DHN-NCE) loss.
We also investigate using zero-shot segmentation labels within a weakly supervised paradigm to enhance segmentation quality further.
arXiv Detail & Related papers (2024-09-28T23:10:37Z)
- CC-SAM: SAM with Cross-feature Attention and Context for Ultrasound Image Segmentation [20.448864959103858]
The Segment Anything Model (SAM) has achieved remarkable successes in the realm of natural image segmentation.
SAM struggles with medical images that feature low contrast, faint boundaries, intricate morphologies, and small-sized objects.
We introduce a comprehensive modification to enhance SAM's performance in the medical domain.
arXiv Detail & Related papers (2024-07-31T22:24:05Z)
- MedCLIP-SAM: Bridging Text and Image Towards Universal Medical Image Segmentation [2.2585213273821716]
We propose a novel framework, called MedCLIP-SAM, that combines CLIP and SAM models to generate segmentation of clinical scans.
By extensively testing three diverse segmentation tasks and medical image modalities, our proposed framework has demonstrated excellent accuracy.
arXiv Detail & Related papers (2024-03-29T15:59:11Z)
- MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework named MA-SAM.
Our method is rooted in a parameter-efficient fine-tuning strategy that updates only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z)
- SAM-Med2D [34.82072231983896]
We introduce SAM-Med2D, the most comprehensive study on applying SAM to medical 2D images.
We first collect and curate approximately 4.6M images and 19.7M masks from public and private datasets.
We fine-tune the encoder and decoder of the original SAM to obtain a well-performing SAM-Med2D.
arXiv Detail & Related papers (2023-08-30T17:59:02Z)
- Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey [8.76496233192512]
We discuss efforts to extend the success of the Segment Anything Model to medical image segmentation tasks.
Many insights are drawn to guide future research to develop foundation models for medical image analysis.
arXiv Detail & Related papers (2023-05-05T16:48:45Z)
- Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation [51.770805270588625]
The Segment Anything Model (SAM) has recently gained popularity in the field of image segmentation.
Recent studies and individual experiments have shown that SAM underperforms in medical image segmentation.
We propose the Medical SAM Adapter (Med-SA), which incorporates domain-specific medical knowledge into the segmentation model.
arXiv Detail & Related papers (2023-04-25T07:34:22Z)
- Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z)
- Multi-modal Aggregation Network for Fast MR Imaging [85.25000133194762]
We propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality.
In our MANet, the representations from the fully sampled auxiliary and undersampled target modalities are learned independently through a specific network.
Our MANet follows a hybrid domain learning framework, which allows it to simultaneously recover the frequency signal in the $k$-space domain.
arXiv Detail & Related papers (2021-10-15T13:16:59Z)
- Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance the discriminability of the deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.