Curriculum Prompting Foundation Models for Medical Image Segmentation
- URL: http://arxiv.org/abs/2409.00695v1
- Date: Sun, 1 Sep 2024 11:00:18 GMT
- Title: Curriculum Prompting Foundation Models for Medical Image Segmentation
- Authors: Xiuqi Zheng, Yuhang Zhang, Haoran Zhang, Hongrui Liang, Xueqi Bao, Zhuqing Jiang, Qicheng Lao
- Abstract summary: Adapting large pre-trained foundation models, e.g., SAM, for medical image segmentation remains a significant challenge.
Past works have been heavily reliant on a singular type of prompt for each instance, necessitating manual input of an ideally correct prompt.
We propose to utilize prompts of different granularity, which are sourced from original images to provide a broader scope of clinical insights.
Because combining prompts of varying types can cause conflicts, we have designed a coarse-to-fine mechanism, referred to as curriculum prompting, that integrates them progressively.
- Score: 17.33821260899367
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adapting large pre-trained foundation models, e.g., SAM, for medical image segmentation remains a significant challenge. A crucial step involves the formulation of a series of specialized prompts that incorporate specific clinical instructions. Past works have been heavily reliant on a singular type of prompt for each instance, necessitating manual input of an ideally correct prompt, which is less efficient. To tackle this issue, we propose to utilize prompts of different granularity, which are sourced from original images to provide a broader scope of clinical insights. However, combining prompts of varying types can pose a challenge due to potential conflicts. In response, we have designed a coarse-to-fine mechanism, referred to as curriculum prompting, that progressively integrates prompts of different types. Through extensive experiments on three public medical datasets across various modalities, we demonstrate the effectiveness of our proposed approach, which not only automates the prompt generation process but also yields superior performance compared to other SAM-based medical image segmentation methods. Code is available at: https://github.com/AnnaZzz-zxq/Curriculum-Prompting.
Related papers
- Med-PerSAM: One-Shot Visual Prompt Tuning for Personalized Segment Anything Model in Medical Domain [30.700648813505158]
Leveraging pre-trained models with tailored prompts for in-context learning has proven highly effective in NLP tasks.
We introduce Med-PerSAM, a novel and straightforward one-shot framework designed for the medical domain.
Our model outperforms various foundational models and previous SAM-based approaches across diverse 2D medical imaging datasets.
arXiv Detail & Related papers (2024-11-25T06:16:17Z)
- Demystifying Large Language Models for Medicine: A Primer [50.83806796466396]
Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare.
This tutorial aims to equip healthcare professionals with the tools necessary to effectively integrate LLMs into clinical practice.
arXiv Detail & Related papers (2024-10-24T15:41:56Z)
- TP-DRSeg: Improving Diabetic Retinopathy Lesion Segmentation with Explicit Text-Prompts Assisted SAM [13.960042520448646]
We propose a novel framework that customizes SAM for text-prompted Diabetic Retinopathy (DR) lesion segmentation.
Our core idea involves exploiting language cues to inject medical prior knowledge into the vision-only segmentation network.
Specifically, to unleash the potential of vision-language models in the recognition of medical concepts, we propose an explicit prior encoder.
arXiv Detail & Related papers (2024-06-22T07:00:35Z)
- Improving Segment Anything on the Fly: Auxiliary Online Learning and Adaptive Fusion for Medical Image Segmentation [52.172885882728174]
In medical imaging contexts, it is not uncommon for human experts to rectify segmentations of specific test samples after SAM generates its segmentation predictions.
We introduce a novel approach that leverages the advantages of online machine learning to enhance Segment Anything (SA) during test time.
We employ rectified annotations to perform online learning, with the aim of improving the segmentation quality of SA on medical images.
arXiv Detail & Related papers (2024-06-03T03:16:25Z)
- Medical Visual Prompting (MVP): A Unified Framework for Versatile and High-Quality Medical Image Segmentation [15.460598807078751]
We propose a medical visual prompting (MVP) framework that leverages pre-training and prompting concepts from natural language processing (NLP).
The MVP enables the segmentation network to better learn shape prompting information and facilitates mutual learning across different tasks.
This novel framework offers improved performance with fewer parameters and holds significant potential for accurate segmentation of lesion regions in various medical tasks.
arXiv Detail & Related papers (2024-04-01T14:06:48Z)
- EviPrompt: A Training-Free Evidential Prompt Generation Method for Segment Anything Model in Medical Images [14.899388051854084]
Medical image segmentation has immense clinical applicability but remains a challenge despite advancements in deep learning.
This paper introduces a novel training-free evidential prompt generation method named EviPrompt to overcome these issues.
The proposed method, built on the inherent similarities within medical images, requires only a single reference image-annotation pair.
arXiv Detail & Related papers (2023-11-10T21:22:22Z)
- SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
arXiv Detail & Related papers (2023-08-17T02:51:01Z)
- Self-Prompting Large Vision Models for Few-Shot Medical Image Segmentation [14.135249795318591]
We propose a novel perspective on self-prompting in medical vision applications.
We harness the embedding space of the Segment Anything Model to prompt itself through a simple yet effective linear pixel-wise classifier.
We achieve competitive results on multiple datasets.
arXiv Detail & Related papers (2023-08-15T08:20:07Z)
- Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining [121.89793208683625]
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks.
We propose a new paradigm called Medical-knOwledge-enhanced mulTimOdal pretRaining (MOTOR).
arXiv Detail & Related papers (2023-04-26T01:26:19Z)
- Ambiguous Medical Image Segmentation using Diffusion Models [60.378180265885945]
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights.
Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion.
Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
arXiv Detail & Related papers (2023-04-10T17:58:22Z)
- Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts [63.84720380390935]
There exist two typical types, i.e., the fusion-encoder type and the dual-encoder type, depending on whether a heavy fusion module is used.
We propose an effective yet straightforward scheme named PTUnifier to unify the two types.
We first unify the input format by introducing visual and textual prompts, which serve as a feature bank that stores the most representative images/texts.
arXiv Detail & Related papers (2023-02-17T15:43:42Z)