ProGiDiff: Prompt-Guided Diffusion-Based Medical Image Segmentation
- URL: http://arxiv.org/abs/2601.16060v1
- Date: Thu, 22 Jan 2026 15:56:21 GMT
- Title: ProGiDiff: Prompt-Guided Diffusion-Based Medical Image Segmentation
- Authors: Yuan Lin, Murong Xu, Marc Hölle, Chinmay Prabhakar, Andreas Maier, Vasileios Belagiannis, Bjoern Menze, Suprosanna Shit,
- Abstract summary: We propose a novel framework called ProGiDiff that leverages existing image generation models for medical image segmentation. Specifically, we propose a ControlNet-style conditioning mechanism with a custom encoder, suitable for image conditioning, to steer a pre-trained diffusion model to output segmentation masks. Experiments on organ segmentation from CT images demonstrate strong performance compared to previous methods, and the approach could greatly benefit from an expert-in-the-loop setting.
- Score: 12.964514627034122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Widely adopted medical image segmentation methods, although efficient, are primarily deterministic and remain poorly amenable to natural language prompts. Thus, they lack the capability to produce multiple proposals, support human interaction, and adapt across modalities. Recently, text-to-image diffusion models have shown potential to bridge this gap. However, training them from scratch requires a large dataset, a limitation for medical image segmentation. Furthermore, they are often limited to binary segmentation and cannot be conditioned on a natural language prompt. To this end, we propose a novel framework called ProGiDiff that leverages existing image generation models for medical image segmentation purposes. Specifically, we propose a ControlNet-style conditioning mechanism with a custom encoder, suitable for image conditioning, to steer a pre-trained diffusion model to output segmentation masks. It naturally extends to a multi-class setting simply by prompting the target organ. Our experiments on organ segmentation from CT images demonstrate strong performance compared to previous methods, and the approach could greatly benefit from an expert-in-the-loop setting to leverage multiple proposals. Importantly, we demonstrate that the learned conditioning mechanism can be easily transferred through low-rank, few-shot adaptation to segment MR images.
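The key property of a ControlNet-style conditioning mechanism is that the pre-trained model is kept frozen while a trainable control branch injects conditioning features through a zero-initialized projection, so that at initialization the control branch is a no-op and the pre-trained behavior is preserved. The following is a minimal NumPy sketch of that idea only; the linear maps, dimensions, and one-hot organ prompt are hypothetical stand-ins, not the actual ProGiDiff architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pre-trained "denoiser": a fixed linear map standing in for the
# diffusion backbone (hypothetical stand-in for the pre-trained model).
W_frozen = rng.standard_normal((8, 8))

def frozen_denoiser(z, control=None):
    h = W_frozen @ z
    if control is not None:
        h = h + control  # residual injection of control features
    return h

# Trainable control encoder: consumes features of the conditioning image
# together with a one-hot organ prompt. Its output passes through a
# ZERO-initialized projection (the "zero conv" analogue), so the control
# branch contributes nothing until training updates it.
W_enc = rng.standard_normal((8, 8 + 4))   # image features + 4-organ prompt
W_zero = np.zeros((8, 8))                 # zero-initialized projection

def control_branch(image_feat, organ_prompt):
    h = np.tanh(W_enc @ np.concatenate([image_feat, organ_prompt]))
    return W_zero @ h

z = rng.standard_normal(8)           # noisy latent of the segmentation mask
image_feat = rng.standard_normal(8)  # features of the CT slice to segment
prompt = np.eye(4)[2]                # one-hot prompt: "segment organ #2"

out_plain = frozen_denoiser(z)
out_ctrl = frozen_denoiser(z, control_branch(image_feat, prompt))
# At initialization the zero projection guarantees identical outputs:
assert np.allclose(out_plain, out_ctrl)
```

Prompting a different organ here just means swapping the one-hot vector, which mirrors how a text prompt naming the target organ would extend the mechanism to a multi-class setting.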
Related papers
- Diffusion Model in Latent Space for Medical Image Segmentation Task [0.0]
MedSegLatDiff is a diffusion-based framework that combines a variational autoencoder (VAE) with a latent diffusion model for efficient medical image segmentation. It achieves state-of-the-art or highly competitive Dice and IoU scores while simultaneously generating diverse segmentation hypotheses and confidence maps.
arXiv Detail & Related papers (2025-12-01T05:26:43Z)
- UniSegDiff: Boosting Unified Lesion Segmentation via a Staged Diffusion Model [53.34835793648352]
We propose UniSegDiff, a novel diffusion model framework for lesion segmentation. UniSegDiff addresses lesion segmentation in a unified manner across multiple modalities and organs. Comprehensive experimental results demonstrate that UniSegDiff significantly outperforms previous state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2025-07-24T12:33:10Z)
- MAMBO-NET: Multi-Causal Aware Modeling Backdoor-Intervention Optimization for Medical Image Segmentation Network [51.68708264694361]
Confusion factors, such as complex anatomical variations and imaging modality limitations, can affect medical images. We propose a multi-causal aware modeling backdoor-intervention optimization network for medical image segmentation. Our method significantly reduces the influence of confusion factors, leading to enhanced segmentation accuracy.
arXiv Detail & Related papers (2025-05-28T01:40:10Z)
- AutoMiSeg: Automatic Medical Image Segmentation via Test-Time Adaptation of Foundation Models [11.00876772668728]
This paper introduces a zero-shot and automatic segmentation pipeline that combines vision-language and segmentation foundation models. Our pipeline is evaluated on seven diverse medical imaging datasets and shows promising results.
arXiv Detail & Related papers (2025-05-23T14:07:21Z)
- Prompting Segment Anything Model with Domain-Adaptive Prototype for Generalizable Medical Image Segmentation [49.5901368256326]
We propose a novel Domain-Adaptive Prompt framework (termed DAPSAM) for fine-tuning the Segment Anything Model for medical image segmentation.
Our DAPSAM achieves state-of-the-art performance on two medical image segmentation tasks with different modalities.
arXiv Detail & Related papers (2024-09-19T07:28:33Z)
- Anatomically-Controllable Medical Image Generation with Segmentation-Guided Diffusion Models [11.835841459200632]
We propose a diffusion model-based method that supports anatomically-controllable medical image generation.
We additionally introduce a random mask ablation training algorithm to enable conditioning on a selected combination of anatomical constraints.
SegGuidedDiff reaches a new state-of-the-art in the faithfulness of generated images to input anatomical masks.
arXiv Detail & Related papers (2024-02-07T19:35:09Z)
- I-MedSAM: Implicit Medical Image Segmentation with Segment Anything [24.04558900909617]
We propose I-MedSAM, which leverages the benefits of both continuous representations and SAM to obtain better cross-domain ability and accurate boundary delineation.
Our proposed method with only 1.6M trainable parameters outperforms existing methods including discrete and implicit methods.
arXiv Detail & Related papers (2023-11-28T00:43:52Z)
- Self-Prompting Large Vision Models for Few-Shot Medical Image Segmentation [14.135249795318591]
We propose a novel perspective on self-prompting in medical vision applications.
We harness the embedding space of the Segment Anything Model to prompt itself through a simple yet effective linear pixel-wise classifier.
We achieve competitive results on multiple datasets.
arXiv Detail & Related papers (2023-08-15T08:20:07Z)
- Disruptive Autoencoders: Leveraging Low-level Features for 3D Medical Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z)
- Ambiguous Medical Image Segmentation using Diffusion Models [60.378180265885945]
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights.
Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion.
Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
arXiv Detail & Related papers (2023-04-10T17:58:22Z)
- Using Soft Labels to Model Uncertainty in Medical Image Segmentation [0.0]
We propose a simple method to obtain soft labels from the annotations of multiple physicians.
For each image, our method produces a single well-calibrated output that can be thresholded at multiple confidence levels.
We evaluated our method on the MICCAI 2021 QUBIQ challenge, showing that it performs well across multiple medical image segmentation tasks.
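The soft-label idea above reduces to averaging the binary masks of multiple annotators into a per-pixel agreement map and thresholding it at different confidence levels. The following is a minimal NumPy sketch with a toy 2x3 image and three hypothetical annotators; the thresholds are illustrative, not the paper's settings.

```python
import numpy as np

# Toy example: three physicians' binary masks for a 2x3 image
# (hypothetical data, for illustration only).
annotations = np.array([
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 1, 0], [0, 0, 0]],
], dtype=float)

# Soft label: per-pixel fraction of annotators marking the pixel foreground.
soft = annotations.mean(axis=0)

# A single calibrated map can then be thresholded at multiple
# confidence levels to obtain harder or softer segmentations.
conservative = soft >= 2 / 3   # keep pixels most annotators agree on
liberal      = soft >= 1 / 3   # keep pixels any annotator marked

assert conservative.sum() <= liberal.sum()
```

Thresholding one well-calibrated output this way is what lets a single prediction serve use cases with different tolerance for false positives.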
arXiv Detail & Related papers (2021-09-26T14:47:18Z)
- Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.