Related papers: Robust Polyp Detection and Diagnosis through Compositional Prompt-Guided Diffusion Models

Robust Polyp Detection and Diagnosis through Compositional Prompt-Guided Diffusion Models

URL: http://arxiv.org/abs/2502.17951v1
Date: Tue, 25 Feb 2025 08:22:45 GMT
Title: Robust Polyp Detection and Diagnosis through Compositional Prompt-Guided Diffusion Models
Authors: Jia Yu, Yan Zhu, Peiyao Fu, Tianyi Chen, Junbo Huang, Quanlin Li, Pinghong Zhou, Zhihua Wang, Fei Wu, Shuo Wang, Xian Yang,
Abstract summary: We propose a Progressive Spectrum Diffusion Model (PSDM) for generating synthetic polyp images.<n>PSDM integrates diverse clinical annotations-such as segmentation masks, bounding boxes, and colonoscopy reports-by transforming them into compositional prompts.<n>By augmenting training data with PSDM-generated samples, our model significantly improves polyp detection, classification, and segmentation.
Score: 32.17651741681871
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Colorectal cancer (CRC) is a significant global health concern, and early detection through screening plays a critical role in reducing mortality. While deep learning models have shown promise in improving polyp detection, classification, and segmentation, their generalization across diverse clinical environments, particularly with out-of-distribution (OOD) data, remains a challenge. Multi-center datasets like PolypGen have been developed to address these issues, but their collection is costly and time-consuming. Traditional data augmentation techniques provide limited variability, failing to capture the complexity of medical images. Diffusion models have emerged as a promising solution for generating synthetic polyp images, but the image generation process in current models mainly relies on segmentation masks as the condition, limiting their ability to capture the full clinical context. To overcome these limitations, we propose a Progressive Spectrum Diffusion Model (PSDM) that integrates diverse clinical annotations-such as segmentation masks, bounding boxes, and colonoscopy reports-by transforming them into compositional prompts. These prompts are organized into coarse and fine components, allowing the model to capture both broad spatial structures and fine details, generating clinically accurate synthetic images. By augmenting training data with PSDM-generated samples, our model significantly improves polyp detection, classification, and segmentation. For instance, on the PolypGen dataset, PSDM increases the F1 score by 2.12% and the mean average precision by 3.09%, demonstrating superior performance in OOD scenarios and enhanced generalization.

Related papers

Diverse Image Generation with Diffusion Models and Cross Class Label Learning for Polyp Classification [4.747649393635696]
We develop a novel model, PathoPolyp-Diff, that generates text-controlled synthetic images with diverse characteristics.<n>We introduce cross-class label learning to make the model learn features from other classes, reducing the burdensome task of data annotation.
arXiv Detail & Related papers (2025-02-08T04:26:20Z)
Polyp-Gen: Realistic and Diverse Polyp Image Generation for Endoscopic Dataset Expansion [35.74618077230043]
We present Polyp-Gen, the first full-automatic diffusion-based endoscopic image generation framework.<n>Specifically, we devise a spatial-aware diffusion training scheme with a lesion-guided loss to enhance the structural context of polyp boundary regions.<n>To capture medical priors for the localization of potential polyp areas, we introduce a hierarchical retrieval-based sampling strategy.
arXiv Detail & Related papers (2025-01-28T03:25:37Z)
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Scaling by training on large datasets has been shown to enhance the quality and fidelity of image generation and manipulation with diffusion models.<n>Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation.<n>Our results demonstrate significant performance gains in various scenarios when combined with different fine-tuning schemes.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
CCIS-Diff: A Generative Model with Stable Diffusion Prior for Controlled Colonoscopy Image Synthesis [7.1892156088672]
We propose a Controlled generative model for high-quality Colonoscopy Image Synthesis based on a Diffusion architecture.<n>Our method offers precise control over both the spatial attributes (polyp location and shape) and clinical characteristics of polyps that align with clinical descriptions.
arXiv Detail & Related papers (2024-11-19T03:30:06Z)
ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic Polyp Detection [88.4359020192429]
Existing methods either involve computationally expensive context aggregation or lack prior modeling of polyps, resulting in poor performance in challenging cases. In this paper, we propose the Enhanced CenterNet with Contrastive Learning (ECC-PolypDet), a two-stage training & end-to-end inference framework. Box-assisted Contrastive Learning (BCL) during training to minimize the intra-class difference and maximize the inter-class difference between foreground polyps and backgrounds, enabling our model to capture concealed polyps. In the fine-tuning stage, we introduce the IoU-guided Sample Re-weighting
arXiv Detail & Related papers (2024-01-10T07:03:41Z)
ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment. The scarcity of annotated data limits the effectiveness and generalization of existing methods. We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
Self-Supervised and Semi-Supervised Polyp Segmentation using Synthetic Data [16.356954231068077]
Early detection of colorectal polyps is of utmost importance for their treatment and for colorectal cancer prevention. Computer vision techniques have the potential to aid professionals in the diagnosis stage, where colonoscopies are manually carried out to examine the entirety of the patient's colon. The main challenge in medical imaging is the lack of data, and a further challenge specific to polyp segmentation approaches is the difficulty of manually labeling the available data. We propose an end-to-end model for polyp segmentation that integrates real and synthetic data to artificially increase the size of the datasets and aid the training when unlabeled samples are available.
arXiv Detail & Related papers (2023-07-22T09:57:58Z)
Mask-conditioned latent diffusion for generating gastrointestinal polyp images [2.027538200191349]
This study proposes a conditional DPM framework to generate synthetic GI polyp images conditioned on given segmentation masks. Our system can generate an unlimited number of high-fidelity synthetic polyp images with the corresponding ground truth masks of polyps. Results show that the best micro-imagewise IOU of 0.7751 was achieved from DeepLabv3+ when the training data consists of both real data and synthetic data.
arXiv Detail & Related papers (2023-04-11T14:11:17Z)
Lesion-aware Dynamic Kernel for Polyp Segmentation [49.63274623103663]
We propose a lesion-aware dynamic network (LDNet) for polyp segmentation. It is a traditional u-shape encoder-decoder structure incorporated with a dynamic kernel generation and updating scheme. This simple but effective scheme endows our model with powerful segmentation performance and generalization capability.
arXiv Detail & Related papers (2023-01-12T09:53:57Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
Co-Heterogeneous and Adaptive Segmentation from Multi-Source and Multi-Phase CT Imaging Data: A Study on Pathological Liver and Lesion Segmentation [48.504790189796836]
We present a novel segmentation strategy, co-heterogenous and adaptive segmentation (CHASe) We propose a versatile framework that fuses appearance based semi-supervision, mask based adversarial domain adaptation, and pseudo-labeling. CHASe can further improve pathological liver mask Dice-Sorensen coefficients by ranges of $4.2% sim 9.4%$.
arXiv Detail & Related papers (2020-05-27T06:58:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.