Conditional diffusion model with spatial attention and latent embedding for medical image segmentation
- URL: http://arxiv.org/abs/2502.06997v2
- Date: Wed, 19 Feb 2025 19:40:15 GMT
- Title: Conditional diffusion model with spatial attention and latent embedding for medical image segmentation
- Authors: Behzad Hejrati, Soumyanil Banerjee, Carri Glide-Hurst, Ming Dong
- Abstract summary: We propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation.
In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones.
We observed significant qualitative and quantitative improvements with higher Dice scores and mIoU over the state-of-the-art algorithms.
- Abstract: Diffusion models have been used extensively for high quality image and video generation tasks. In this paper, we propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation. In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones. A spatial attention map is computed based on the features learned by the discriminator to help cDAL generate more accurate segmentation of discriminative regions in an input image. Additionally, we incorporated a random latent embedding into each layer of our model to significantly reduce the number of training and sampling time-steps, thereby making it much faster than other diffusion models for image segmentation. We applied cDAL on 3 publicly available medical image segmentation datasets (MoNuSeg, Chest X-ray and Hippocampus) and observed significant qualitative and quantitative improvements with higher Dice scores and mIoU over the state-of-the-art algorithms. The source code is publicly available at https://github.com/Hejrati/cDAL/.
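The abstract reports results as Dice scores and mIoU. As a reminder of what those two metrics measure, here is a minimal sketch in Python with NumPy. It is not taken from the cDAL repository: the function names (`dice_score`, `iou_score`, `mean_iou`), the toy masks, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


def iou_score(pred, target):
    """Intersection over union |A∩B| / |A∪B| for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def mean_iou(pred_labels, target_labels, num_classes):
    """Average per-class IoU over classes present in either label map."""
    pred_labels = np.asarray(pred_labels)
    target_labels = np.asarray(target_labels)
    ious = []
    for c in range(num_classes):
        p = pred_labels == c
        t = target_labels == c
        if not p.any() and not t.any():
            continue  # skip classes absent from both maps
        ious.append(iou_score(p, t))
    return float(np.mean(ious)) if ious else 1.0


# Toy 2x3 label maps (hypothetical data): 1 = foreground, 0 = background.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

print(dice_score(pred, target))   # foreground Dice: 2*2 / (3+3)
print(mean_iou(pred, target, 2))  # mean of background and foreground IoU
```

On the toy masks, the foreground Dice is 2/3 and the mIoU over the two classes is 0.5. Published evaluations may average over images or classes differently, so the exact protocol should be checked against the paper and its source code.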
Related papers
- Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Scaling by training on large datasets has been shown to enhance the quality and fidelity of image generation and manipulation with diffusion models.
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation.
Our results demonstrate significant performance gains in various scenarios when combined with different fine-tuning schemes.
arXiv Detail & Related papers (2024-12-30T01:59:34Z) - Latent Diffusion for Medical Image Segmentation: End to end learning for fast sampling and accuracy [14.545920180010201]
Conditional diffusion in latent space ensures accurate image segmentation for multiple interacting objects.
Our proposed model was significantly more robust to noise compared to traditional deterministic segmentation models.
arXiv Detail & Related papers (2024-07-17T18:44:38Z) - DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training [0.0]
DiNO-Diffusion is a self-supervised method for training latent diffusion models (LDMs).
By eliminating the reliance on annotations, our training leverages over 868k unlabelled images from public chest X-Ray datasets.
It can be used to generate semantically-diverse synthetic datasets even from small data pools.
arXiv Detail & Related papers (2024-07-16T10:51:21Z) - HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation [16.906987804797975]
HiDiff is a hybrid diffusion framework for medical image segmentation.
It can synergize the strengths of existing discriminative segmentation models and new generative diffusion models.
It excels at segmenting small objects and generalizing to new datasets.
arXiv Detail & Related papers (2024-07-03T23:59:09Z) - Learned representation-guided diffusion models for large-image generation [58.192263311786824]
We introduce a novel approach that trains diffusion models conditioned on embeddings from self-supervised learning (SSL).
Our diffusion models successfully project these features back to high-quality histopathology and remote sensing images.
Augmenting real data by generating variations of real images improves downstream accuracy for patch-level and larger, image-scale classification tasks.
arXiv Detail & Related papers (2023-12-12T14:45:45Z) - Introducing Shape Prior Module in Diffusion Model for Medical Image Segmentation [7.7545714516743045]
We propose an end-to-end framework called VerseDiff-UNet, which leverages the denoising diffusion probabilistic model (DDPM).
Our approach integrates the diffusion model into a standard U-shaped architecture.
We evaluate our method on a single dataset of spine images acquired through X-ray imaging.
arXiv Detail & Related papers (2023-09-12T03:05:00Z) - Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative over discriminative models for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z) - SDM: Spatial Diffusion Model for Large Hole Image Inpainting [106.90795513361498]
We present a novel spatial diffusion model (SDM) that uses a few iterations to gradually deliver informative pixels to the entire image.
Also, thanks to the proposed decoupled probabilistic modeling and spatial diffusion scheme, our method achieves high-quality large-hole completion.
arXiv Detail & Related papers (2022-12-06T13:30:18Z) - Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Weakly-Supervised Segmentation for Disease Localization in Chest X-Ray Images [0.0]
We propose a novel approach to the semantic segmentation of medical chest X-ray images with only image-level class labels as supervision.
We show that this approach is applicable to chest X-rays for detecting an anomalous volume of air between the lung and the chest wall.
arXiv Detail & Related papers (2020-07-01T20:48:35Z)