DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training
- URL: http://arxiv.org/abs/2407.11594v1
- Date: Tue, 16 Jul 2024 10:51:21 GMT
- Title: DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training
- Authors: Guillermo Jimenez-Perez, Pedro Osorio, Josef Cersovsky, Javier Montalt-Tordera, Jens Hooge, Steffen Vogler, Sadegh Mohammadi
- Abstract summary: DiNO-Diffusion is a self-supervised method for training latent diffusion models (LDMs).
By eliminating the reliance on annotations, our training leverages over 868k unlabelled images from public chest X-Ray datasets.
It can be used to generate semantically-diverse synthetic datasets even from small data pools.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Diffusion models (DMs) have emerged as powerful foundation models for a variety of tasks, with a strong focus on synthetic image generation. However, their requirement of large annotated datasets for training limits their applicability in medical imaging, where datasets are typically smaller and sparsely annotated. We introduce DiNO-Diffusion, a self-supervised method for training latent diffusion models (LDMs) that conditions the generation process on image embeddings extracted from DiNO. By eliminating the reliance on annotations, our training leverages over 868k unlabelled images from public chest X-Ray (CXR) datasets. Despite being self-supervised, DiNO-Diffusion shows comprehensive manifold coverage, with FID scores as low as 4.7, and emerging properties when evaluated in downstream tasks. It can be used to generate semantically-diverse synthetic datasets even from small data pools, demonstrating up to 20% AUC increase in classification performance when used for data augmentation. Images were generated with different sampling strategies over the DiNO embedding manifold and using real images as a starting point. Results suggest that DiNO-Diffusion could facilitate the creation of large datasets for flexible training of downstream AI models from a limited amount of real data, while also holding potential for privacy preservation. Additionally, DiNO-Diffusion demonstrates zero-shot segmentation performance of up to 84.4% Dice score when evaluating lung lobe segmentation. This evidences good CXR image-anatomy alignment, akin to segmenting using textual descriptors on vanilla DMs. Finally, DiNO-Diffusion can be easily adapted to other medical imaging modalities or state-of-the-art diffusion models, opening the door for large-scale, multi-domain image generation pipelines for medical imaging.
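For readers unfamiliar with the Dice score quoted in the abstract, the following is a minimal sketch of how the metric is commonly computed for two binary masks. It is not the authors' evaluation code: the function name `dice_score`, the flattened-list mask representation, the convention for empty masks, and the toy values are all assumptions made for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient between two flattened binary masks (1 = foreground)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Foreground pixel count of each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks, flattened row by row (illustrative values, not data from the paper).
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
score = dice_score(predicted, reference)
```

A score of 1.0 means the two masks overlap exactly and 0.0 means they share no foreground pixels; a reported figure such as 84.4% corresponds to 0.844 on this scale.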
Related papers
- Conditional diffusion model with spatial attention and latent embedding for medical image segmentation [2.8703698954661254]
We propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation.
In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones.
We observed significant qualitative and quantitative improvements with higher Dice scores and mIoU over the state-of-the-art algorithms.
arXiv Detail & Related papers (2025-02-10T19:47:28Z)
- Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Scaling by training on large datasets has been shown to enhance the quality and fidelity of image generation and manipulation with diffusion models.
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation.
Our results demonstrate significant performance gains in various scenarios when combined with different fine-tuning schemes.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
- MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities [59.61465292965639]
This paper investigates a new paradigm for leveraging generative models in medical applications.
We propose a diffusion-based data engine, termed MRGen, which enables generation conditioned on text prompts and masks.
arXiv Detail & Related papers (2024-12-04T16:34:22Z)
- SAR Image Synthesis with Diffusion Models [0.0]
Diffusion models (DMs) have become a popular method for generating synthetic data.
In this work, a specific type of DM, namely the denoising diffusion probabilistic model (DDPM), is adapted to the SAR domain.
We show that DDPM qualitatively and quantitatively outperforms state-of-the-art GAN-based methods for SAR image generation.
arXiv Detail & Related papers (2024-05-13T14:21:18Z)
- Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
- ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
- Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative over discriminative models for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z)
- Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation [41.608617301275935]
We propose a novel end-to-end framework, called Diff-UNet, for medical volumetric segmentation.
Our approach integrates the diffusion model into a standard U-shaped architecture to extract semantic information from the input volume effectively.
We evaluate our method on three datasets, including multimodal brain tumors in MRI, liver tumors, and multi-organ CT volumes.
arXiv Detail & Related papers (2023-03-18T04:06:18Z)
- Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models [1.6352599467675781]
We propose a method based on diffusion models to detect and segment anomalies in brain imaging.
Our diffusion models achieve competitive performance compared with autoregressive approaches across a series of experiments with 2D CT and MRI data.
arXiv Detail & Related papers (2022-06-07T17:30:43Z)
- Multifold Acceleration of Diffusion MRI via Slice-Interleaved Diffusion Encoding (SIDE) [50.65891535040752]
We propose a diffusion encoding scheme, called Slice-Interleaved Diffusion Encoding (SIDE), that interleaves each diffusion-weighted (DW) image volume with slices encoded with different diffusion gradients.
We also present a method based on deep learning for effective reconstruction of DW images from the highly slice-undersampled data.
arXiv Detail & Related papers (2020-02-25T14:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.