Related papers: DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training

URL: http://arxiv.org/abs/2407.11594v1
Date: Tue, 16 Jul 2024 10:51:21 GMT
Title: DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training
Authors: Guillermo Jimenez-Perez, Pedro Osorio, Josef Cersovsky, Javier Montalt-Tordera, Jens Hooge, Steffen Vogler, Sadegh Mohammadi
Abstract summary: DiNO-Diffusion is a self-supervised method for training latent diffusion models (LDMs). By eliminating the reliance on annotations, our training leverages over 868k unlabelled images from public chest X-Ray datasets. It can be used to generate semantically-diverse synthetic datasets even from small data pools.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Diffusion models (DMs) have emerged as powerful foundation models for a variety of tasks, with a large focus on synthetic image generation. However, their requirement of large annotated datasets for training limits their applicability in medical imaging, where datasets are typically smaller and sparsely annotated. We introduce DiNO-Diffusion, a self-supervised method for training latent diffusion models (LDMs) that conditions the generation process on image embeddings extracted from DiNO. By eliminating the reliance on annotations, our training leverages over 868k unlabelled images from public chest X-Ray (CXR) datasets. Despite being self-supervised, DiNO-Diffusion shows comprehensive manifold coverage, with FID scores as low as 4.7, and emerging properties when evaluated in downstream tasks. It can be used to generate semantically-diverse synthetic datasets even from small data pools, demonstrating up to 20% AUC increase in classification performance when used for data augmentation. Images were generated with different sampling strategies over the DiNO embedding manifold and using real images as a starting point. Results suggest that DiNO-Diffusion could facilitate the creation of large datasets for flexible training of downstream AI models from a limited amount of real data, while also holding potential for privacy preservation. Additionally, DiNO-Diffusion demonstrates zero-shot segmentation performance of up to 84.4% Dice score when evaluating lung lobe segmentation. This evidences good CXR image-anatomy alignment, akin to segmenting using textual descriptors on vanilla DMs. Finally, DiNO-Diffusion can be easily adapted to other medical imaging modalities or state-of-the-art diffusion models, opening the door for large-scale, multi-domain image generation pipelines for medical imaging.

Related papers

When Model Knowledge meets Diffusion Model: Diffusion-assisted Data-free Image Synthesis with Alignment of Domain and Class [18.81528537866941]
Open-source pre-trained models hold great potential for diverse applications, but their utility declines when their training data is unavailable. Data-Free Image Synthesis (DFIS) aims to generate images that approximate the learned data distribution of a pre-trained model without accessing the original data. DDIS is the first Diffusion-assisted Data-free Image Synthesis method that leverages a text-to-image diffusion model as a powerful image prior.
arXiv Detail & Related papers (2025-06-18T11:51:40Z)
Anatomy-Grounded Weakly Supervised Prompt Tuning for Chest X-ray Latent Diffusion Models [8.94567513238762]
We show that a standard text-conditioned Latent Diffusion Model has not learned to align clinically relevant information in free-text radiology reports with the corresponding areas of the given scan. We propose a fine-tuning framework to improve multi-modal alignment in a pre-trained model such that it can be efficiently repurposed for downstream tasks such as phrase grounding.
arXiv Detail & Related papers (2025-06-12T12:19:18Z)
Regression is all you need for medical image translation [0.0]
Medical image translation (MIT) can help enhance and supplement existing datasets by generating synthetic images from acquired data. Here, we introduce YODA, a novel 2.5D diffusion-based framework for volumetric MIT. We show that YODA outperforms several state-of-the-art GAN and DM methods.
arXiv Detail & Related papers (2025-05-04T09:57:10Z)
Conditional diffusion model with spatial attention and latent embedding for medical image segmentation [2.8703698954661254]
We propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation. In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones. We observed significant qualitative and quantitative improvements with higher Dice scores and mIoU over the state-of-the-art algorithms.
arXiv Detail & Related papers (2025-02-10T19:47:28Z)
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
MRGen: Segmentation Data Engine for Underrepresented MRI Modalities [59.61465292965639]
Training medical image segmentation models for rare yet clinically important imaging modalities is challenging due to the scarcity of annotated data. This paper investigates leveraging generative models to synthesize data for training segmentation models for underrepresented modalities. We present MRGen, a data engine for controllable medical image synthesis conditioned on text prompts and segmentation masks.
arXiv Detail & Related papers (2024-12-04T16:34:22Z)
SAR Image Synthesis with Diffusion Models [0.0]
Diffusion models (DMs) have become a popular method for generating synthetic data. In this work, a specific type of DM, namely the denoising diffusion probabilistic model (DDPM), is adapted to the SAR domain. We show that DDPM qualitatively and quantitatively outperforms state-of-the-art GAN-based methods for SAR image generation.
arXiv Detail & Related papers (2024-05-13T14:21:18Z)
MEDDAP: Medical Dataset Enhancement via Diversified Augmentation Pipeline [1.4910709350090976]
We introduce a novel pipeline called MEDDAP to augment existing small datasets by automatically generating new informative labeled samples. USLoRA allows for selective fine-tuning of weights within SD, requiring fewer than 0.1% of parameters compared to fully fine-tuning only the UNet portion of SD. This approach is inspired by clinicians' decision-making processes regarding breast tumors, where tumor shape often plays a more crucial role than intensity.
arXiv Detail & Related papers (2024-03-25T00:17:43Z)
Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes. Deep generative models, including diffusion models, are biased towards classes with abundant training images. We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI). In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion). Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z)
ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment. The scarcity of annotated data limits the effectiveness and generalization of existing methods. We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations. Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation. We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
arXiv Detail & Related papers (2023-08-11T14:38:11Z)
Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification. Our generative approach to classification attains strong results on a variety of benchmarks. Our results are a step toward using generative over discriminative models for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z)
Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation [41.608617301275935]
We propose a novel end-to-end framework, called Diff-UNet, for medical volumetric segmentation. Our approach integrates the diffusion model into a standard U-shaped architecture to extract semantic information from the input volume effectively. We evaluate our method on three datasets, including multimodal brain tumors in MRI, liver tumors, and multi-organ CT volumes.
arXiv Detail & Related papers (2023-03-18T04:06:18Z)
Analysing the effectiveness of a generative model for semi-supervised medical image segmentation [23.898954721893855]
State-of-the-art in automated segmentation remains supervised learning, employing discriminative models such as U-Net. Semi-supervised learning (SSL) attempts to leverage the abundance of unlabelled data to obtain more robust and reliable models. Deep generative models such as the SemanticGAN are truly viable alternatives to tackle challenging medical image segmentation problems.
arXiv Detail & Related papers (2022-11-03T15:19:59Z)
Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models [1.6352599467675781]
We propose a method based on diffusion models to detect and segment anomalies in brain imaging. Our diffusion models achieve competitive performance compared with autoregressive approaches across a series of experiments with 2D CT and MRI data.
arXiv Detail & Related papers (2022-06-07T17:30:43Z)
Multifold Acceleration of Diffusion MRI via Slice-Interleaved Diffusion Encoding (SIDE) [50.65891535040752]
We propose a diffusion encoding scheme, called Slice-Interleaved Diffusion Encoding (SIDE), that interleaves each diffusion-weighted (DW) image volume with slices encoded with different diffusion gradients. We also present a method based on deep learning for effective reconstruction of DW images from the highly slice-undersampled data.
arXiv Detail & Related papers (2020-02-25T14:48:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.