PathLDM: Text conditioned Latent Diffusion Model for Histopathology
- URL: http://arxiv.org/abs/2309.00748v2
- Date: Thu, 30 Nov 2023 20:20:23 GMT
- Title: PathLDM: Text conditioned Latent Diffusion Model for Histopathology
- Authors: Srikar Yellapragada, Alexandros Graikos, Prateek Prasanna, Tahsin Kurc, Joel Saltz, Dimitris Samaras
- Abstract summary: We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
- Score: 62.970593674481414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To achieve high-quality results, diffusion models must be trained on large
datasets. This can be notably prohibitive for models in specialized domains,
such as computational pathology. Conditioning on labeled data is known to help
in data-efficient model training. Therefore, histopathology reports, which are
rich in valuable clinical information, are an ideal choice as guidance for a
histopathology generative model. In this paper, we introduce PathLDM, the first
text-conditioned Latent Diffusion Model tailored for generating high-quality
histopathology images. Leveraging the rich contextual information provided by
pathology text reports, our approach fuses image and textual data to enhance
the generation process. By utilizing GPT's capabilities to distill and
summarize complex text reports, we establish an effective conditioning
mechanism. Through strategic conditioning and necessary architectural
enhancements, we achieved a SoTA FID score of 7.64 for text-to-image generation
on the TCGA-BRCA dataset, significantly outperforming the closest
text-conditioned competitor with FID 30.1.
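The FID scores quoted in the abstract compare the feature statistics of real and generated images. As a rough illustration only (this is not the authors' evaluation code), the sketch below computes the Fréchet distance between two Gaussians fitted to feature arrays. The function names and the random placeholder features are assumptions made for this example; in practice the features are usually activations from a pretrained Inception network.

```python
import numpy as np


def feature_statistics(features):
    """Return the mean vector and covariance matrix of an (n_samples, dim) array."""
    mu = features.mean(axis=0)
    sigma = np.cov(features, rowvar=False)
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between the Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr((sigma1 sigma2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative when
    # both covariances are positive semi-definite.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    sqrt_trace = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * sqrt_trace)


if __name__ == "__main__":
    # Placeholder features (assumption): random vectors stand in for the
    # image embeddings that a real evaluation would extract.
    rng = np.random.default_rng(0)
    real = rng.normal(0.0, 1.0, size=(500, 8))
    fake = rng.normal(0.5, 1.0, size=(500, 8))
    print(frechet_distance(*feature_statistics(real), *feature_statistics(fake)))
```

Lower values mean the two feature distributions are closer, which is why the reported 7.64 is read as an improvement over 30.1.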
Related papers
- Comparative Analysis of Diffusion Generative Models in Computational Pathology [11.698817924231854]
Diffusion Generative Models (DGM) have rapidly surfaced as emerging topics in the field of computer vision.
This paper presents an in-depth comparative analysis of diffusion methods applied to a pathology dataset.
Our analysis extends to datasets with varying Fields of View (FOV), revealing that DGMs are highly effective in producing high-quality synthetic data.
arXiv Detail & Related papers (2024-11-24T05:09:43Z)
- Unleashing the Potential of Synthetic Images: A Study on Histopathology Image Classification [0.12499537119440242]
Histopathology image classification is crucial for the accurate identification and diagnosis of various diseases.
We show that synthetic images can effectively augment existing datasets, ultimately improving the performance of the downstream histopathology image classification task.
arXiv Detail & Related papers (2024-09-24T12:02:55Z)
- HistoSPACE: Histology-Inspired Spatial Transcriptome Prediction And Characterization Engine [0.0]
The HistoSPACE model explores the diversity of histological images available with ST data to extract molecular insights from tissue images.
The model demonstrates significant efficiency compared to contemporary algorithms, achieving a correlation of 0.56 in leave-one-out cross-validation.
arXiv Detail & Related papers (2024-08-07T07:12:52Z)
- Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models [48.87160158792048]
We introduce a cascaded amortized latent diffusion model (CA-LDM) that can synthesize high-resolution OCT volumes in a memory-efficient way.
Experiments on a public high-resolution OCT dataset show that our synthetic data have realistic high-resolution and global features, surpassing the capabilities of existing methods.
arXiv Detail & Related papers (2024-05-26T10:58:22Z)
- HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction [16.060286162384536]
HistGen is a learning-empowered framework for histopathology report generation.
It aims to boost report generation by aligning whole slide images (WSIs) and diagnostic reports from local and global granularity.
Experimental results on WSI report generation show the proposed model outperforms state-of-the-art (SOTA) models by a large margin.
arXiv Detail & Related papers (2024-03-08T15:51:43Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- Tertiary Lymphoid Structures Generation through Graph-based Diffusion [54.37503714313661]
In this work, we leverage state-of-the-art graph-based diffusion models to generate biologically meaningful cell-graphs.
We show that the adopted graph diffusion model is able to accurately learn the distribution of cells in terms of their tertiary lymphoid structures (TLS) content.
arXiv Detail & Related papers (2023-10-10T14:37:17Z)
- MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images [22.455833806331384]
This paper introduces an innovative methodology for producing high-quality 3D lung CT images guided by textual information.
Current state-of-the-art approaches are limited to low-resolution outputs and underutilize radiology reports' abundant information.
arXiv Detail & Related papers (2023-10-05T14:16:22Z)
- An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians.
Recent studies have achieved promising results in automatic impression generation using large-scale medical text data.
These models often require substantial amounts of medical text data and have poor generalization performance.
arXiv Detail & Related papers (2023-04-17T17:13:42Z)
- Fader Networks for domain adaptation on fMRI: ABIDE-II study [68.5481471934606]
We use 3D convolutional autoencoders to build a domain-irrelevant latent-space image representation and demonstrate that this method outperforms existing approaches on ABIDE data.
arXiv Detail & Related papers (2020-10-14T16:50:50Z)