Diff-CXR: Report-to-CXR generation through a disease-knowledge enhanced diffusion model
- URL: http://arxiv.org/abs/2410.20165v1
- Date: Sat, 26 Oct 2024 12:38:12 GMT
- Title: Diff-CXR: Report-to-CXR generation through a disease-knowledge enhanced diffusion model
- Authors: Peng Huang, Bowen Guo, Shuyu Liang, Junhu Fu, Yuanyuan Wang, Yi Guo,
- Abstract summary: We propose a disease-knowledge enhanced Diffusion-based TTI learning framework, named Diff-CXR, for medical report-to-CXR generation.
Experimentally, our Diff-CXR outperforms previous SOTA medical TTI methods by 33.4% / 8.0% and 23.8% / 56.4% in the FID and mAUC score on MIMIC-CXR and IU-Xray.
- Score: 4.507437953126754
- License:
- Abstract: Text-To-Image (TTI) generation is significant for controlled and diverse image generation with broad potential applications. Although current medical TTI methods have made some progress in report-to-Chest-Xray (CXR) generation, their generation performance may be limited due to the intrinsic characteristics of medical data. In this paper, we propose a novel disease-knowledge enhanced Diffusion-based TTI learning framework, named Diff-CXR, for medical report-to-CXR generation. First, to minimize the negative impacts of noisy data on generation, we devise a Latent Noise Filtering Strategy that gradually learns the general patterns of anomalies and removes them in the latent space. Then, an Adaptive Vision-Aware Textual Learning Strategy is designed to learn concise and important report embeddings in a domain-specific Vision-Language Model, providing textual guidance for Chest-Xray generation. Finally, by incorporating the general disease knowledge into the pretrained TTI model via a delicate control adapter, a disease-knowledge enhanced diffusion model is introduced to achieve realistic and precise report-to-CXR generation. Experimentally, our Diff-CXR outperforms previous SOTA medical TTI methods by 33.4\% / 8.0\% and 23.8\% / 56.4\% in the FID and mAUC score on MIMIC-CXR and IU-Xray, with the lowest computational complexity at 29.641 GFLOPs. Downstream experiments on three thorax disease classification benchmarks and one CXR-report generation benchmark demonstrate that Diff-CXR is effective in improving classical CXR analysis methods. Notably, models trained on the combination of 1\% real data and synthetic data can achieve a competitive mAUC score compared to models trained on all data, presenting promising clinical applications.
Related papers
- TRRG: Towards Truthful Radiology Report Generation With Cross-modal Disease Clue Enhanced Large Language Model [22.305034251561835]
We propose a truthful radiology report generation framework, namely TRRG, based on stage-wise training for cross-modal disease clue injection into large language models.
Our proposed framework achieves state-of-the-art performance in radiology report generation on datasets such as IU-Xray and MIMIC-CXR.
arXiv Detail & Related papers (2024-08-22T05:52:27Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
arXiv Detail & Related papers (2023-09-01T22:08:32Z) - Bag of Tricks for Long-Tailed Multi-Label Classification on Chest X-Rays [40.11576642444264]
This report presents a brief description of our solution in the ICCV CVAMD 2023 CXR-LT Competition.
We empirically explored the effectiveness for CXR diagnosis with the integration of several advanced designs.
Our framework finally achieves 0.349 mAP on the competition test set, ranking in the top five.
arXiv Detail & Related papers (2023-08-17T08:25:55Z) - Longitudinal Data and a Semantic Similarity Reward for Chest X-Ray Report Generation [7.586632627817609]
Radiologists face high burnout rates, partly due to the increasing volume of Chest X-rays (CXRs) requiring interpretation and reporting.
Our proposed CXR report generator integrates elements of the workflow and introduces a novel reward for reinforcement learning.
Results from our study demonstrate that the proposed model generates reports that are more aligned with radiologists' reports than state-of-the-art models.
arXiv Detail & Related papers (2023-07-19T05:41:14Z) - Revisiting Computer-Aided Tuberculosis Diagnosis [56.80999479735375]
Tuberculosis (TB) is a major global health threat, causing millions of deaths annually.
Computer-aided tuberculosis diagnosis (CTD) using deep learning has shown promise, but progress is hindered by limited training data.
We establish a large-scale dataset, namely the Tuberculosis X-ray (TBX11K) dataset, which contains 11,200 chest X-ray (CXR) images with corresponding bounding box annotations for TB areas.
This dataset enables the training of sophisticated detectors for high-quality CTD.
arXiv Detail & Related papers (2023-07-06T08:27:48Z) - Clinical Deterioration Prediction in Brazilian Hospitals Based on
Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD)
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z) - Improving Radiology Report Generation Systems by Removing Hallucinated
References to Non-existent Priors [1.1110995501996481]
We propose two methods to remove references to priors in radiology reports.
A GPT-3-based few-shot approach to rewrite medical reports without references to priors; and a BioBERT-based token classification approach to directly remove words referring to priors.
We find that our re-trained model--which we call CXR-ReDonE--outperforms previous report generation methods on clinical metrics, achieving an average BERTScore of 0.2351 (2.57% absolute improvement)
arXiv Detail & Related papers (2022-09-27T00:44:41Z) - Variational Knowledge Distillation for Disease Classification in Chest
X-Rays [102.04931207504173]
We propose itvariational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.