Advancing Text-Driven Chest X-Ray Generation with Policy-Based
Reinforcement Learning
- URL: http://arxiv.org/abs/2403.06516v1
- Date: Mon, 11 Mar 2024 08:43:57 GMT
- Title: Advancing Text-Driven Chest X-Ray Generation with Policy-Based
Reinforcement Learning
- Authors: Woojung Han, Chanyoung Kim, Dayun Ju, Yumin Shim, Seong Jae Hwang
- Abstract summary: We propose CXRL, a framework motivated by the potential of reinforcement learning (RL).
Our framework includes jointly optimizing learnable adaptive condition embeddings (ACE) and the image generator.
Our CXRL generates pathologically realistic CXRs, establishing a new standard for generating CXRs.
- Score: 5.476136494434766
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in text-conditioned diffusion models for image generation have
begun paving the way for new opportunities in the modern medical domain, in
particular, generating Chest X-rays (CXRs) from diagnostic reports.
Nonetheless, to further drive the diffusion models to generate CXRs that
faithfully reflect the complexity and diversity of real data, it has become
evident that a nontrivial learning approach is needed. In light of this, we
propose CXRL, a framework motivated by the potential of reinforcement learning
(RL). Specifically, we integrate a policy gradient RL approach with
multiple well-designed, distinctive, CXR-domain-specific reward models. This
approach guides the diffusion denoising trajectory, achieving precise CXR
posture and pathological details. Here, considering the complex medical image
environment, we present "RL with Comparative Feedback" (RLCF) for the reward
mechanism, a human-like comparative evaluation that is known to be more
effective and reliable in complex scenarios compared to direct evaluation. Our
CXRL framework includes jointly optimizing learnable adaptive condition
embeddings (ACE) and the image generator, enabling the model to produce CXRs
that are more accurate and of higher perceptual quality. Our extensive
evaluation on the MIMIC-CXR-JPG dataset demonstrates the effectiveness of our
RL-based tuning
approach. Consequently, our CXRL generates pathologically realistic CXRs,
establishing a new standard for generating CXRs with high fidelity to
real-world clinical scenarios.
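The abstract outlines the core recipe: a reward model scores a generated CXR relative to a reference (comparative feedback), and a policy-gradient update on the denoising steps pushes the generator and the adaptive condition embeddings (ACE) toward higher-reward samples. Below is a minimal PyTorch-style sketch of that loop; it assumes generic `denoiser` and `reward_model` callables, and the names `AdaptiveConditionEmbedding`, `comparative_reward`, and `policy_gradient_step` are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): REINFORCE-style fine-tuning of a
# text-conditioned diffusion denoiser with a pairwise "comparative" reward.
import torch
import torch.nn as nn


class AdaptiveConditionEmbedding(nn.Module):
    """Learnable offset added to the fixed report embedding (illustrative 'ACE')."""

    def __init__(self, dim: int):
        super().__init__()
        self.delta = nn.Parameter(torch.zeros(1, dim))

    def forward(self, text_emb: torch.Tensor) -> torch.Tensor:
        return text_emb + self.delta


def comparative_reward(candidate, reference, reward_model):
    """Comparative feedback: score the sample against a reference instead of
    in isolation; positive means the candidate is preferred."""
    return reward_model(candidate) - reward_model(reference)


def policy_gradient_step(denoiser, ace, reward_model, text_emb, x_t, t,
                         reference, optimizer):
    # Treat one denoising step as a stochastic policy: the predicted mean
    # parameterizes a Gaussian over the next latent x_{t-1}.
    mean = denoiser(x_t, t, ace(text_emb))
    dist = torch.distributions.Normal(mean, 1.0)
    x_prev = dist.sample()
    log_prob = dist.log_prob(x_prev).sum()

    # Reward the sampled step with comparative feedback (no gradient needed).
    with torch.no_grad():
        r = comparative_reward(x_prev, reference, reward_model)

    # REINFORCE: raise the likelihood of higher-reward denoising trajectories.
    loss = -(r * log_prob)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In such a setup, the optimizer would hold both the denoiser's and the ACE parameters, mirroring the paper's description of jointly optimizing the condition embeddings and the image generator.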
Related papers
- Diff-CXR: Report-to-CXR generation through a disease-knowledge enhanced diffusion model [4.507437953126754]
We propose a disease-knowledge enhanced Diffusion-based TTI learning framework, named Diff-CXR, for medical report-to-CXR generation.
Experimentally, our Diff-CXR outperforms previous SOTA medical TTI methods by 33.4% / 8.0% and 23.8% / 56.4% in the FID and mAUC score on MIMIC-CXR and IU-Xray.
arXiv Detail & Related papers (2024-10-26T12:38:12Z)
- Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation [14.658627367126009]
We propose DDL-CXR, a method that dynamically generates an up-to-date latent representation of the individualized chest X-ray images.
Our approach leverages latent diffusion models for patient-specific generation strategically conditioned on a previous CXR image and EHR time series.
Experiments using MIMIC datasets show that the proposed model could effectively address asynchronicity in multimodal fusion and consistently outperform existing methods.
arXiv Detail & Related papers (2024-10-23T14:34:39Z)
- DiCoM -- Diverse Concept Modeling towards Enhancing Generalizability in Chest X-Ray Studies [6.83819481805979]
Chest X-Ray (CXR) is a widely used clinical imaging modality.
Self-supervised pre-training has proven to outperform supervised pre-training in numerous downstream vision tasks.
We introduce Diverse Concept Modeling (DiCoM), a novel self-supervised training paradigm.
arXiv Detail & Related papers (2024-02-22T20:51:37Z)
- UniChest: Conquer-and-Divide Pre-training for Multi-Source Chest X-Ray Classification [36.94690613164942]
UniChest is a Conquer-and-Divide pre-training framework that aims to make full use of the collaborative benefit of multiple sources of CXRs.
We conduct thorough experiments on many benchmarks, e.g., ChestX-ray14, CheXpert, Vindr-CXR, Shenzhen, Open-I and SIIM-ACR Pneumothorax.
arXiv Detail & Related papers (2023-12-18T09:16:48Z)
- DifAugGAN: A Practical Diffusion-style Data Augmentation for GAN-based Single Image Super-resolution [88.13972071356422]
We propose a diffusion-style data augmentation scheme for GAN-based image super-resolution (SR) methods, known as DifAugGAN.
It involves adapting the diffusion process in generative diffusion models for improving the calibration of the discriminator during training.
Our DifAugGAN can be a Plug-and-Play strategy for current GAN-based SISR methods to improve the calibration of the discriminator and thus improve SR performance.
arXiv Detail & Related papers (2023-11-30T12:37:53Z)
- Chest X-ray Image Classification: A Causal Perspective [49.87607548975686]
We propose a causal approach to address the CXR classification problem, which constructs a structural causal model (SCM) and uses the backdoor adjustment to select effective visual information for CXR classification.
Experimental results demonstrate that our proposed method achieves superior classification performance on the open-source NIH ChestX-ray14 dataset.
arXiv Detail & Related papers (2023-05-20T03:17:44Z)
- Vision-Language Generative Model for View-Specific Chest X-ray Generation [18.347723213970696]
ViewXGen is designed to overcome the limitation of existing methods, which generate only frontal-view chest X-rays.
Our approach takes into consideration the diverse view positions found in the dataset, enabling the generation of chest X-rays with specific views.
arXiv Detail & Related papers (2023-02-23T17:13:25Z)
- Improving Classification Model Performance on Chest X-Rays through Lung Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z)
- Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data [125.7135706352493]
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images.
Recent studies have shown that training GANs with limited data remains formidable due to discriminator overfitting.
This paper introduces a novel strategy called Adaptive Pseudo Augmentation (APA) to encourage healthy competition between the generator and the discriminator.
arXiv Detail & Related papers (2021-11-12T18:13:45Z)
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
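The last entry names K-nearest neighbor smoothing (KNNS) without detailing it. As a rough illustration only, the following sketch shows one common form of neighbor-based prediction smoothing, assuming precomputed per-image feature vectors and per-disease label probabilities; the helper `knn_smooth` is hypothetical and is not the paper's MODL/KNNS implementation.

```python
# Hypothetical sketch of K-nearest-neighbor smoothing of per-image disease
# predictions; not the MODL/KNNS method from the paper.
import numpy as np


def knn_smooth(features: np.ndarray, probs: np.ndarray,
               k: int = 5, alpha: float = 0.5) -> np.ndarray:
    """Blend each image's predicted probabilities with the mean prediction
    of its k nearest neighbors in feature space."""
    # Pairwise Euclidean distances between feature vectors (N x N).
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude the image itself
    idx = np.argsort(d, axis=1)[:, :k]          # indices of k nearest neighbors
    neighbor_mean = probs[idx].mean(axis=1)     # average neighbor predictions
    return alpha * probs + (1 - alpha) * neighbor_mean


# Example: smooth predictions for 4 images with 3 disease labels each.
feats = np.random.rand(4, 16)
preds = np.random.rand(4, 3)
smoothed = knn_smooth(feats, preds, k=2)
```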