Related papers: medDreamer: Model-Based Reinforcement Learning with Latent Imagination on Complex EHRs for Clinical Decision Support

URL: http://arxiv.org/abs/2505.19785v2
Date: Mon, 04 Aug 2025 13:42:18 GMT
Title: medDreamer: Model-Based Reinforcement Learning with Latent Imagination on Complex EHRs for Clinical Decision Support
Authors: Qianyi Xu, Gousia Habib, Dilruk Perera, Mengling Feng
Abstract summary: medDreamer is a novel model-based reinforcement learning framework for personalized treatment recommendation. It combines a world model that simulates latent patient states from irregular data with a two-phase policy trained on a hybrid of real and imagined trajectories. It significantly outperforms model-free and model-based baselines in both clinical outcomes and off-policy metrics.
Score: 3.8382507197481144
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Timely and personalized treatment decisions are essential across a wide range of healthcare settings where patient responses can vary significantly and evolve over time. Clinical data used to support these treatment decisions are often irregularly sampled, where missing data frequencies may implicitly convey information about the patient's condition. Existing Reinforcement Learning (RL) based clinical decision support systems often ignore the missing patterns and distort them with coarse discretization and simple imputation. They are also predominantly model-free and largely depend on retrospective data, which could lead to insufficient exploration and bias by historical behaviors. To address these limitations, we propose medDreamer, a novel model-based reinforcement learning framework for personalized treatment recommendation. medDreamer contains a world model with an Adaptive Feature Integration module that simulates latent patient states from irregular data and a two-phase policy trained on a hybrid of real and imagined trajectories. This enables learning optimal policies that go beyond the sub-optimality of historical clinical decisions, while remaining close to real clinical data. We evaluate medDreamer on both sepsis and mechanical ventilation treatment tasks using two large-scale Electronic Health Records (EHRs) datasets. Comprehensive evaluations show that medDreamer significantly outperforms model-free and model-based baselines in both clinical outcomes and off-policy metrics.

Related papers

An Efficient Contrastive Unimodal Pretraining Method for EHR Time Series Data [35.943089444017666]
We propose an efficient method of contrastive pretraining tailored for long clinical timeseries data. Our model demonstrates the ability to impute missing measurements, providing clinicians with deeper insights into patient conditions.
arXiv Detail & Related papers (2024-10-11T19:05:25Z)
How Deep is your Guess? A Fresh Perspective on Deep Learning for Medical Time-Series Imputation [6.547981908229007]
We show how architectural and framework biases combine to influence model performance. Experiments show imputation performance variations of up to 20% based on preprocessing and implementation choices. We identify critical gaps between current deep imputation methods and medical requirements.
arXiv Detail & Related papers (2024-07-11T12:33:28Z)
Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records [1.338174941551702]
This study assesses the capability of the Llama 2 LLM to create synthetic medical records that accurately reflect real patient information. We focus on generating synthetic narratives for the History of Present Illness section, utilising data from the MIMIC-IV dataset for comparison. Our findings suggest that this chain-of-thought prompted approach allows the zero-shot model to achieve results on par with those of fine-tuned models, based on Rouge metrics evaluation.
arXiv Detail & Related papers (2024-03-13T16:17:09Z)
TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment. In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials. We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
Modelling Patient Trajectories Using Multimodal Information [0.0]
We propose a solution to model patient trajectories that combines different types of information and considers the temporal aspect of clinical data. The developed solution was evaluated on two different clinical outcomes, unexpected patient readmission and disease progression.
arXiv Detail & Related papers (2022-09-09T10:20:54Z)
Bridging the Gap Between Patient-specific and Patient-independent Seizure Prediction via Knowledge Distillation [7.2666838978096875]
Existing approaches typically train models in a patient-specific fashion due to the highly personalized characteristics of epileptic signals. A patient-specific model can then be obtained with the help of distilled knowledge and additional personalized data. Five state-of-the-art seizure prediction methods are trained on the CHB-MIT sEEG database with our proposed scheme.
arXiv Detail & Related papers (2022-02-25T10:30:29Z)
Optimal discharge of patients from intensive care via a data-driven policy learning framework [58.720142291102135]
It is important that the patient discharge task addresses the nuanced trade-off between decreasing a patient's length of stay and the risk of readmission or even death following the discharge decision. This work introduces an end-to-end general framework for capturing this trade-off to recommend optimal discharge timing decisions. A data-driven approach is used to derive a parsimonious, discrete state space representation that captures a patient's physiological condition.
arXiv Detail & Related papers (2021-12-17T04:39:33Z)
The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making. The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
arXiv Detail & Related papers (2021-06-08T10:38:09Z)
Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation. Adversarially generated samples are used during domain adaptation. Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
Longitudinal modeling of MS patient trajectories improves predictions of disability progression [2.117653457384462]
This work addresses the task of optimally extracting information from longitudinal patient data in the real-world setting. We show that with machine learning methods suited for patient trajectories modeling, we can predict disability progression of patients in a two-year horizon. Compared to the models available in the literature, this work uses the most complete patient history for MS disease progression prediction.
arXiv Detail & Related papers (2020-11-09T20:48:00Z)
HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in Intensive Care Units [31.368873375366213]
HOLMES is an online model ensemble serving framework for healthcare applications. We demonstrate that HOLMES is able to navigate the accuracy/latency tradeoff efficiently, compose the ensemble, and serve the model ensemble pipeline. HOLMES is tested on a risk prediction task on pediatric cardio ICU data, with above 95% prediction accuracy and sub-second latency in a 64-bed simulation.
arXiv Detail & Related papers (2020-08-10T12:38:46Z)
Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of breakdowns in essential services and the collapse of health care structures. This work describes a machine learning model derived from hemogram exam data collected from symptomatic patients. The proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret [59.81290762273153]
Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage treatment plans that adapt treatment decisions to an individual's initial features and to intermediate outcomes and features at each subsequent stage. We propose a novel algorithm that, by carefully balancing exploration and exploitation, is guaranteed to achieve rate-optimal regret when the transition and reward models are linear.
arXiv Detail & Related papers (2020-05-06T13:03:42Z)
Estimating Counterfactual Treatment Outcomes over Time Through Adversarially Balanced Representations [114.16762407465427]
We introduce the Counterfactual Recurrent Network (CRN) to estimate treatment effects over time. CRN uses domain adversarial training to build balancing representations of the patient history. We show how our model achieves lower error in estimating counterfactuals and in choosing the correct treatment and timing of treatment.
arXiv Detail & Related papers (2020-02-10T20:47:36Z)
