Realistic CDSS Drug Dosing with End-to-end Recurrent Q-learning for Dual Vasopressor Control
- URL: http://arxiv.org/abs/2510.01508v1
- Date: Wed, 01 Oct 2025 23:02:00 GMT
- Title: Realistic CDSS Drug Dosing with End-to-end Recurrent Q-learning for Dual Vasopressor Control
- Authors: Will Y. Zou, Jean Feng, Alexandre Kalimouttou, Jennifer Yuntong Zhang, Christopher W. Seymour, Romain Pirracchio
- Abstract summary: We develop an end-to-end approach for learning optimal drug dosing and control policies for dual vasopressor administration in intensive care unit (ICU) patients with septic shock. For realistic drug dosing, we apply an action space design that accommodates discrete, continuous, and directional dosing strategies. The proposed methods achieve improved patient outcomes of over 15% in survival improvement probability, while aligning with established clinical protocols.
- Score: 37.613456891934085
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reinforcement learning (RL) applications in Clinical Decision Support Systems (CDSS) frequently encounter skepticism from practitioners regarding inoperable dosing decisions. We address this challenge with an end-to-end approach for learning optimal drug dosing and control policies for dual vasopressor administration in intensive care unit (ICU) patients with septic shock. For realistic drug dosing, we apply an action space design that accommodates discrete, continuous, and directional dosing strategies in a system that combines offline conservative Q-learning with novel recurrent modeling in a replay buffer to capture temporal dependencies in ICU time-series data. Our comparative analysis of norepinephrine dosing strategies across different action space formulations reveals that the designed action spaces improve interpretability and facilitate clinical adoption while preserving efficacy. Empirical results on eICU and MIMIC demonstrate that action space design profoundly influences learned behavioral policies. The proposed methods achieve improved patient outcomes of over 15% in survival improvement probability, while aligning with established clinical protocols.
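The directional dosing strategy the abstract describes can be sketched as follows. This is a minimal illustration of what a directional action space might look like, not the paper's actual formulation: the dose bounds, step size, and function names are all assumptions for illustration.

```python
# Hedged sketch: a directional action space for vasopressor titration.
# Bounds, units, and step size are illustrative assumptions, not taken
# from the paper.

DOSE_MIN, DOSE_MAX = 0.0, 1.0   # assumed dose range (e.g. mcg/kg/min)
STEP = 0.05                      # assumed titration increment

# Directional actions adjust the current dose rather than setting an
# absolute value, mirroring how clinicians titrate at the bedside.
DIRECTIONAL_ACTIONS = {
    0: -STEP,  # decrease
    1: 0.0,    # hold
    2: +STEP,  # increase
}

def apply_directional_action(current_dose: float, action: int) -> float:
    """Apply a directional action and clamp to the allowed dose range."""
    new_dose = current_dose + DIRECTIONAL_ACTIONS[action]
    return min(max(new_dose, DOSE_MIN), DOSE_MAX)
```

A design like this keeps the learned policy's outputs interpretable as titration decisions ("increase", "hold", "decrease"), which is plausibly why the paper reports better clinical alignment than absolute-dose action spaces.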
Related papers
- Distribution-Free Uncertainty Quantification in Mechanical Ventilation Treatment: A Conformal Deep Q-Learning Framework [2.5070297884580874]
This study introduces ConformalDQN, a distribution-free conformal deep Q-learning approach for optimizing mechanical ventilation in intensive care units. We trained and evaluated our model using ICU patient records from the MIMIC-IV database.
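The distribution-free ingredient such a method builds on is split conformal calibration. A minimal sketch of the calibration step, with illustrative scores and miscoverage level (the actual nonconformity scores in ConformalDQN would come from the Q-network):

```python
# Minimal sketch of split conformal calibration: given nonconformity
# scores from a held-out calibration set, return the threshold that
# yields (1 - alpha) coverage with a finite-sample correction.
import math

def conformal_quantile(scores, alpha=0.1):
    """(1 - alpha) conformal quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample rank
    return sorted(scores)[min(k, n) - 1]
```

The resulting threshold can then wrap any point predictor into a prediction set with guaranteed marginal coverage, with no distributional assumptions on the data.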
arXiv Detail & Related papers (2024-12-17T06:55:20Z) - Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm [0.7519918949973486]
This study proposes a reinforcement learning-based personalized optimal heparin dosing policy.
A batch-constrained policy was implemented to minimize out-of-distribution errors in an offline RL environment.
This research enhances heparin administration practices and establishes a precedent for the development of sophisticated decision-support tools in medicine.
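The batch-constrained idea mentioned above can be sketched in a few lines: restrict the greedy action to actions actually observed in the offline data for a given (discretized) state, so the policy never extrapolates to out-of-distribution doses. All names and the tabular setting here are illustrative assumptions.

```python
# Hedged sketch of batch-constrained action selection for offline RL.
# A tabular stand-in for the neural version; names are illustrative.

def batch_constrained_greedy(q_values, seen_actions, state):
    """Pick the highest-Q action among those the dataset contains
    for this state; return None if the state has no support."""
    allowed = seen_actions.get(state, set())
    if not allowed:
        return None  # no in-distribution action; defer to a fallback
    return max(allowed, key=lambda a: q_values[(state, a)])
```

Constraining the argmax this way is what keeps value estimates from being propped up by actions the dosing dataset never contains.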
arXiv Detail & Related papers (2024-09-24T05:20:38Z) - Empowering Clinicians with Medical Decision Transformers: A Framework for Sepsis Treatment [5.0005174003014865]
We propose the medical decision transformer (MeDT) to solve tasks in safety-critical settings.
MeDT uses the decision transformer architecture to learn a policy for drug dosage recommendation.
MeDT captures complex dependencies among a patient's medical history, treatment decisions, outcomes, and short-term effects on stability.
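A decision-transformer-style policy conditions on a sequence of (return-to-go, state, action) tokens. The return-to-go computation, the one piece that is standard across this family of models, can be sketched as follows (rewards here are illustrative, not clinical):

```python
# Sketch of the return-to-go sequence a decision-transformer-style
# policy conditions on: rtg[t] is the sum of rewards from step t onward.

def returns_to_go(rewards):
    """Suffix sums of rewards: rtg[t] = sum(rewards[t:])."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    return rtg[::-1]
```

At inference time, conditioning on a desired target return turns trajectory modeling into goal-conditioned dosage recommendation.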
arXiv Detail & Related papers (2024-07-28T03:40:00Z) - Safe and Interpretable Estimation of Optimal Treatment Regimes [54.257304443780434]
We operationalize a safe and interpretable framework to identify optimal treatment regimes.
Our findings support personalized treatment strategies based on a patient's medical history and pharmacological features.
arXiv Detail & Related papers (2023-10-23T19:59:10Z) - Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care [46.2482873419289]
We introduce a deep Q-learning approach to obtain more reliable critical care policies.
We evaluate our method in off-policy and offline settings using simulated environments and real health records from intensive care units.
arXiv Detail & Related papers (2023-06-13T18:02:57Z) - Learning Optimal Treatment Strategies for Sepsis Using Offline Reinforcement Learning in Continuous Space [4.031538204818658]
We propose a new medical decision model based on historical data to help clinicians recommend the best reference option for real-time treatment.
Our model combines offline reinforcement learning with deep reinforcement learning to address the problem that traditional reinforcement learning in healthcare cannot interact with the environment.
arXiv Detail & Related papers (2022-06-22T16:17:21Z) - SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z) - Disentangled Counterfactual Recurrent Networks for Treatment Effect Inference over Time [71.30985926640659]
We introduce the Disentangled Counterfactual Recurrent Network (DCRN), a sequence-to-sequence architecture that estimates treatment outcomes over time.
With an architecture guided entirely by the causal structure of treatment influence over time, we improve forecast accuracy and disease understanding.
We demonstrate that DCRN outperforms current state-of-the-art methods in forecasting treatment responses, on both real and simulated data.
arXiv Detail & Related papers (2021-12-07T16:40:28Z) - Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
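A relation keypoint heatmap of the kind described above is typically a dense map with a Gaussian peak encoding a landmark's (or landmark pair's) location as a training target. A minimal sketch, where the map size and bandwidth are assumptions:

```python
# Illustrative sketch of a keypoint heatmap: a 2-D Gaussian peak at
# (cy, cx) serves as a dense regression target. Sigma is an assumption.
import math

def gaussian_heatmap(h, w, cy, cx, sigma=2.0):
    """Dense h x w heatmap with a Gaussian peak at (cy, cx)."""
    return [[math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
             for x in range(w)] for y in range(h)]
```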
arXiv Detail & Related papers (2021-11-08T07:57:30Z) - Trajectory Inspection: A Method for Iterative Clinician-Driven Design of Reinforcement Learning Studies [5.5302127686575435]
We highlight a simple approach, trajectory inspection, to bring clinicians into an iterative design process for model-based RL studies.
We identify where the model recommends unexpectedly aggressive treatments or expects surprisingly positive outcomes from its recommendations.
arXiv Detail & Related papers (2020-10-08T22:03:01Z) - Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects [61.03579766573421]
We study estimation of individual-level causal effects, such as a single patient's response to alternative medication.
We devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance.
We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances.
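The treatment-group distance these regularizers penalize can be sketched with the simplest member of the family: a linear-kernel MMD, i.e. the squared distance between the mean representations of the treated and control groups. Scalar representations and the function name are illustrative assumptions.

```python
# Sketch of the regularizer family: penalize a distance between
# treatment-group representation distributions. A linear-kernel MMD
# (squared difference of group means) stands in for the general IPM.

def linear_mmd(reps_treated, reps_control):
    """Squared distance between group means of scalar representations."""
    mu_t = sum(reps_treated) / len(reps_treated)
    mu_c = sum(reps_control) / len(reps_control)
    return (mu_t - mu_c) ** 2
```

Adding such a term to the factual prediction loss pushes the encoder toward representations under which treated and control populations look alike, which is what tightens the generalization bound on counterfactual error.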
arXiv Detail & Related papers (2020-01-21T10:16:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.