Is Deep Reinforcement Learning Ready for Practical Applications in
Healthcare? A Sensitivity Analysis of Duel-DDQN for Hemodynamic Management in
Sepsis Patients
- URL: http://arxiv.org/abs/2005.04301v2
- Date: Thu, 27 Aug 2020 14:54:03 GMT
- Title: Is Deep Reinforcement Learning Ready for Practical Applications in
Healthcare? A Sensitivity Analysis of Duel-DDQN for Hemodynamic Management in
Sepsis Patients
- Authors: MingYu Lu and Zachary Shahn and Daby Sow and Finale Doshi-Velez and
Li-wei H. Lehman
- Abstract summary: We perform a sensitivity analysis on a state-of-the-art RL algorithm applied to hemodynamic stabilization treatment strategies for septic patients in the ICU.
We consider sensitivity of learned policies to input features, embedding model architecture, time discretization, reward function, and random seeds.
We find that varying these settings can significantly impact learned policies, which suggests a need for caution when interpreting RL agent output.
- Score: 25.71979754918741
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The potential of Reinforcement Learning (RL) has been demonstrated through
successful applications to games such as Go and Atari. However, while it is
straightforward to evaluate the performance of an RL algorithm in a game
setting by simply using it to play the game, evaluation is a major challenge in
clinical settings where it could be unsafe to follow RL policies in practice.
Thus, understanding sensitivity of RL policies to the host of decisions made
during implementation is an important step toward building the type of trust in
RL required for eventual clinical uptake. In this work, we perform a
sensitivity analysis on a state-of-the-art RL algorithm (Dueling Double Deep
Q-Networks) applied to hemodynamic stabilization treatment strategies for septic
patients in the ICU. We consider sensitivity of learned policies to input
features, embedding model architecture, time discretization, reward function,
and random seeds. We find that varying these settings can significantly impact
learned policies, which suggests a need for caution when interpreting RL agent
output.
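As a concrete illustration of one axis of this sensitivity analysis, the sketch below trains a Dueling Double DQN on a fixed synthetic offline batch under several random seeds and measures how often the resulting greedy policies agree. This is a minimal sketch, not the authors' implementation: the state/action dimensions, network sizes, training loop, and synthetic data are all assumptions made for the example.

```python
# Minimal seed-sensitivity sketch for a Dueling Double DQN (illustrative only).
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, GAMMA = 16, 5, 0.99  # assumed dimensions, not the paper's

class DuelingQNet(nn.Module):
    """Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU())
        self.value = nn.Linear(64, 1)              # state-value stream
        self.advantage = nn.Linear(64, N_ACTIONS)  # advantage stream
    def forward(self, s):
        h = self.encoder(s)
        a = self.advantage(h)
        return self.value(h) + a - a.mean(dim=1, keepdim=True)

def train_policy(seed, batch, steps=500):
    """Fit a Dueling Double DQN on a fixed offline batch under one seed."""
    torch.manual_seed(seed)
    online, target = DuelingQNet(), DuelingQNet()
    target.load_state_dict(online.state_dict())
    opt = torch.optim.Adam(online.parameters(), lr=1e-3)
    s, a, r, s_next, done = batch
    for t in range(steps):
        with torch.no_grad():
            # Double DQN: the online net selects the action, the target net evaluates it.
            next_a = online(s_next).argmax(dim=1, keepdim=True)
            q_next = target(s_next).gather(1, next_a).squeeze(1)
            y = r + GAMMA * (1.0 - done) * q_next
        q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
        loss = nn.functional.smooth_l1_loss(q, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        if t % 50 == 0:
            target.load_state_dict(online.state_dict())
    return online

# Synthetic offline batch standing in for the ICU cohort (purely illustrative).
torch.manual_seed(0)
N = 2048
batch = (torch.randn(N, STATE_DIM), torch.randint(0, N_ACTIONS, (N,)),
         torch.randn(N), torch.randn(N, STATE_DIM),
         torch.randint(0, 2, (N,)).float())

eval_states = torch.randn(512, STATE_DIM)
with torch.no_grad():
    policies = [train_policy(seed, batch)(eval_states).argmax(dim=1) for seed in range(3)]
for i in range(len(policies)):
    for j in range(i + 1, len(policies)):
        agreement = (policies[i] == policies[j]).float().mean().item()
        print(f"greedy-action agreement, seed {i} vs seed {j}: {agreement:.2%}")
```

Low agreement across seeds on the same data is one symptom of the sensitivity the paper reports; the same loop can be repeated while varying input features, time discretization, or the reward function.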
Related papers
- OMG-RL: Offline Model-based Guided Reward Learning for Heparin Treatment [0.4998632546280975]
This study focuses on developing a reward function that reflects the clinician's intentions.
We learn a parameterized reward function that captures the expert's intentions from limited data.
This approach can be broadly utilized not only for the heparin dosing problem but also for RL-based medication dosing tasks in general.
arXiv Detail & Related papers (2024-09-20T07:51:37Z)
- Is Value Learning Really the Main Bottleneck in Offline RL? [70.54708989409409]
We show that the choice of a policy extraction algorithm significantly affects the performance and scalability of offline RL.
We propose two simple test-time policy improvement methods and show that these methods lead to better performance.
arXiv Detail & Related papers (2024-06-13T17:07:49Z)
- Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care [46.2482873419289]
We introduce a deep Q-learning approach to obtain more reliable critical care policies.
We evaluate our method in off-policy and offline settings using simulated environments and real health records from intensive care units.
arXiv Detail & Related papers (2023-06-13T18:02:57Z)
- Decoupled Prioritized Resampling for Offline RL [120.49021589395005]
We propose Offline Prioritized Experience Replay (OPER) for offline reinforcement learning.
OPER features a class of priority functions designed to prioritize highly-rewarding transitions, making them more frequently visited during training.
We show that this class of priority functions induces an improved behavior policy, and that when constrained to this improved policy, a policy-constrained offline RL algorithm is likely to yield a better solution (a minimal sketch of this resampling idea appears after the related-papers list).
arXiv Detail & Related papers (2023-06-08T17:56:46Z)
- Quasi-optimal Reinforcement Learning with Continuous Actions [8.17049210746654]
We develop a novel quasi-optimal learning algorithm that can be easily optimized in off-policy settings.
We evaluate our algorithm with comprehensive simulated experiments and a real-world dose suggestion application on the Ohio Type 1 Diabetes dataset.
arXiv Detail & Related papers (2023-01-21T11:30:13Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation [78.17108227614928]
We propose a benchmark environment for Safe Reinforcement Learning focusing on aquatic navigation.
We consider value-based and policy-gradient Deep Reinforcement Learning (DRL) approaches.
We also propose a verification strategy that checks the behavior of the trained models over a set of desired properties.
arXiv Detail & Related papers (2021-12-16T16:53:56Z)
- pH-RL: A personalization architecture to bring reinforcement learning to health practice [6.587485396428361]
This paper presents pH-RL, a general RL architecture for personalization to bring RL to health practice.
We implement our open-source RL architecture and integrate it with the MoodBuster mobile application for mental health.
Our experimental results show that the developed policies learn to select appropriate actions consistently using only a few days' worth of data.
arXiv Detail & Related papers (2021-03-29T19:38:04Z)
- Sample-Efficient Reinforcement Learning via Counterfactual-Based Data Augmentation [15.451690870640295]
In some scenarios such as healthcare, typically only a few records are available for each patient, impeding the application of current reinforcement learning algorithms.
We propose a data-efficient RL algorithm that exploits structural causal models (SCMs) to model the state dynamics.
We show that counterfactual outcomes are identifiable under mild conditions and that Q-learning on the counterfactual-based augmented dataset converges to the optimal value function.
arXiv Detail & Related papers (2020-12-16T17:21:13Z)
- Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations [88.94162416324505]
A deep reinforcement learning (DRL) agent observes its states through observations, which may contain natural measurement errors or adversarial noises.
Since the observations deviate from the true states, they can mislead the agent into making suboptimal actions.
We show that naively applying existing techniques on improving robustness for classification tasks, like adversarial training, is ineffective for many RL tasks.
arXiv Detail & Related papers (2020-03-19T17:59:59Z)
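The prioritized-resampling idea from the OPER entry above can be illustrated with a short sketch: weight each logged transition by a priority that grows with its reward and resample minibatches proportionally, so highly-rewarding transitions are visited more often during training. The exponential priority, temperature, and toy data below are assumptions for illustration; OPER itself defines a class of priority functions rather than this specific choice.

```python
# Minimal prioritized-resampling sketch (assumption, not the OPER implementation).
import numpy as np

rng = np.random.default_rng(0)

# Toy offline dataset: rewards for 10,000 logged transitions (illustrative).
rewards = rng.normal(loc=0.0, scale=1.0, size=10_000)

# Priority: non-negative and monotone in reward; the temperature is an assumed knob.
temperature = 1.0
priorities = np.exp(rewards / temperature)
probs = priorities / priorities.sum()

# Draw a prioritized minibatch of transition indices for one training step.
batch_idx = rng.choice(len(rewards), size=256, replace=True, p=probs)
print("mean reward, full dataset vs. prioritized batch:",
      rewards.mean().round(3), rewards[batch_idx].mean().round(3))
```

The prioritized batch skews toward higher-reward transitions, which is the effect the OPER summary describes: the resampled data implicitly defines an improved behavior policy for a policy-constrained offline RL algorithm to stay close to.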