A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies
- URL: http://arxiv.org/abs/2203.13884v1
- Date: Fri, 25 Mar 2022 19:50:18 GMT
- Title: A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies
- Authors: Pramod Kaushik, Sneha Kummetha, Perusha Moodley, Raju S. Bapi
- Abstract summary: There is no consensus on what interventions work best and different patients respond very differently to the same treatment.
Deep Reinforcement Learning methods can be used to derive optimal policies for treatment strategies that mirror physician actions.
The learned policy could help clinicians in Intensive Care Units make better decisions while treating septic patients and improve the survival rate.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Sepsis is a leading cause of mortality and its treatment is very expensive.
Sepsis treatment is also very challenging because there is no consensus on what
interventions work best and different patients respond very differently to the
same treatment. Deep Reinforcement Learning methods can be used to derive
optimal policies for treatment strategies that mirror physician actions. In the
healthcare scenario, the available data is mostly collected offline with no
interaction with the environment, which necessitates the use of offline RL
techniques. The offline RL paradigm suffers from action distribution shifts,
which in turn negatively affect learning an optimal policy for treatment.
In this work, a Conservative Q-Learning (CQL) algorithm is used to mitigate
this shift, and the resulting policy comes closer to the physicians' policy
than that of conventional deep Q-learning. The learned policy could help
clinicians in Intensive Care Units make better decisions while treating septic
patients and improve the survival rate.
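To make the mitigation concrete, here is a minimal sketch of a CQL-style loss for a discrete treatment-action space, written in PyTorch. The network architecture, the names (QNet, cql_loss), and the hyperparameters (alpha, gamma) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small Q-network mapping a patient state to one Q-value per action."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)

def cql_loss(q_net, q_target, batch, gamma: float = 0.99, alpha: float = 1.0):
    """One-step TD loss plus the conservative penalty (hypothetical batch format)."""
    s, a, r, s_next, done = batch          # tensors sampled from the offline dataset
    q_all = q_net(s)                       # shape: (batch, n_actions)
    q_taken = q_all.gather(1, a.unsqueeze(1)).squeeze(1)

    with torch.no_grad():                  # standard one-step TD target
        target = r + gamma * (1 - done) * q_target(s_next).max(dim=1).values
    td_loss = nn.functional.mse_loss(q_taken, target)

    # Conservative term: push Q-values down on all actions (log-sum-exp)
    # and back up on the actions the clinicians actually took, discouraging
    # overestimation for out-of-distribution actions.
    conservative = (torch.logsumexp(q_all, dim=1) - q_taken).mean()
    return td_loss + alpha * conservative
```

The conservative term is what distinguishes this from plain deep Q-learning: it penalizes large Q-values on actions absent from the dataset, which is exactly where action distribution shift would otherwise inflate the estimates.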
Related papers
- Identifying Differential Patient Care Through Inverse Intent Inference [3.4150521058470664]
Sepsis is a life-threatening condition defined by end-organ dysfunction due to a dysregulated host response to infection.
Numerous studies have reported disparities in care across the trajectory of a patient's stay in the emergency department and intensive care unit.
arXiv Detail & Related papers (2024-11-11T21:21:32Z)
- Learning Optimal Deterministic Policies with Stochastic Policy Gradients [62.81324245896716]
Policy gradient (PG) methods are successful approaches to deal with continuous reinforcement learning (RL) problems.
In common practice, convergence (hyper)policies are learned only to deploy their deterministic version.
We show how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.
arXiv Detail & Related papers (2024-05-03T16:45:15Z)
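To make the "learn a stochastic policy, deploy its deterministic version" idea in the entry above concrete, here is a hedged sketch of a Gaussian policy whose mean is used at deployment. The class name and shapes are assumptions; the paper's tuning of the exploration level is not reproduced.

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.mean = nn.Linear(state_dim, action_dim)          # Gaussian mean
        self.log_std = nn.Parameter(torch.zeros(action_dim))  # learned exploration level

    def act(self, state: torch.Tensor, deterministic: bool = False) -> torch.Tensor:
        mu = self.mean(state)
        if deterministic:   # deployment: drop the exploration noise, act with the mean
            return mu
        return mu + self.log_std.exp() * torch.randn_like(mu)  # training: sample
```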
- Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning [57.83919813698673]
Projected Off-Policy Q-Learning (POP-QL) is a novel actor-critic algorithm that simultaneously reweights off-policy samples and constrains the policy to prevent divergence and reduce value-approximation error.
In our experiments, POP-QL not only shows competitive performance on standard benchmarks but also outperforms competing methods in tasks where the data-collection policy is significantly sub-optimal.
arXiv Detail & Related papers (2023-11-25T00:30:58Z)
- Safe and Interpretable Estimation of Optimal Treatment Regimes [54.257304443780434]
We operationalize a safe and interpretable framework to identify optimal treatment regimes.
Our findings support personalized treatment strategies based on a patient's medical history and pharmacological features.
arXiv Detail & Related papers (2023-10-23T19:59:10Z)
- Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care [46.2482873419289]
We introduce a deep Q-learning approach to obtain more reliable critical care policies.
We evaluate our method in off-policy and offline settings using simulated environments and real health records from intensive care units.
arXiv Detail & Related papers (2023-06-13T18:02:57Z)
- Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications [3.770564448216192]
We introduce a practical and theoretically grounded transition sampling approach to address action imbalance during offline RL training.
We perform extensive experiments on two real-world tasks for diabetes and sepsis treatment optimization.
Across a range of principled and clinically relevant metrics, we show that our proposed approach enables substantial improvements in expected health outcomes.
arXiv Detail & Related papers (2023-02-15T09:30:57Z)
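The "transition sampling approach to address action imbalance" in the entry above can be illustrated generically. Below is a hedged sketch of inverse-frequency sampling, a common rebalancing heuristic; the paper's actual sampler may differ, and the function name is hypothetical.

```python
import numpy as np

def balanced_indices(actions: np.ndarray, batch_size: int, rng=None) -> np.ndarray:
    """Sample transition indices so that rare actions appear more often."""
    rng = rng or np.random.default_rng()
    counts = np.bincount(actions)     # frequency of each discrete action in the dataset
    weights = 1.0 / counts[actions]   # transitions with rare actions get larger weight
    probs = weights / weights.sum()
    return rng.choice(len(actions), size=batch_size, p=probs)
```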
- Learning Optimal Treatment Strategies for Sepsis Using Offline Reinforcement Learning in Continuous Space [4.031538204818658]
We propose a new medical decision model based on historical data to help clinicians recommend the best reference option for real-time treatment.
Our model combines offline reinforcement learning with deep reinforcement learning to address the problem that traditional reinforcement learning in healthcare cannot interact with the environment.
arXiv Detail & Related papers (2022-06-22T16:17:21Z)
- Optimal discharge of patients from intensive care via a data-driven policy learning framework [58.720142291102135]
It is important that the patient discharge task addresses the nuanced trade-off between decreasing a patient's length of stay and the risk of readmission or even death following the discharge decision.
This work introduces an end-to-end general framework for capturing this trade-off to recommend optimal discharge timing decisions.
A data-driven approach is used to derive a parsimonious, discrete state space representation that captures a patient's physiological condition.
arXiv Detail & Related papers (2021-12-17T04:39:33Z)
- Curriculum Offline Imitation Learning [72.1015201041391]
Offline reinforcement learning tasks require the agent to learn from a pre-collected dataset with no further interactions with the environment.
We propose Curriculum Offline Imitation Learning (COIL), which utilizes an experience picking strategy for imitating from adaptive neighboring policies with a higher return.
On continuous control benchmarks, we compare COIL against both imitation-based and RL-based methods, showing that it not only avoids just learning a mediocre behavior on mixed datasets but is also even competitive with state-of-the-art offline RL methods.
arXiv Detail & Related papers (2021-11-03T08:02:48Z)
- Offline reinforcement learning with uncertainty for treatment strategies in sepsis [0.0]
We present a novel application of reinforcement learning in which we identify optimal recommendations for sepsis treatment from data.
Rather than a single recommendation, our method can present several treatment options.
We examine learned policies and discover that reinforcement learning is biased against aggressive intervention due to the confounding relationship between mortality and level of treatment received.
arXiv Detail & Related papers (2021-07-09T15:29:05Z)
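The entry above presents several treatment options rather than a single recommendation. One simple way to surface multiple options from a learned Q-function is a top-k ranking, sketched below assuming a discrete action space; the paper's uncertainty-aware selection is not reproduced here, and the helper name is hypothetical.

```python
import torch

def top_k_treatments(q_net, state: torch.Tensor, k: int = 3):
    """Return the k highest-ranked actions with their Q-values."""
    with torch.no_grad():
        q_values = q_net(state.unsqueeze(0)).squeeze(0)  # one Q-value per action
    values, actions = torch.topk(q_values, k)
    return list(zip(actions.tolist(), values.tolist()))
```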
- Optimizing Medical Treatment for Sepsis in Intensive Care: from Reinforcement Learning to Pre-Trial Evaluation [2.908482270923597]
Our aim is to establish a framework where reinforcement learning (RL) of optimizing interventions retrospectively allows us a regulatory compliant pathway to prospective clinical testing of the learned policies.
We focus on infections in intensive care units, which are among the major causes of death and are difficult to treat because of the complex and opaque patient dynamics.
arXiv Detail & Related papers (2020-03-13T20:31:47Z)