OMG-RL: Offline Model-based Guided Reward Learning for Heparin Treatment
- URL: http://arxiv.org/abs/2409.13299v2
- Date: Tue, 31 Dec 2024 08:27:22 GMT
- Title: OMG-RL: Offline Model-based Guided Reward Learning for Heparin Treatment
- Authors: Yooseok Lim, Sujee Lee
- Abstract summary: We introduce Offline Model-based Guided Reward Learning (OMG-RL), which performs offline inverse RL (IRL).
We show that the OMG-RL policy is positively reinforced not only in terms of the learned reward network but also in activated partial thromboplastin time (aPTT).
This approach can be widely utilized not only for the heparin dosing problem but also for RL-based medication dosing tasks in general.
- Abstract: Accurate medication dosing holds an important position in the overall patient therapeutic process. Therefore, much research has been conducted to develop optimal administration strategies based on reinforcement learning (RL). However, relying solely on a few explicitly defined reward functions makes it difficult to learn a treatment strategy that encompasses the diverse characteristics of various patients. Moreover, the multitude of drugs utilized in clinical practice makes it infeasible to construct a dedicated reward function for each medication. Here, we aim to develop a reward network that captures clinicians' therapeutic intentions, departing from explicit rewards, and to derive an optimal heparin dosing policy. In this study, we introduce Offline Model-based Guided Reward Learning (OMG-RL), which performs offline inverse RL (IRL). Through OMG-RL, we learn a parameterized reward function that captures the expert's intentions from limited data, thereby enhancing the agent's policy. We validate the proposed approach on the heparin dosing task. We show that the OMG-RL policy is positively reinforced not only in terms of the learned reward network but also in activated partial thromboplastin time (aPTT), a key indicator for monitoring the effects of heparin. This means that the OMG-RL policy adequately reflects clinicians' intentions. This approach can be widely utilized not only for the heparin dosing problem but also for RL-based medication dosing tasks in general.
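
The abstract describes learning a parameterized reward from expert demonstrations but does not spell out the objective. As a rough illustration only, here is a minimal sketch of one common offline-IRL pattern: a discriminator-style update that scores expert transitions above transitions sampled from the current policy (in the offline setting, rolled out in a learned dynamics model). The network, function names, and logistic objective are our assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class RewardNet(nn.Module):
    """Parameterized reward r_phi(s, a); stands in for a learned reward network.
    Assumes a continuous action vector for simplicity."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def irl_reward_step(reward_net, opt, expert_batch, policy_batch):
    """One discriminator-style IRL update: push expert (state, action) pairs
    toward higher reward than pairs generated by the current policy."""
    r_expert = reward_net(*expert_batch)
    r_policy = reward_net(*policy_batch)
    # Logistic loss: classify expert vs. policy transitions via the reward.
    loss = nn.functional.softplus(-r_expert).mean() \
         + nn.functional.softplus(r_policy).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```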
Related papers
- Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm [0.7519918949973486]
This study proposes a reinforcement learning-based personalized optimal heparin dosing policy.
A batch-constrained policy was implemented to minimize out-of-distribution errors in an offline RL environment.
This research enhances heparin administration practices and establishes a precedent for the development of sophisticated decision-support tools in medicine.
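
The batch-constrained idea referenced above is, roughly, to restrict the agent to actions the logged clinicians plausibly took. A minimal sketch in the style of discrete BCQ follows; the threshold `tau` and the tensor interface are assumptions, not the paper's code.

```python
import torch

def bcq_select_action(q_values, behavior_logprobs, tau=0.3):
    """Batch-constrained action selection (discrete-BCQ-style sketch).

    Only actions the behavior policy would plausibly take (probability
    within a factor `tau` of the most likely action) are eligible, which
    limits out-of-distribution errors in offline RL. Both inputs are
    per-action tensors of shape [num_actions]."""
    probs = behavior_logprobs.exp()
    eligible = probs / probs.max() > tau           # in-distribution mask
    masked_q = q_values.masked_fill(~eligible, float("-inf"))
    return masked_q.argmax().item()
```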
arXiv Detail & Related papers (2024-09-24T05:20:38Z)
- How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
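
The summary says LLM guidance enters as a regularization factor in value-based RL. A generic sketch of that idea: regularize policy extraction toward an LLM-suggested action prior. The KL form, its closed-form softmax solution, and `beta` are our assumptions, not necessarily LINVIT's actual update.

```python
import numpy as np

def kl_regularized_policy(q_values, llm_prior, beta=1.0):
    """Policy extraction with LLM guidance as a regularizer (sketch).

    Instead of a plain argmax over Q, solve
        max_pi  E_pi[Q(s, a)] - beta * KL(pi || pi_LLM),
    whose closed form is a softmax of Q tilted by the LLM's prior.
    `llm_prior` is the LLM-suggested action distribution for this state."""
    logits = q_values / beta + np.log(llm_prior + 1e-8)
    logits -= logits.max()                         # numerical stability
    pi = np.exp(logits)
    return pi / pi.sum()
```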
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
- Large Language Model Distilling Medication Recommendation Model [58.94186280631342]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs).
Our research aims to transform existing medication recommendation methodologies using LLMs.
Because LLMs are too costly to deploy directly in this setting, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
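
Feature-level distillation, as opposed to matching output logits alone, aligns the compact model's hidden representations with the teacher's. A minimal sketch follows; the hidden sizes and the linear projection are hypothetical, not the paper's architecture.

```python
import torch
import torch.nn as nn

def feature_distill_loss(student_feats, teacher_feats, proj):
    """Feature-level distillation (sketch): align the compact student's
    hidden features with the LLM teacher's features via a learned linear
    projection, rather than matching output logits alone."""
    return nn.functional.mse_loss(proj(student_feats), teacher_feats)

# Hypothetical shapes: student hidden size 256, LLM teacher hidden size 4096.
proj = nn.Linear(256, 4096)
loss = feature_distill_loss(torch.randn(8, 256), torch.randn(8, 4096), proj)
```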
arXiv Detail & Related papers (2024-02-05T08:25:22Z)
- Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning [69.19840497497503]
It is argued that the commonly used action matching principle is more like an explanation of deep neural networks (DNNs) than the interpretation of RL agents.
We propose to take rewards, the essential objective of RL agents, as the objective for interpreting RL agents as well.
We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment.
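
One way to read the reward-consistency principle: judge an explanation mask by whether the agent still collects the same reward when it only sees the masked features. The sketch below is our loose interpretation, with an assumed environment interface (deterministic or seeded `reset`, `step` returning observation, reward, done), not the paper's method.

```python
import numpy as np

def reward_consistency(env, policy, mask, horizon=10):
    """Score an explanation mask by reward consistency (sketch): roll the
    policy out on masked observations and compare the return against an
    unmasked rollout from the same start state. A faithful mask should
    preserve the reward the agent pursues, not just reproduce its actions."""
    def rollout(m):
        obs, total = env.reset(), 0.0
        for _ in range(horizon):
            obs, reward, done = env.step(policy(obs * m))
            total += reward
            if done:
                break
        return total
    full = rollout(np.ones_like(mask))
    masked = rollout(mask)
    return -abs(full - masked)      # higher = more reward-consistent
```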
arXiv Detail & Related papers (2023-09-04T09:09:54Z)
- Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care [46.2482873419289]
We introduce a deep Q-learning approach to obtain more reliable critical care policies.
We evaluate our method in off-policy and offline settings using simulated environments and real health records from intensive care units.
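
A multi-objective, pruning-based selection rule might look like the following sketch: estimate separate Q-values for health outcome and for risk, prune actions whose risk estimate exceeds a cutoff, and maximize outcome over the survivors. The two-head split and the cutoff rule are our assumptions, not the paper's exact criterion.

```python
import torch

def reliable_action(q_outcome, q_risk, risk_cutoff):
    """Multi-objective selection (sketch): prune actions whose estimated
    risk exceeds a cutoff, then maximize the health-outcome Q-value over
    the surviving set. Both inputs are per-action tensors."""
    safe = q_risk <= risk_cutoff
    if not safe.any():                       # fall back to least risky
        return q_risk.argmin().item()
    return q_outcome.masked_fill(~safe, float("-inf")).argmax().item()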
arXiv Detail & Related papers (2023-06-13T18:02:57Z)
- Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications [3.770564448216192]
We introduce a practical and theoretically grounded transition sampling approach to address action imbalance during offline RL training.
We perform extensive experiments on two real-world tasks for diabetes and sepsis treatment optimization.
Across a range of principled and clinically relevant metrics, we show that our proposed approach enables substantial improvements in expected health outcomes.
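
One simple reading of a transition sampling approach for action imbalance is inverse-frequency weighting of logged transitions, sketched below. The weighting scheme is an assumption; the paper's sampler may differ.

```python
import numpy as np

def balanced_transition_sampler(actions, batch_size, rng=np.random):
    """Transition sampling sketch for action imbalance: weight each
    logged transition inversely to its action's frequency, so rarely
    chosen doses are seen about as often as common ones during offline
    RL training. `actions` is an integer array of logged action IDs."""
    counts = np.bincount(actions)
    weights = 1.0 / counts[actions]
    weights /= weights.sum()
    return rng.choice(len(actions), size=batch_size, p=weights)
```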
arXiv Detail & Related papers (2023-02-15T09:30:57Z)
- Policy Gradient for Reinforcement Learning with General Utilities [50.65940899590487]
In Reinforcement Learning (RL), the goal of agents is to discover an optimal policy that maximizes the expected cumulative rewards.
Many supervised and unsupervised RL problems are not covered by this linear (cumulative-reward) framework.
We derive the policy gradient theorem for RL with general utilities.
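
The summary mentions the policy gradient theorem for general utilities without stating it; the sketch below records the standard setup in our own notation (not necessarily the paper's).

```latex
% Occupancy measure of policy \pi_\theta and a (possibly nonlinear) utility F:
\lambda^{\pi_\theta}(s,a) \;=\; \sum_{t=0}^{\infty} \gamma^{t}\,
    \Pr(s_t = s,\, a_t = a \mid \pi_\theta), \qquad
J(\theta) \;=\; F\!\left(\lambda^{\pi_\theta}\right).

% Chain rule: the gradient is an ordinary policy gradient taken with respect
% to the "pseudo-reward" r = \nabla_\lambda F(\lambda^{\pi_\theta}); cumulative
% reward is recovered when F is linear, F(\lambda) = \langle r, \lambda \rangle.
\nabla_\theta J(\theta) \;=\;
    \left\langle \nabla_\lambda F\!\left(\lambda^{\pi_\theta}\right),\,
    \nabla_\theta \lambda^{\pi_\theta} \right\rangle .
```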
arXiv Detail & Related papers (2022-10-03T14:57:46Z)
- Reinforcement Learning For Survival, A Clinically Motivated Method For Critically Ill Patients [0.0]
We propose a clinically motivated control objective for critically ill patients, for which the value functions have a simple medical interpretation.
We experiment on a large cohort and show that our method produces results consistent with clinical knowledge.
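
One construction that gives value functions a simple medical interpretation is to make the only reward a terminal survival indicator, so a state's Monte Carlo value estimate is directly a (discounted) survival probability under the policy. The sketch and its data format are our assumptions, not the paper's objective.

```python
import numpy as np

def survival_value(records, gamma=1.0):
    """Sketch of a survival-style value estimate: each record is
    (steps_until_episode_end, survived) for one visit to the state of
    interest; with reward 1 only on survival, the mean discounted return
    is an estimate of the survival probability from that state."""
    returns = [(gamma ** steps) * float(survived)
               for steps, survived in records]
    return float(np.mean(returns))

# Hypothetical usage: three visits with assumed horizons and outcomes.
print(survival_value([(5, True), (2, False), (7, True)]))
```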
arXiv Detail & Related papers (2022-07-17T00:06:09Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
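
The two-policy mechanism is easy to sketch: the pre-existing guide policy acts first to reach promising states, then the learning policy takes over, with the hand-off step shrunk toward zero as the learner improves. The environment interface (`reset`, `step` returning observation, reward, done) is assumed for illustration.

```python
def jsrl_rollout(env, guide_policy, explore_policy, jump_step, max_steps=100):
    """Jump-start rollout (sketch): the guide policy controls the first
    `jump_step` steps, the exploration (learning) policy the rest.
    Training would gradually reduce `jump_step` over a curriculum."""
    obs, transitions = env.reset(), []
    for t in range(max_steps):
        policy = guide_policy if t < jump_step else explore_policy
        action = policy(obs)
        next_obs, reward, done = env.step(action)
        transitions.append((obs, action, reward, next_obs, done))
        obs = next_obs
        if done:
            break
    return transitions
```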
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Reinforcement learning and Bayesian data assimilation for model-informed precision dosing in oncology [0.0]
Current strategies comprise model-informed dosing tables or are based on maximum a posteriori estimates.
We propose three novel approaches for model-informed precision dosing (MIPD), employing Bayesian data assimilation and/or reinforcement learning to control neutropenia.
These approaches have the potential to substantially reduce the incidence of life-threatening grade 4 and subtherapeutic grade 0 neutropenia.
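
A generic MIPD loop combining the two ingredients might look like the sketch below: a particle-filter update of the patient's individual PK/PD parameters from an observed neutrophil count, then dose selection minimizing expected risk. `predict`, `forecast`, and `risk` are hypothetical vectorized model components, not the paper's algorithms.

```python
import numpy as np

def assimilate_and_dose(particles, weights, observed, predict, forecast,
                        candidate_doses, risk, noise_sd=1.0):
    """MIPD sketch (our assumptions, not the paper's method):
    1) Bayesian data assimilation: reweight a particle approximation of
       the patient's PK/PD parameters by the Gaussian likelihood of an
       observed neutrophil count.
    2) Dose selection: pick the candidate dose minimizing the expected
       risk of grade 4 (toxic) or grade 0 (subtherapeutic) neutropenia."""
    lik = np.exp(-0.5 * ((observed - predict(particles)) / noise_sd) ** 2)
    weights = weights * lik
    weights /= weights.sum()
    expected_risk = [np.sum(weights * risk(forecast(particles, d)))
                     for d in candidate_doses]
    return candidate_doses[int(np.argmin(expected_risk))], weights
```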
arXiv Detail & Related papers (2020-06-01T16:38:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.