Related papers: On the Challenges of using Reinforcement Learning in Precision Drug Dosing: Delay and Prolongedness of Action Effects

URL: http://arxiv.org/abs/2301.00512v1
Date: Mon, 2 Jan 2023 03:16:59 GMT
Title: On the Challenges of using Reinforcement Learning in Precision Drug Dosing: Delay and Prolongedness of Action Effects
Authors: Sumana Basu, Marc-André Legault, Adriana Romero-Soriano, Doina Precup
Abstract summary: Two major challenges of using RL for drug dosing are delayed and prolonged effects of administering medications. We propose a simple and effective approach to converting drug dosing PAE-POMDPs into MDPs. We validate the proposed approach on a toy task, and a challenging glucose control task, for which we devise a clinically-inspired reward function.
Score: 42.84123628139412
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Drug dosing is an important application of AI, which can be formulated as a Reinforcement Learning (RL) problem. In this paper, we identify two major challenges of using RL for drug dosing: delayed and prolonged effects of administering medications, which break the Markov assumption of the RL framework. We focus on prolongedness and define PAE-POMDP (Prolonged Action Effect-Partially Observable Markov Decision Process), a subclass of POMDPs in which the Markov assumption does not hold specifically due to prolonged effects of actions. Motivated by the pharmacology literature, we propose a simple and effective approach to converting drug dosing PAE-POMDPs into MDPs, enabling the use of the existing RL algorithms to solve such problems. We validate the proposed approach on a toy task, and a challenging glucose control task, for which we devise a clinically-inspired reward function. Our results demonstrate that: (1) the proposed method to restore the Markov assumption leads to significant improvements over a vanilla baseline; (2) the approach is competitive with recurrent policies which may inherently capture the prolonged effect of actions; (3) it is remarkably more time and memory efficient than the recurrent baseline and hence more suitable for real-time dosing control systems; and (4) it exhibits favorable qualitative behavior in our policy analysis.
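
The abstract does not spell out how the conversion works, so the sketch below is not the authors' method. It is a minimal illustration of why prolonged action effects break the Markov assumption, and of how adding the residual active drug to the state restores it. The class `ToyDosingEnv`, the geometric `decay` factor, and all numbers are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class ToyDosingEnv:
    """Hypothetical toy model, not the environment used in the paper."""
    decay: float = 0.5   # assumed fraction of active drug left after each step
    level: float = 10.0  # observed signal (a stand-in for e.g. blood glucose)
    active: float = 0.0  # hidden: drug still acting from earlier doses

    def step(self, dose: float) -> float:
        # Earlier doses keep acting: the active amount decays, then the new dose is added.
        self.active = self.decay * self.active + dose
        self.level -= self.active
        return self.level

    def augmented_state(self) -> tuple[float, float]:
        # Observation plus residual active drug: enough to predict the next step.
        return (self.level, self.active)


def run(doses: list[float]) -> ToyDosingEnv:
    """Apply a dosing history to a fresh environment."""
    env = ToyDosingEnv()
    for dose in doses:
        env.step(dose)
    return env


# Two dosing histories that end at the same observed level...
env_a = run([2.0, 0.0])
env_b = run([0.0, 3.0])
same_observation = env_a.level == env_b.level

# ...are told apart by the augmented state...
aug_a = env_a.augmented_state()
aug_b = env_b.augmented_state()

# ...and respond differently to the same next action (no new dose).
next_a = env_a.step(0.0)
next_b = env_b.step(0.0)
```

Both histories reach the same observed level, yet the next observation differs under the same action, so the observation alone is not a Markov state. In this toy model the pair returned by `augmented_state()` is one.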

Related papers

Case-Based Reasoning Enhances the Predictive Power of LLMs in Drug-Drug Interaction [34.63988064222427]
We propose CBR-DDI, a novel framework that distills pharmacological principles from historical cases to improve DDI tasks. CBR-DDI constructs a knowledge repository by leveraging LLMs to extract pharmacological insights and graph neural networks (GNNs) to model drug associations. Extensive experiments demonstrate that CBR-DDI achieves state-of-the-art performance, with a significant 28.7% accuracy improvement.
arXiv Detail & Related papers (2025-05-29T03:20:53Z)
Physical formula enhanced multi-task learning for pharmacokinetics prediction [54.13787789006417]
A major challenge for AI-driven drug discovery is the scarcity of high-quality data. We develop a formula enhanced multi-task learning (PEMAL) method that predicts four key parameters of pharmacokinetics simultaneously. Our experiments reveal that PEMAL significantly lowers the data demand, compared to typical Graph Neural Networks.
arXiv Detail & Related papers (2024-04-16T07:42:55Z)
ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization [52.5587113539404]
We introduce a causality-aware entropy term that effectively identifies and prioritizes actions with high potential impacts for efficient exploration. Our proposed algorithm, ACE: Off-policy Actor-critic with Causality-aware Entropy regularization, demonstrates a substantial performance advantage across 29 diverse continuous control tasks.
arXiv Detail & Related papers (2024-02-22T13:22:06Z)
Modeling Path Importance for Effective Alzheimer's Disease Drug Repurposing [8.153491945775734]
We propose MPI (Modeling Path Importance), a novel network-based method for AD drug repurposing. MPI prioritizes important paths via learned node embeddings, which can effectively capture a network's rich structural information. We observe that among the top-50 ranked drugs, MPI prioritizes 20.0% more drugs with anti-AD evidence compared to the baseline.
arXiv Detail & Related papers (2023-10-23T17:24:11Z)
Zero-shot Learning of Drug Response Prediction for Preclinical Drug Screening [38.94493676651818]
We propose a zero-shot learning solution for the drug response prediction task in preclinical drug screening. Specifically, we propose a Multi-branch Multi-Source Domain Adaptation Test Enhancement Plug-in, called MSDA.
arXiv Detail & Related papers (2023-10-05T05:55:41Z)
KGML-xDTD: A Knowledge Graph-based Machine Learning Framework for Drug Treatment Prediction and Mechanism Description [11.64859287146094]
We propose KGML-xDTD: a Knowledge Graph-based Machine Learning framework for explainably predicting Drugs Treating Diseases. We leverage knowledge-and-publication based information to extract biologically meaningful "demonstration paths" as the intermediate guidance in the Graph-based Reinforcement Learning process.
arXiv Detail & Related papers (2022-11-30T17:05:22Z)
Graph Regularized Probabilistic Matrix Factorization for Drug-Drug Interactions Prediction [18.659559002642784]
Co-administration of two or more drugs simultaneously can result in adverse drug reactions. Identifying drug-drug interactions (DDIs) is necessary, especially for drug development and for repurposing old drugs. This paper presents a novel Graph Regularized Probabilistic Matrix Factorization (MF) method, which incorporates expert knowledge through a novel graph-based regularization strategy.
arXiv Detail & Related papers (2022-10-19T12:33:06Z)
SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery. Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive. Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue. We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency [61.03922379081648]
We propose an off-policy sample efficient approach that requires no adversarial training or min-max optimization. Our empirical results show that D2-Imitation is effective in achieving good sample efficiency, outperforming several off-policy extension approaches of adversarial imitation.
arXiv Detail & Related papers (2021-12-11T19:36:19Z)
Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions [80.12620331438052]
Deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features. Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets. We argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance.
arXiv Detail & Related papers (2020-06-25T08:46:37Z)
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning [40.94323379769606]
We introduce the notion of action persistence that consists in the repetition of an action for a fixed number of decision steps. We present a novel algorithm, Persistent Fitted Q-Iteration (PFQI), that extends FQI, with the goal of learning the optimal value function at a given persistence.
arXiv Detail & Related papers (2020-02-17T08:38:51Z)
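
The notion of action persistence in the last entry above (repeating an action for a fixed number of decision steps) can be sketched as a small helper. This is an illustration only, not the PFQI algorithm: `CounterEnv`, `persistent_step`, and the undiscounted reward sum are assumptions chosen for brevity.

```python
from typing import Callable, Tuple

# A base step function maps an action to (observation, reward, done).
StepFn = Callable[[int], Tuple[int, float, bool]]


class CounterEnv:
    """Hypothetical toy environment: the state is a counter that ends at `limit`."""

    def __init__(self, limit: int = 5) -> None:
        self.state = 0
        self.limit = limit

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.state += action
        return self.state, 1.0, self.state >= self.limit


def persistent_step(step: StepFn, action: int, k: int) -> Tuple[int, float, bool]:
    """Repeat `action` for up to `k` base steps and sum the rewards.

    Stops early if the episode ends. Rewards are summed without discounting,
    which is a simplification made for this sketch.
    """
    if k < 1:
        raise ValueError("persistence k must be at least 1")
    total = 0.0
    for _ in range(k):
        obs, reward, done = step(action)
        total += reward
        if done:
            break
    return obs, total, done


env = CounterEnv(limit=5)
first = persistent_step(env.step, action=1, k=3)   # three base steps
second = persistent_step(env.step, action=1, k=3)  # ends after two base steps
```

With persistence `k`, the agent makes one decision per `k` base steps, which lowers the effective control frequency without changing the underlying environment.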

This list is automatically generated from the titles and abstracts of the papers on this site.