Hybrid Control Policy for Artificial Pancreas via Ensemble Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2307.06501v2
- Date: Fri, 14 Jul 2023 00:31:47 GMT
- Title: Hybrid Control Policy for Artificial Pancreas via Ensemble Deep
Reinforcement Learning
- Authors: Wenzhou Lv, Tianyu Wu, Luolin Xiong, Liang Wu, Jian Zhou, Yang Tang,
Feng Qian
- Abstract summary: We propose a hybrid control policy for the artificial pancreas (HyCPAP) to address the challenges of closed-loop glucose control.
We conduct extensive experiments using the FDA-accepted UVA/Padova T1DM simulator.
Our approaches achieve the highest percentage of time spent in the desired euglycemic range and the lowest occurrences of hypoglycemia.
- Score: 13.783833824324333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Objective: The artificial pancreas (AP) has shown promising potential in
achieving closed-loop glucose control for individuals with type 1 diabetes
mellitus (T1DM). However, designing an effective control policy for the AP
remains challenging due to the complex physiological processes, delayed insulin
response, and inaccurate glucose measurements. While model predictive control
(MPC) offers safety and stability through the dynamic model and safety
constraints, it lacks individualization and is adversely affected by
unannounced meals. Conversely, deep reinforcement learning (DRL) provides
personalized and adaptive strategies but faces challenges with distribution
shifts and substantial data requirements. Methods: We propose a hybrid control
policy for the artificial pancreas (HyCPAP) to address the above challenges.
HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the
strengths of both policies while compensating for their respective limitations.
To facilitate faster deployment of AP systems in real-world settings, we
further incorporate meta-learning techniques into HyCPAP, leveraging previous
experience and patient-shared knowledge to enable fast adaptation to new
patients with limited available data. Results: We conduct extensive experiments
using the FDA-accepted UVA/Padova T1DM simulator across three scenarios. Our
approaches achieve the highest percentage of time spent in the desired
euglycemic range and the lowest occurrences of hypoglycemia. Conclusion: The
results clearly demonstrate the superiority of our methods for closed-loop
glucose management in individuals with T1DM. Significance: The study presents
novel control policies for AP systems, affirming the great potential of the
proposed methods for efficient closed-loop glucose control.
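The abstract describes the MPC/DRL fusion only at a high level. Below is a minimal sketch of one plausible fusion rule; the disagreement-based gating, the hypoglycemia guard threshold, and all names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def hybrid_insulin_dose(glucose_mgdl, mpc_dose, drl_doses,
                        disagreement_thresh=0.5, hypo_guard_mgdl=90.0):
    """Fuse an MPC dose with an ensemble of DRL doses (illustrative only).

    glucose_mgdl: current CGM reading (mg/dL)
    mpc_dose: insulin dose (U) proposed by the MPC policy
    drl_doses: doses proposed by each member of the DRL ensemble
    """
    drl_doses = np.asarray(drl_doses, dtype=float)
    drl_mean, drl_std = drl_doses.mean(), drl_doses.std()

    # Near hypoglycemia, defer to the more conservative of the two policies.
    if glucose_mgdl < hypo_guard_mgdl:
        return min(mpc_dose, drl_mean)

    # If the ensemble members disagree, the DRL policy is likely outside its
    # training distribution; fall back to the constraint-respecting MPC dose.
    if drl_std > disagreement_thresh:
        return mpc_dose

    # Otherwise average the two proposals.
    return 0.5 * (mpc_dose + drl_mean)

# Example: 140 mg/dL, MPC suggests 1.2 U, five ensemble members mostly agree.
print(hybrid_insulin_dose(140.0, 1.2, [1.0, 1.1, 0.9, 1.05, 1.0]))
```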
Related papers
- Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction [71.81851971324187]
This work introduces Hierarchical Preference Optimization (HPO), a novel approach to hierarchical reinforcement learning (HRL).
HPO addresses the non-stationarity and infeasible-subgoal-generation issues that arise when solving complex robotic control tasks.
Experiments on challenging robotic navigation and manipulation tasks demonstrate impressive performance of HPO, where it shows an improvement of up to 35% over the baselines.
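For a concrete sense of the preference-optimization ingredient, the sketch below implements a generic DPO-style preference loss over a preferred/rejected subgoal pair; HPO's actual hierarchical objective is more involved, and every name here is an assumption.

```python
import torch
import torch.nn.functional as F

def preference_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO-style preference loss (general form, not HPO's exact objective).

    logp_w / logp_l: policy log-probs of the preferred / rejected subgoal
    ref_logp_*: the same log-probs under a frozen reference policy
    """
    # Margin between the implied rewards of the preferred and rejected subgoal.
    logits = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -F.logsigmoid(logits).mean()

# Toy example with scalar log-probabilities.
lw, ll = torch.tensor([-1.0]), torch.tensor([-2.0])
print(preference_loss(lw, ll, torch.tensor([-1.5]), torch.tensor([-1.5])).item())
```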
arXiv Detail & Related papers (2024-11-01T04:58:40Z)
- GlucoBench: Curated List of Continuous Glucose Monitoring Datasets with Prediction Benchmarks [0.12564343689544843]
Continuous glucose monitors (CGMs) are small medical devices that measure blood glucose levels at regular intervals.
Forecasting of glucose trajectories based on CGM data holds the potential to substantially improve diabetes management.
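For a sense of the forecasting task such benchmarks score, here is a generic last-value baseline evaluated on a synthetic CGM trace; this is a common sanity check, not part of GlucoBench itself.

```python
import numpy as np

def rmse_last_value_baseline(cgm_series, horizon=6):
    """RMSE of a last-value-carried-forward forecast on a CGM series.

    With 5-minute sampling, horizon=6 corresponds to a 30-minute forecast.
    """
    cgm = np.asarray(cgm_series, dtype=float)
    preds = cgm[:-horizon]          # forecast: glucose stays where it is
    actual = cgm[horizon:]          # ground truth, `horizon` steps later
    return float(np.sqrt(np.mean((preds - actual) ** 2)))

# Synthetic glucose trace: slow oscillation around 120 mg/dL plus sensor noise.
t = np.arange(288)                  # one day at 5-minute resolution
trace = 120 + 30 * np.sin(2 * np.pi * t / 288) + np.random.normal(0, 5, t.size)
print(rmse_last_value_baseline(trace))
```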
arXiv Detail & Related papers (2024-10-08T08:01:09Z)
- Privacy Preserved Blood Glucose Level Cross-Prediction: An Asynchronous Decentralized Federated Learning Approach [13.363740869325646]
Newly diagnosed Type 1 Diabetes (T1D) patients often struggle to obtain effective Blood Glucose (BG) prediction models.
We propose "GluADFL", blood Glucose prediction by Asynchronous Decentralized Federated Learning.
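The decentralized-averaging idea behind such methods can be sketched with a simple pairwise gossip rule; GluADFL's actual asynchronous protocol and privacy mechanisms are not reproduced here, and all names below are assumptions.

```python
import numpy as np

def gossip_round(params, rng):
    """One asynchronous gossip step: a random pair of clients averages weights.

    params: list of per-client parameter vectors (one entry per patient).
    Illustrates decentralized averaging in general, not GluADFL's exact rule.
    """
    i, j = rng.choice(len(params), size=2, replace=False)
    avg = 0.5 * (params[i] + params[j])
    params[i], params[j] = avg.copy(), avg.copy()
    return params

rng = np.random.default_rng(0)
clients = [rng.normal(size=4) for _ in range(5)]   # 5 patients, toy models
for _ in range(100):                               # repeated pairwise mixing
    gossip_round(clients, rng)
print(np.std(np.stack(clients), axis=0))           # per-client spread shrinks
```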
arXiv Detail & Related papers (2024-06-21T17:57:39Z)
- Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization [55.97310586039358]
Diffusion models have garnered widespread attention in Reinforcement Learning (RL) for their powerful expressiveness and multimodality.
We propose a novel model-free, diffusion-based online RL algorithm, Q-weighted Variational Policy Optimization (QVPO).
Specifically, we introduce the Q-weighted variational loss, which is provably a tight lower bound of the policy objective in online RL under certain conditions.
We also develop an efficient behavior policy to enhance sample efficiency by reducing the variance of the diffusion policy during online interactions.
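A heavily simplified sketch of the Q-weighting idea follows, assuming a generic stochastic policy rather than QVPO's diffusion policy; the paper's variational loss itself is not reproduced here.

```python
import torch

def q_weighted_policy_loss(log_probs, q_values, temperature=1.0):
    """Simplified Q-weighted policy-extraction loss (not QVPO's exact form).

    log_probs: log pi(a_i | s) for actions sampled from the current policy
    q_values: critic estimates Q(s, a_i) for the same actions
    Higher-Q actions receive larger weight on their log-likelihood, pushing
    the policy toward high-value behaviour.
    """
    weights = torch.softmax(q_values / temperature, dim=0).detach()
    return -(weights * log_probs).sum()

logp = torch.tensor([-1.2, -0.8, -2.0], requires_grad=True)
q = torch.tensor([0.5, 1.5, -0.3])
loss = q_weighted_policy_loss(logp, q)
loss.backward()
print(loss.item(), logp.grad)
```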
arXiv Detail & Related papers (2024-05-25T10:45:46Z)
- An Improved Strategy for Blood Glucose Control Using Multi-Step Deep Reinforcement Learning [3.5757761767474876]
Blood Glucose (BG) control involves keeping an individual's BG within a healthy range through extracorporeal insulin injections.
Recent research has been devoted to exploring individualized and automated BG control approaches.
Deep Reinforcement Learning (DRL) shows potential as an emerging approach.
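The multi-step targets such methods rely on are easy to state; the sketch below computes a standard n-step return (names and values illustrative).

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Multi-step return: sum of discounted rewards plus a bootstrapped tail.

    rewards: r_t, ..., r_{t+n-1} observed along the trajectory
    bootstrap_value: critic estimate V(s_{t+n}) at the cutoff state
    Multi-step targets propagate the delayed effect of an insulin dose back
    to the action faster than one-step TD targets do.
    """
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Example: three rewards followed by a bootstrapped state value.
print(n_step_return([0.1, 0.0, -0.5], bootstrap_value=2.0))
```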
arXiv Detail & Related papers (2024-03-12T11:53:00Z)
- Nurse-in-the-Loop Artificial Intelligence for Precision Management of Type 2 Diabetes in a Clinical Trial Utilizing Transfer-Learned Predictive Digital Twin [5.521385406191426]
The study developed an online nurse-in-the-loop predictive control (ONLC) model that utilizes a predictive digital twin (PDT).
The PDT was trained on participants' self-monitoring data (weight, food logs, physical activity, glucose) from the first three months.
The ONLC provided the intervention group with individualized feedback and recommendations via text messages.
arXiv Detail & Related papers (2024-01-05T06:38:50Z)
- Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning [64.10794426777493]
Model-based reinforcement learning (RL) has demonstrated remarkable successes on a range of continuous control tasks.
Recent practices tend to distill optimized action sequences into an RL policy during the training phase.
We develop an approach that distills the results of model-based planning into the policy.
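A minimal sketch of the basic distillation step follows, regressing a reactive policy onto planner-optimized actions; the paper's theoretically guaranteed scheme goes beyond this, and all shapes and names here are assumed.

```python
import torch
import torch.nn as nn

# Distilling planner actions into a reactive policy by simple regression.
policy = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

states = torch.randn(256, 8)            # states visited by the planner
planner_actions = torch.randn(256, 2)   # optimized action sequences (targets)

for _ in range(50):
    loss = nn.functional.mse_loss(policy(states), planner_actions)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(loss.item())
```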
arXiv Detail & Related papers (2023-07-24T16:52:31Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
- Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes [1.1859913430860336]
Online reinforcement learning (RL) has been used to further enhance glucose control in diabetes devices.
This paper examines the utility of BCQ, CQL, and TD3-BC in managing the blood glucose of the 30 virtual patients available within the FDA-approved UVA/Padova glucose dynamics simulator.
Offline RL can significantly increase time in the healthy blood glucose range from 61.6 ± 0.3% to 65.3 ± 0.5% compared with the strongest state-of-the-art baseline.
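For concreteness, here is the standard TD3+BC actor loss (Fujimoto & Gu, 2021), one of the baselines examined above; the tensors and example values are illustrative.

```python
import torch

def td3_bc_actor_loss(q_pi, pi_actions, dataset_actions, alpha=2.5):
    """TD3+BC actor loss.

    q_pi: critic values Q(s, pi(s)) for the policy's actions
    pi_actions / dataset_actions: policy outputs vs. logged actions
    The behaviour-cloning term keeps the policy close to the data, which is
    what makes the method usable offline.
    """
    lam = alpha / q_pi.abs().mean().detach()        # adaptive trade-off weight
    bc = ((pi_actions - dataset_actions) ** 2).mean()
    return -lam * q_pi.mean() + bc

q = torch.tensor([10.0, 12.0, 9.5])
pi_a = torch.tensor([[0.1], [0.2], [0.0]], requires_grad=True)
data_a = torch.tensor([[0.0], [0.3], [0.1]])
print(td3_bc_actor_loss(q, pi_a, data_a).item())
```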
arXiv Detail & Related papers (2022-04-07T11:52:12Z)
- Evolutionary Stochastic Policy Distillation [139.54121001226451]
We propose a new method called Evolutionary Stochastic Policy Distillation (ESPD) to solve goal-conditioned reward sparse (GCRS) tasks.
ESPD enables a target policy to learn from a series of its variants through the technique of policy distillation (PD).
The experiments based on the MuJoCo control suite show the high learning efficiency of the proposed method.
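A toy sketch of the sample-variants-then-learn-from-the-best loop follows, simplified to parameter-space interpolation; ESPD's actual distillation step trains the target policy on the actions of its variants, and everything here is assumed for illustration.

```python
import numpy as np

def espd_style_step(theta, evaluate, rng, pop=8, sigma=0.1, lr=0.5):
    """One simplified variant-selection + distillation step (not exact ESPD).

    theta: parameters of a linear policy a = theta @ s
    evaluate: callable returning a success score for a parameter vector
    Variants are sampled around the target policy; the target is pulled
    toward the best-performing variant, mimicking distillation.
    """
    variants = [theta + sigma * rng.normal(size=theta.shape) for _ in range(pop)]
    best = max(variants, key=evaluate)
    return theta + lr * (best - theta)   # move toward the best variant

rng = np.random.default_rng(0)
target = np.zeros(3)
score = lambda th: -np.sum((th - np.array([1.0, -0.5, 0.2])) ** 2)
for _ in range(200):
    target = espd_style_step(target, score, rng)
print(target)   # approaches the optimum [1.0, -0.5, 0.2]
```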
arXiv Detail & Related papers (2020-04-27T16:19:25Z)
- Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations [88.94162416324505]
A deep reinforcement learning (DRL) agent observes its states through observations, which may contain natural measurement errors or adversarial noises.
Since the observations deviate from the true states, they can mislead the agent into making suboptimal actions.
We show that naively applying existing techniques on improving robustness for classification tasks, like adversarial training, is ineffective for many RL tasks.
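To make the threat model concrete, the sketch below applies an FGSM-style perturbation to an observation fed to a small Q-network; it illustrates the attack the paper defends against, with the network and sizes assumed.

```python
import torch
import torch.nn as nn

# A small perturbation of the *observation* (not the underlying state) that
# can change the agent's greedy action choice.
q_net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))

obs = torch.randn(1, 4, requires_grad=True)
q_values = q_net(obs)
greedy_action = q_values.argmax(dim=1).item()

# Gradient step that lowers the value estimate of the chosen action.
loss = -q_values[0, greedy_action]
loss.backward()
epsilon = 0.1
adv_obs = (obs + epsilon * obs.grad.sign()).detach()

# The greedy action may flip under the perturbed observation.
print(q_net(obs).argmax(dim=1).item(), q_net(adv_obs).argmax(dim=1).item())
```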
arXiv Detail & Related papers (2020-03-19T17:59:59Z)