Planning Multiple Epidemic Interventions with Reinforcement Learning
- URL: http://arxiv.org/abs/2301.12802v3
- Date: Wed, 7 Jun 2023 10:48:02 GMT
- Title: Planning Multiple Epidemic Interventions with Reinforcement Learning
- Authors: Anh Mai and Nikunj Gupta and Azza Abouzied and Dennis Shasha
- Abstract summary: An optimal plan will curb an epidemic with minimal loss of life, disease burden, and economic cost.
Finding an optimal plan is an intractable computational problem in realistic settings.
We apply state-of-the-art actor-critic reinforcement learning algorithms to search for plans that minimize overall costs.
- Score: 7.51289645756884
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Combating an epidemic entails finding a plan that describes when and how to
apply different interventions, such as mask-wearing mandates, vaccinations,
school or workplace closures. An optimal plan will curb an epidemic with
minimal loss of life, disease burden, and economic cost. Finding an optimal
plan is an intractable computational problem in realistic settings.
Policy-makers, however, would greatly benefit from tools that can efficiently
search for plans that minimize disease and economic costs, especially when
considering multiple possible interventions over a continuous and complex
action space given a continuous and equally complex state space. We formulate
this problem as a Markov decision process. Our formulation is unique in its
ability to represent multiple continuous interventions over any disease model
defined by ordinary differential equations. We illustrate how to effectively
apply state-of-the-art actor-critic reinforcement learning algorithms (PPO and
SAC) to search for plans that minimize overall costs. We empirically evaluate
the learning performance of these algorithms and compare their performance to
hand-crafted baselines that mimic plans constructed by policy-makers. Our
method outperforms baselines. Our work confirms the viability of a
computational approach to support policy-makers.
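The formulation described in the abstract (an ODE disease model wrapped as a Markov decision process with several continuous interventions) can be illustrated with a small sketch. This is not the authors' model or code: the SIR dynamics, the two interventions (mask mandate and vaccination intensity), the class name `SIREpidemicMDP`, and every parameter value and cost weight below are assumptions chosen only for illustration. An actor-critic algorithm such as PPO or SAC would interact with an environment of this shape through `reset` and `step`.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[float, float, float]   # population fractions (S, I, R)
Action = Tuple[float, float]         # (mask intensity, vaccination intensity), each in [0, 1]


@dataclass
class SIREpidemicMDP:
    """Toy MDP over an SIR model; all values are hypothetical."""
    beta: float = 0.3                  # assumed transmission rate per day
    gamma: float = 0.1                 # assumed recovery rate per day
    dt: float = 1.0                    # forward-Euler step in days
    max_vaccination_rate: float = 0.01 # assumed max fraction vaccinated per day
    infection_cost: float = 10.0       # assumed weight on disease burden
    mask_cost: float = 0.2             # assumed economic cost of a full mask mandate
    vaccine_cost: float = 0.5          # assumed economic cost of full vaccination effort

    def reset(self) -> State:
        # Start with 1% of the population infected.
        return (0.99, 0.01, 0.0)

    def step(self, state: State, action: Action) -> Tuple[State, float]:
        s, i, r = state
        # Clip each continuous intervention to [0, 1].
        mask, vax = (min(1.0, max(0.0, a)) for a in action)
        # Assumption: a full mask mandate halves the transmission rate.
        beta_eff = self.beta * (1.0 - 0.5 * mask)
        new_infections = beta_eff * s * i * self.dt
        recoveries = self.gamma * i * self.dt
        # Vaccination moves susceptibles directly to the removed compartment.
        vaccinated = min(max(s - new_infections, 0.0),
                         self.max_vaccination_rate * vax * self.dt)
        next_state = (s - new_infections - vaccinated,
                      i + new_infections - recoveries,
                      r + recoveries + vaccinated)
        # Reward is the negative of disease cost plus intervention cost.
        cost = (self.infection_cost * i
                + self.mask_cost * mask
                + self.vaccine_cost * vax) * self.dt
        return next_state, -cost


def rollout(env: SIREpidemicMDP,
            policy: Callable[[State], Action],
            horizon: int) -> float:
    """Return the total reward of a policy over a fixed horizon."""
    state, total = env.reset(), 0.0
    for _ in range(horizon):
        state, reward = env.step(state, policy(state))
        total += reward
    return total
```

A hand-crafted baseline in this sketch is simply a fixed function of the state, for example `rollout(SIREpidemicMDP(), lambda s: (1.0, 0.0), 100)` for a permanent mask mandate; a learned policy would replace that function.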
Related papers
- Learning Logic Specifications for Policy Guidance in POMDPs: an
Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z) - AI planning in the imagination: High-level planning on learned abstract
search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z) - Epidemic Control on a Large-Scale-Agent-Based Epidemiology Model using
Deep Deterministic Policy Gradient [0.7244731714427565]
Lockdowns, rapid vaccination programs, school closures, and economic stimulus can have positive or unintended negative consequences.
Current research to model and determine an optimal intervention automatically through round-tripping is limited by the simulation objectives, scale (a few thousand individuals), model types that are not suited for intervention studies, and the number of intervention strategies they can explore (discrete vs continuous).
We address these challenges using a Deep Deterministic Policy Gradient (DDPG) based policy optimization framework on a large-scale (100,000 individual) epidemiological agent-based simulation.
arXiv Detail & Related papers (2023-04-10T09:26:07Z) - Policy Optimization for Personalized Interventions in Behavioral Health [8.10897203067601]
Behavioral health interventions, delivered through digital platforms, have the potential to significantly improve health outcomes.
We study the problem of optimizing personalized interventions for patients to maximize a long-term outcome.
We present a new approach for this problem that we dub DecompPI, which decomposes the state space for a system of patients to the individual level.
arXiv Detail & Related papers (2023-03-21T21:42:03Z) - Evaluating COVID-19 vaccine allocation policies using Bayesian $m$-top
exploration [53.122045119395594]
We present a novel technique for evaluating vaccine allocation strategies using a multi-armed bandit framework.
$m$-top exploration allows the algorithm to learn $m$ policies for which it expects the highest utility.
We consider the Belgian COVID-19 epidemic using the individual-based model STRIDE, where we learn a set of vaccination policies.
arXiv Detail & Related papers (2023-01-30T12:22:30Z) - Nearly Optimal Latent State Decoding in Block MDPs [74.51224067640717]
In episodic Block MDPs, the decision maker has access to rich observations or contexts generated from a small number of latent states.
We are first interested in estimating the latent state decoding function based on data generated under a fixed behavior policy.
We then study the problem of learning near-optimal policies in the reward-free framework.
arXiv Detail & Related papers (2022-08-17T18:49:53Z) - Evaluating model-based planning and planner amortization for continuous
control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z) - A Reinforcement Learning Approach to the Stochastic Cutting Stock
Problem [0.0]
We propose a formulation of the cutting stock problem as a discounted infinite-horizon decision process.
An optimal solution corresponds to a policy that associates each state with a decision and minimizes the expected total cost.
arXiv Detail & Related papers (2021-09-20T14:47:54Z) - Multi-Objective Model-based Reinforcement Learning for Infectious
Disease Control [19.022696762983017]
Severe infectious diseases such as the novel coronavirus (COVID-19) pose a huge threat to public health.
Stringent control measures, such as school closures and stay-at-home orders, while having significant effects, also bring huge economic losses.
We propose a Multi-Objective Model-based Reinforcement Learning framework to facilitate data-driven decision-making and minimize the overall long-term cost.
arXiv Detail & Related papers (2020-09-09T23:55:27Z) - Hierarchical Reinforcement Learning for Automatic Disease Diagnosis [52.111516253474285]
We propose to integrate a hierarchical policy structure of two levels into the dialogue system for policy learning.
The proposed policy structure is capable of dealing with diagnosis problems involving a large number of diseases and symptoms.
arXiv Detail & Related papers (2020-04-29T15:02:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.