Optimal Epidemic Control as a Contextual Combinatorial Bandit with Budget
- URL: http://arxiv.org/abs/2106.15808v1
- Date: Wed, 30 Jun 2021 04:46:31 GMT
- Title: Optimal Epidemic Control as a Contextual Combinatorial Bandit with Budget
- Authors: Baihan Lin, Djallel Bouneffouf
- Abstract summary: In light of the COVID-19 pandemic, it is an open challenge and critical practical problem to find an optimal way to prescribe the best policies.
To solve this multi-dimensional tradeoff of exploitation and exploration, we formulate this technical challenge as a contextual combinatorial bandit problem.
The agent should generate useful intervention plans that policy makers can implement in real time to minimize both the number of daily COVID-19 cases and the stringency of the recommended interventions.
- Score: 26.49683079770031
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In light of the COVID-19 pandemic, it is an open challenge and critical
practical problem to find an optimal way to dynamically prescribe the best
policies that balance both the governmental resources and epidemic control in
different countries and regions. To solve this multi-dimensional tradeoff of
exploitation and exploration, we formulate this technical challenge as a
contextual combinatorial bandit problem that jointly optimizes a multi-criteria
reward function. Given the historical daily cases in a region and the past
intervention plans in place, the agent should generate useful intervention
plans that policy makers can implement in real time to minimize both the
number of daily COVID-19 cases and the stringency of the recommended
interventions. We prove this concept with simulations of multiple realistic
policy making scenarios.
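The abstract's formulation, a contextual combinatorial bandit whose combined action must respect a stringency budget, can be sketched in code. This is an illustrative reconstruction, not the authors' implementation: the arm structure (one stringency level per intervention dimension), the linear Thompson-sampling model, the greedy budget-filling heuristic, and all names and sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: each "arm" sets one intervention dimension (e.g. school
# closing) to a stringency level; a plan picks one level per dimension,
# with total stringency capped by a budget.
N_DIM, N_LEVELS, D = 4, 3, 5    # intervention dims, levels, context size
BUDGET = 6                       # max total stringency per plan

# One linear Thompson-sampling model per (dimension, level) arm.
A = np.stack([[np.eye(D) for _ in range(N_LEVELS)] for _ in range(N_DIM)])
b = np.zeros((N_DIM, N_LEVELS, D))

def propose_plan(context):
    """Sample a reward estimate per arm, then greedily fill the budget."""
    scores = np.empty((N_DIM, N_LEVELS))
    for i in range(N_DIM):
        for l in range(N_LEVELS):
            A_inv = np.linalg.inv(A[i, l])
            theta = rng.multivariate_normal(A_inv @ b[i, l], A_inv)
            scores[i, l] = theta @ context
    plan = np.zeros(N_DIM, dtype=int)
    spent = 0
    # Greedy: serve dimensions by best sampled score while budget allows.
    for i in np.argsort(-scores.max(axis=1)):
        level = int(scores[i].argmax())
        if spent + level <= BUDGET:
            plan[i] = level
            spent += level
    return plan

def update(context, plan, reward):
    """Standard linear-bandit posterior update for each chosen arm."""
    for i, l in enumerate(plan):
        A[i, l] += np.outer(context, context)
        b[i, l] += reward * context

# One round: context could encode recent daily case counts; the scalar
# reward would penalize both new cases and plan stringency.
context = rng.normal(size=D)
plan = propose_plan(context)
update(context, plan, reward=-1.0)
```

The multi-criteria reward from the abstract would enter through the scalar `reward`, e.g. a weighted sum of negative case counts and negative stringency; the budget constraint is what makes the action space combinatorial rather than a flat arm choice.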
Related papers
- Coordinated Pandemic Control with Large Language Model Agents as Policymaking Assistants [51.26321657927398]
We propose a large language model (LLM) multi-agent policymaking framework that supports coordinated and proactive pandemic control across regions.
By integrating real-world data, a pandemic evolution simulator, and structured inter-agent communication, our framework enables agents to jointly explore counterfactual intervention scenarios.
Compared with real-world pandemic outcomes, our approach reduces cumulative infections and deaths by up to 63.7% and 40.1%, respectively, at the individual state level.
arXiv Detail & Related papers (2026-01-14T07:59:44Z) - Thompson Exploration with Best Challenger Rule in Best Arm Identification [66.33448474838342]
We study the fixed-confidence best arm identification problem in the bandit framework.
We propose a novel policy that combines Thompson sampling with a computationally efficient approach known as the best challenger rule.
arXiv Detail & Related papers (2023-10-01T01:37:02Z) - Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parametrizing policies for reinforcement learning in high-dimensional continuous action spaces.
We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories.
We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
arXiv Detail & Related papers (2023-07-20T09:05:46Z) - Evaluating COVID-19 vaccine allocation policies using Bayesian $m$-top exploration [53.122045119395594]
We present a novel technique for evaluating vaccine allocation strategies using a multi-armed bandit framework.
$m$-top exploration allows the algorithm to learn $m$ policies for which it expects the highest utility.
We consider the Belgian COVID-19 epidemic using the individual-based model STRIDE, where we learn a set of vaccination policies.
arXiv Detail & Related papers (2023-01-30T12:22:30Z) - CAMEO: Curiosity Augmented Metropolis for Exploratory Optimal Policies [62.39667564455059]
We consider and study a distribution of optimal policies.
In experimental simulations we show that CAMEO indeed obtains policies that all solve classic control problems.
We further show that the different policies we sample present different risk profiles, corresponding to interesting practical applications in interpretability.
arXiv Detail & Related papers (2022-05-19T09:48:56Z) - Evaluation of non-pharmaceutical interventions and optimal strategies for containing the COVID-19 pandemic [14.807368322926227]
We investigate associations between policies, mobility patterns, and virus transmission.
Results highlight the power of state of emergency declaration and wearing face masks.
Our framework can be extended to inform policy makers of any country about best practices in pandemic response.
arXiv Detail & Related papers (2022-02-28T17:33:25Z) - Data-driven Optimization Model for Global Covid-19 Intervention Plans [5.565573622844362]
In the wake of COVID-19, every government huddles to find the best interventions that will reduce the number of infection cases while minimizing the economic impact.
We describe an integer programming approach to prescribe intervention plans that optimize for both the minimal number of daily new cases and economic impact.
arXiv Detail & Related papers (2021-04-16T02:56:36Z) - Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy [95.98698822755227]
We make the first attempt to study risk-sensitive deep reinforcement learning under the average reward setting with the variance risk criteria.
We propose an actor-critic algorithm that iteratively and efficiently updates the policy, the Lagrange multiplier, and the Fenchel dual variable.
arXiv Detail & Related papers (2020-12-28T05:02:26Z) - Optimal Policies for a Pandemic: A Stochastic Game Approach and a Deep Learning Algorithm [1.124958340749622]
Game theory has been an effective tool in the control of disease spread and in suggesting optimal policies at both individual and area levels.
We propose a multi-region SEIR model based on differential game theory, aiming to formulate optimal regional policies for infectious diseases.
We apply the proposed model and algorithm to study the COVID-19 pandemic in three states: New York, New Jersey, and Pennsylvania.
arXiv Detail & Related papers (2020-12-12T07:10:46Z) - Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control [19.022696762983017]
Severe infectious diseases such as the novel coronavirus (COVID-19) pose a huge threat to public health.
Stringent control measures, such as school closures and stay-at-home orders, while having significant effects, also bring huge economic losses.
We propose a Multi-Objective Model-based Reinforcement Learning framework to facilitate data-driven decision-making and minimize the overall long-term cost.
arXiv Detail & Related papers (2020-09-09T23:55:27Z) - When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes [111.69190108272133]
The coronavirus disease 2019 (COVID-19) global pandemic has led many countries to impose unprecedented lockdown measures.
Data-driven models that predict COVID-19 fatalities under different lockdown policy scenarios are essential.
This paper develops a Bayesian model for predicting the effects of COVID-19 lockdown policies in a global context.
arXiv Detail & Related papers (2020-05-13T18:21:50Z) - Variational Policy Propagation for Multi-agent Reinforcement Learning [68.26579560607597]
We propose a collaborative multi-agent reinforcement learning algorithm named variational policy propagation (VPP) to learn a joint policy through the interactions over agents.
We prove that the joint policy is a Markov Random Field under some mild conditions, which in turn reduces the policy space effectively.
We integrate variational inference as special differentiable layers in the policy, such that actions can be efficiently sampled from the Markov Random Field and the overall policy is differentiable.
arXiv Detail & Related papers (2020-04-19T15:42:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.