Game and Reference: Policy Combination Synthesis for Epidemic Prevention and Control
- URL: http://arxiv.org/abs/2403.10744v1
- Date: Sat, 16 Mar 2024 00:26:59 GMT
- Title: Game and Reference: Policy Combination Synthesis for Epidemic Prevention and Control
- Authors: Zhiyi Tan, Bingkun Bao
- Abstract summary: We present a novel Policy Combination Synthesis (PCS) model for epidemic policy-making.
To prevent extreme decisions, we introduce adversarial learning between the model-made policies and the real policies.
We also employ contrastive learning to let the model draw on experience from the best historical policies under similar scenarios.
- Score: 4.635793210136456
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, epidemic policy-making models have increasingly been used to provide references for governors on prevention and control policies against catastrophic epidemics such as SARS, H1N1, and COVID-19. Existing studies are constrained by two issues. First, previous methods develop policies based on effect evaluation; because few of the factors involved in real-world decision-making can be modeled, the output policies easily become extreme. Second, human subjectivity and cognitive limitations mean that historical policies are not always optimal for training decision models. To these ends, we present a novel Policy Combination Synthesis (PCS) model for epidemic policy-making. Specifically, to prevent extreme decisions, we introduce adversarial learning between the model-made policies and the real policies, forcing the output policies to be more human-like. On the other hand, to minimize the impact of sub-optimal historical policies, we employ contrastive learning to let the model draw on experience from the best historical policies under similar scenarios. Both the adversarial and the contrastive learning are adapted based on the comprehensive effects of the real policies, ensuring that the model always learns useful information. Extensive experiments on real-world data demonstrate the effectiveness of the proposed model.
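The abstract describes two learning signals: an adversarial term that keeps synthesized policies close to what human decision-makers would plausibly issue, and a contrastive term that pulls them toward the best historical policies under similar scenarios, with both signals modulated by the measured effect of the real policies. The sketch below is not the authors' code; it is a minimal illustration of how such a combined objective could be wired up in PyTorch, and all module names, dimensions, the InfoNCE-style contrastive loss, and the effect-based weighting are assumptions made for illustration.

```python
# Minimal, illustrative sketch (NOT the authors' released code) of combining an
# adversarial term and a contrastive term when training a policy synthesizer.
# Module names, dimensions, and the effect-based weighting are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, POLICY_DIM, HIDDEN = 32, 8, 64

class PolicyGenerator(nn.Module):
    """Maps an epidemic scenario (state features) to a policy vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, POLICY_DIM))
    def forward(self, state):
        return self.net(state)

class PolicyDiscriminator(nn.Module):
    """Scores (state, policy) pairs: real historical policy vs. model-made."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + POLICY_DIM, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, 1))
    def forward(self, state, policy):
        return self.net(torch.cat([state, policy], dim=-1))

def contrastive_loss(generated, best_policy, other_policies, temperature=0.1):
    """InfoNCE-style loss: pull the generated policy toward the best historical
    policy for a similar scenario, push it away from the other policies."""
    pos = F.cosine_similarity(generated, best_policy, dim=-1) / temperature
    neg = F.cosine_similarity(
        generated.unsqueeze(1), other_policies, dim=-1) / temperature
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1)        # [B, 1 + K]
    labels = torch.zeros(generated.size(0), dtype=torch.long)  # positive at index 0
    return F.cross_entropy(logits, labels)

# One hypothetical training step on a mini-batch of synthetic data.
gen, disc = PolicyGenerator(), PolicyDiscriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

state = torch.randn(16, STATE_DIM)               # scenario features
real_policy = torch.randn(16, POLICY_DIM)        # policy actually taken in that scenario
best_policy = torch.randn(16, POLICY_DIM)        # best historical policy under a similar scenario
other_policies = torch.randn(16, 5, POLICY_DIM)  # other (negative) historical policies
effect_weight = torch.rand(16)                   # per-sample weight from the real policy's measured effect

# Discriminator step: distinguish real policies from generated ones.
fake_policy = gen(state).detach()
d_loss = (F.binary_cross_entropy_with_logits(disc(state, real_policy),
                                              torch.ones(16, 1)) +
          F.binary_cross_entropy_with_logits(disc(state, fake_policy),
                                              torch.zeros(16, 1)))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: fool the discriminator (adversarial term, pushing policies to
# look human-like) and stay close to the best historical policy (contrastive
# term); the adversarial term is modulated by the effect-based weight.
fake_policy = gen(state)
adv = F.binary_cross_entropy_with_logits(disc(state, fake_policy),
                                         torch.ones(16, 1),
                                         reduction='none').squeeze(1)
con = contrastive_loss(fake_policy, best_policy, other_policies)
g_loss = (effect_weight * adv).mean() + con
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```

In this sketch the discriminator supplies the adversarial pressure toward human-like policies, while the contrastive term anchors the generator to the strongest precedents; how PCS actually weights and adapts the two terms is specified in the paper, not here.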
Related papers
- Conformal Off-Policy Evaluation in Markov Decision Processes [53.786439742572995]
Reinforcement Learning aims at identifying and evaluating efficient control policies from data.
Most methods for this learning task, referred to as Off-Policy Evaluation (OPE), do not come with accuracy and certainty guarantees.
We present a novel OPE method based on Conformal Prediction that outputs an interval containing the true reward of the target policy with a prescribed level of certainty.
arXiv Detail & Related papers (2023-04-05T16:45:11Z) - Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality [94.89246810243053]
This paper studies offline policy learning, which aims at utilizing observations collected a priori to learn an optimal individualized decision rule.
Existing policy learning methods rely on a uniform overlap assumption, i.e., the propensities of exploring all actions for all individual characteristics must be lower bounded.
We propose Pessimistic Policy Learning (PPL), a new algorithm that optimizes lower confidence bounds (LCBs) instead of point estimates.
arXiv Detail & Related papers (2022-12-19T22:43:08Z) - Counterfactual Learning with General Data-generating Policies [3.441021278275805]
We develop an OPE method for a class of full-support and deficient-support logging policies in contextual-bandit settings.
We prove that our method's prediction converges in probability to the true performance of a counterfactual policy as the sample size increases.
arXiv Detail & Related papers (2022-12-04T21:07:46Z) - Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes [39.94472154078338]
We propose a unified policy abstraction theory, containing three types of policy abstraction associated with policy features at different levels.
We then generalize them to three policy metrics that quantify the distance (i.e., similarity) of policies.
For the empirical study, we investigate the efficacy of the proposed policy metrics and representations, in characterizing policy difference and conveying policy generalization respectively.
arXiv Detail & Related papers (2022-09-16T03:41:50Z) - Generalizing Off-Policy Learning under Sample Selection Bias [15.733136147164032]
We propose a novel framework for learning policies that generalize to the target population.
We prove that, if the uncertainty set is well-specified, our policies generalize to the target population, as they cannot do worse than on the training data.
arXiv Detail & Related papers (2021-12-02T16:18:16Z) - Building a Foundation for Data-Driven, Interpretable, and Robust Policy Design using the AI Economist [67.08543240320756]
We show that the AI Economist framework enables effective, flexible, and interpretable policy design using two-level reinforcement learning and data-driven simulations.
We find that log-linear policies trained using RL significantly improve social welfare, based on both public health and economic outcomes, compared to past outcomes.
arXiv Detail & Related papers (2021-08-06T01:30:41Z) - Supervised Off-Policy Ranking [145.3039527243585]
Off-policy evaluation (OPE) leverages data generated by other policies to evaluate a target policy.
We propose supervised off-policy ranking that learns a policy scoring model by correctly ranking training policies with known performance.
Our method outperforms strong baseline OPE methods in terms of both rank correlation and performance gap between the truly best and the best of the ranked top three policies.
arXiv Detail & Related papers (2021-07-03T07:01:23Z) - Offline Policy Comparison under Limited Historical Agent-Environment Interactions [0.0]
We address the challenge of policy evaluation in real-world applications of reinforcement learning systems.
We propose that one should perform policy comparison, i.e., rank the policies of interest in terms of their value based on the available historical data.
arXiv Detail & Related papers (2021-06-07T19:51:00Z) - Reinforcement Learning for Optimization of COVID-19 Mitigation policies [29.4529156655747]
The year 2020 has seen the COVID-19 virus lead to one of the worst global pandemics in history.
Governments around the world are faced with the challenge of protecting public health, while keeping the economy running to the greatest extent possible.
Epidemiological models provide insight into the spread of these types of diseases and predict the effects of possible intervention policies.
arXiv Detail & Related papers (2020-10-20T18:40:15Z) - Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning [80.42316902296832]
We study the efficient off-policy evaluation of natural policies, which are defined in terms of deviations from the behavior policy.
This is a departure from the literature on off-policy evaluation where most work consider the evaluation of explicitly specified policies.
arXiv Detail & Related papers (2020-06-06T15:08:24Z) - When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes [111.69190108272133]
The coronavirus disease 2019 (COVID-19) global pandemic has led many countries to impose unprecedented lockdown measures.
Data-driven models that predict COVID-19 fatalities under different lockdown policy scenarios are essential.
This paper develops a Bayesian model for predicting the effects of COVID-19 lockdown policies in a global context.
arXiv Detail & Related papers (2020-05-13T18:21:50Z)