Policy4OOD: A Knowledge-Guided World Model for Policy Intervention Simulation against the Opioid Overdose Crisis
- URL: http://arxiv.org/abs/2602.12373v1
- Date: Thu, 12 Feb 2026 20:08:49 GMT
- Title: Policy4OOD: A Knowledge-Guided World Model for Policy Intervention Simulation against the Opioid Overdose Crisis
- Authors: Yijun Ma, Zehong Wang, Weixiang Sun, Zheyuan Zhang, Kaiwen Shi, Nitesh Chawla, Yanfang Ye
- Abstract summary: The opioid epidemic remains one of the most severe public health crises in the United States. We propose a knowledge-guided spatio-temporal world model that addresses three core challenges: what policies prescribe, where effects manifest, and when effects unfold. We show that spatial dependencies and structured policy knowledge significantly improve forecasting accuracy.
- Score: 22.203336225009778
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The opioid epidemic remains one of the most severe public health crises in the United States, yet evaluating policy interventions before implementation is difficult: multiple policies interact within a dynamic system where targeting one risk pathway may inadvertently amplify another. We argue that effective opioid policy evaluation requires three capabilities -- forecasting future outcomes under current policies, counterfactual reasoning about alternative past decisions, and optimization over candidate interventions -- and propose to unify them through world modeling. We introduce Policy4OOD, a knowledge-guided spatio-temporal world model that addresses three core challenges: what policies prescribe, where effects manifest, and when effects unfold. Policy4OOD jointly encodes policy knowledge graphs, state-level spatial dependencies, and socioeconomic time series into a policy-conditioned Transformer that forecasts future opioid outcomes. Once trained, the world model serves as a simulator: forecasting requires only a forward pass, counterfactual analysis substitutes alternative policy encodings in the historical sequence, and policy optimization employs Monte Carlo Tree Search over the learned simulator. To support this framework, we construct a state-level monthly dataset (2019--2024) integrating opioid mortality, socioeconomic indicators, and structured policy encodings. Experiments demonstrate that spatial dependencies and structured policy knowledge significantly improve forecasting accuracy, validating each architectural component and the potential of world modeling for data-driven public health decision support.
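The abstract's three simulator uses (forward-pass forecasting, counterfactual policy substitution, and Monte Carlo search over the learned model) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `WorldModel` class, the reward proxy, and the flat Monte Carlo variant of tree search are all assumptions standing in for Policy4OOD's policy-conditioned Transformer and full MCTS.

```python
import random

class WorldModel:
    """Toy stand-in for a learned simulator mapping (state, policy) -> next state.
    A real model would be a trained policy-conditioned network (assumption)."""
    def step(self, state, policy):
        # Pretend stronger policies lower the outcome; purely illustrative dynamics.
        return state - 0.1 * policy + 0.01 * state

def forecast(model, state, policies):
    """Forecasting: one forward pass per step under the given policy sequence."""
    traj = [state]
    for p in policies:
        state = model.step(state, p)
        traj.append(state)
    return traj

def counterfactual(model, state, policies, t, alt_policy):
    """Counterfactual analysis: substitute an alternative policy encoding at step t
    and re-simulate the historical sequence."""
    edited = list(policies)
    edited[t] = alt_policy
    return forecast(model, state, edited)

def policy_search(model, state, candidates, horizon=6, n_sims=200):
    """Simplified flat Monte Carlo search over the learned simulator (a stand-in
    for full MCTS): score each first action by random rollouts, return the best."""
    def rollout(first):
        s, total, p = state, 0.0, first
        for _ in range(horizon):
            s = model.step(s, p)
            total += -s  # reward proxy: lower simulated mortality is better
            p = random.choice(candidates)
        return total
    scores = {p: sum(rollout(p) for _ in range(n_sims)) / n_sims
              for p in candidates}
    return max(scores, key=scores.get)
```

Note how the counterfactual shares the prefix of the factual trajectory up to the edited step, which is exactly what substituting one policy encoding in the historical sequence should preserve.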
Related papers
- Coordinated Pandemic Control with Large Language Model Agents as Policymaking Assistants [51.26321657927398]
We propose a large language model (LLM) multi-agent policymaking framework that supports coordinated and proactive pandemic control across regions. By integrating real-world data, a pandemic evolution simulator, and structured inter-agent communication, our framework enables agents to jointly explore counterfactual intervention scenarios. Compared with real-world pandemic outcomes, our approach reduces cumulative infections and deaths by up to 63.7% and 40.1%, respectively, at the individual state level.
arXiv Detail & Related papers (2026-01-14T07:59:44Z) - LLM-Powered Social Digital Twins: A Framework for Simulating Population Behavioral Response to Policy Interventions [0.2787288702904897]
Social Digital Twins are virtual population replicas where Large Language Models serve as cognitive engines for individual agents. We instantiate this framework in the domain of pandemic response, using COVID-19 as a case study. We discuss implications for policy simulation, limitations of the approach, and directions for extending LLM-based digital twins beyond pandemic response.
arXiv Detail & Related papers (2026-01-03T13:25:33Z) - Beating the Winner's Curse via Inference-Aware Policy Optimization [26.01488014918074]
A common approach is to train a machine learning model to predict counterfactual outcomes, and then select the policy that optimizes the predicted objective value. We propose a novel strategy called inference-aware policy optimization, which modifies policy optimization to account for how the policy will be evaluated downstream.
arXiv Detail & Related papers (2025-10-20T23:28:12Z) - Game and Reference: Policy Combination Synthesis for Epidemic Prevention and Control [4.635793210136456]
We present a novel Policy Combination Synthesis (PCS) model for epidemic policy-making.
To prevent extreme decisions, we introduce adversarial learning between the model-made policies and the real policies.
We also employ contrastive learning to let the model draw on experience from the best historical policies under similar scenarios.
arXiv Detail & Related papers (2024-03-16T00:26:59Z) - Hallucinated Adversarial Control for Conservative Offline Policy Evaluation [64.94009515033984]
We study the problem of conservative off-policy evaluation (COPE) where given an offline dataset of environment interactions, we seek to obtain a (tight) lower bound on a policy's performance.
We introduce HAMBO, which builds on an uncertainty-aware learned model of the transition dynamics.
We prove that the resulting COPE estimates are valid lower bounds, and, under regularity conditions, show their convergence to the true expected return.
arXiv Detail & Related papers (2023-03-02T08:57:35Z) - POETREE: Interpretable Policy Learning with Adaptive Decision Trees [78.6363825307044]
POETREE is a novel framework for interpretable policy learning.
It builds probabilistic tree policies determining physician actions based on patients' observations and medical history.
It outperforms the state-of-the-art on real and synthetic medical datasets.
arXiv Detail & Related papers (2022-03-15T16:50:52Z) - Reinforcement Learning with Heterogeneous Data: Estimation and Inference [84.72174994749305]
We introduce the K-Heterogeneous Markov Decision Process (K-Hetero MDP) to address sequential decision problems with population heterogeneity.
We propose the Auto-Clustered Policy Evaluation (ACPE) for estimating the value of a given policy, and the Auto-Clustered Policy Iteration (ACPI) for estimating the optimal policy in a given policy class.
We present simulations to support our theoretical findings, and we conduct an empirical study on the standard MIMIC-III dataset.
arXiv Detail & Related papers (2022-01-31T20:58:47Z) - Building a Foundation for Data-Driven, Interpretable, and Robust Policy Design using the AI Economist [67.08543240320756]
We show that the AI Economist framework enables effective, flexible, and interpretable policy design using two-level reinforcement learning and data-driven simulations.
We find that log-linear policies trained using RL significantly improve social welfare, based on both public health and economic outcomes, compared to past outcomes.
arXiv Detail & Related papers (2021-08-06T01:30:41Z) - Reinforcement Learning for Optimization of COVID-19 Mitigation policies [29.4529156655747]
The year 2020 has seen the COVID-19 virus lead to one of the worst global pandemics in history.
Governments around the world are faced with the challenge of protecting public health, while keeping the economy running to the greatest extent possible.
Epidemiological models provide insight into the spread of these types of diseases and predict the effects of possible intervention policies.
arXiv Detail & Related papers (2020-10-20T18:40:15Z) - When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes [111.69190108272133]
The coronavirus disease 2019 (COVID-19) global pandemic has led many countries to impose unprecedented lockdown measures.
Data-driven models that predict COVID-19 fatalities under different lockdown policy scenarios are essential.
This paper develops a Bayesian model for predicting the effects of COVID-19 lockdown policies in a global context.
arXiv Detail & Related papers (2020-05-13T18:21:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.