Prioritizing emergency evacuations under compounding levels of
uncertainty
- URL: http://arxiv.org/abs/2210.08975v1
- Date: Fri, 30 Sep 2022 21:01:05 GMT
- Title: Prioritizing emergency evacuations under compounding levels of
uncertainty
- Authors: Lisa J. Einstein, Robert J. Moss, Mykel J. Kochenderfer
- Abstract summary: We propose and analyze a decision support tool for pre-crisis training exercises for teams preparing for civilian evacuations.
We use different classes of Markov decision processes (MDPs) to capture compounding levels of uncertainty.
We show that accounting for the compounding levels of model uncertainty incurs added complexity without improvement in policy performance.
- Score: 34.71695000650056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Well-executed emergency evacuations can save lives and reduce suffering.
However, decision makers struggle to determine optimal evacuation policies
given the chaos, uncertainty, and value judgments inherent in emergency
evacuations. We propose and analyze a decision support tool for pre-crisis
training exercises for teams preparing for civilian evacuations and explore the
tool in the case of the 2021 U.S.-led evacuation from Afghanistan. We use
different classes of Markov decision processes (MDPs) to capture compounding
levels of uncertainty in (1) the priority category of who appears next at the
gate for evacuation, (2) the distribution of priority categories at the
population level, and (3) individuals' claimed priority category. We compare
the number of people evacuated by priority status under eight heuristic
policies. The optimized MDP policy achieves the best performance compared to
all heuristic baselines. We also show that accounting for the compounding
levels of model uncertainty incurs added complexity without improvement in
policy performance. Useful heuristics can be extracted from the optimized
policies to inform human decision makers. We open-source all tools to encourage
robust dialogue about the trade-offs, limitations, and potential of integrating
algorithms into high-stakes humanitarian decision-making.
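The paper's layered MDP formulation is not reproduced here, but the core idea of optimizing gate admissions under arrival uncertainty can be illustrated with a toy finite-horizon MDP solved by backward induction. All category names, rewards, and arrival probabilities below are hypothetical placeholders, not the paper's actual parameters:

```python
# Toy evacuation-admission MDP (illustrative only; not the paper's model).
# State: (time step, seats remaining). At each step one person arrives with a
# priority category drawn from a known distribution; the action is to admit
# (earn that category's reward, use a seat) or deny (no reward, keep the seat).
REWARD    = {"P1": 3.0, "P2": 2.0, "P3": 1.0}   # hypothetical priority rewards
ARRIVAL_P = {"P1": 0.2, "P2": 0.3, "P3": 0.5}   # hypothetical arrival distribution

def solve(horizon, capacity):
    """Backward induction: returns the value function V[(t, seats)] and the
    greedy policy[(t, seats, category)] in {"admit", "deny"}."""
    V = {}
    policy = {}
    for t in range(horizon, -1, -1):
        for seats in range(capacity + 1):
            if t == horizon:
                V[(t, seats)] = 0.0  # no reward after the horizon
                continue
            expected = 0.0
            for cat, p in ARRIVAL_P.items():
                deny = V[(t + 1, seats)]
                admit = (REWARD[cat] + V[(t + 1, seats - 1)]
                         if seats > 0 else float("-inf"))
                if admit >= deny:
                    policy[(t, seats, cat)] = "admit"
                    expected += p * admit
                else:
                    policy[(t, seats, cat)] = "deny"
                    expected += p * deny
            V[(t, seats)] = expected
    return V, policy

V, policy = solve(horizon=10, capacity=3)
print(policy[(0, 3, "P1")])  # → admit
```

Even this stripped-down sketch reproduces the qualitative behavior the abstract describes: the optimal policy admits high-priority arrivals while capacity lasts and may turn away low-priority arrivals early in the horizon to reserve seats.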
Related papers
- Optimal Transport-Assisted Risk-Sensitive Q-Learning [4.14360329494344]
This paper presents a risk-sensitive Q-learning algorithm that leverages optimal transport theory to enhance agent safety.
We validate the proposed algorithm in a Gridworld environment.
arXiv Detail & Related papers (2024-06-17T17:32:25Z)
- Matchings, Predictions and Counterfactual Harm in Refugee Resettlement Processes [15.140146403589952]
Data-driven algorithms match refugees to locations, using employment rate as a measure of utility.
We develop a post-processing algorithm that, given placement decisions made by a default policy on a pool of refugees, solves an inverse matching problem.
Under these modified predictions, the optimal matching policy that maximizes predicted utility on the pool is guaranteed to be not harmful.
arXiv Detail & Related papers (2024-05-24T19:51:01Z)
- Bayesian Safe Policy Learning with Chance Constrained Optimization: Application to Military Security Assessment during the Vietnam War [0.0]
We investigate whether it would have been possible to improve a security assessment algorithm employed during the Vietnam War.
This empirical application raises several methodological challenges that frequently arise in high-stakes algorithmic decision-making.
arXiv Detail & Related papers (2023-07-17T20:59:50Z)
- Enhancing Evacuation Planning through Multi-Agent Simulation and Artificial Intelligence: Understanding Human Behavior in Hazardous Environments [0.0]
The paper employs Artificial Intelligence (AI) techniques, specifically Multi-Agent Systems (MAS), to construct a simulation model for evacuation.
The primary objective of this paper is to enhance our comprehension of how individuals react and respond during such distressing situations.
arXiv Detail & Related papers (2023-06-11T08:13:42Z)
- Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism [91.52263068880484]
We study offline Reinforcement Learning with Human Feedback (RLHF).
We aim to learn the human's underlying reward and the MDP's optimal policy from a set of trajectories induced by human choices.
RLHF is challenging for multiple reasons: large state space but limited human feedback, the bounded rationality of human decisions, and the off-policy distribution shift.
arXiv Detail & Related papers (2023-05-29T01:18:39Z)
- Sequential Fair Resource Allocation under a Markov Decision Process Framework [9.440900386313213]
We study the sequential decision-making problem of allocating a limited resource to agents that reveal their demands on arrival over a finite horizon.
We propose a new algorithm, SAFFE, that makes fair allocations with respect to the entire demands revealed over the horizon.
We show that SAFFE leads to more fair and efficient allocations and achieves close-to-optimal performance in settings with dense arrivals.
arXiv Detail & Related papers (2023-01-10T02:34:00Z)
- Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z)
- Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning.
Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z)
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or more logged datasets.
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
- Accelerating Deep Reinforcement Learning With the Aid of Partial Model: Energy-Efficient Predictive Video Streaming [97.75330397207742]
Predictive power allocation is conceived for energy-efficient video streaming over mobile networks using deep reinforcement learning.
To handle the continuous state and action spaces, we resort to the deep deterministic policy gradient (DDPG) algorithm.
Our simulation results show that the proposed policies converge to the optimal policy that is derived based on perfect large-scale channel prediction.
arXiv Detail & Related papers (2020-03-21T17:36:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.