Prioritizing emergency evacuations under compounding levels of
uncertainty
- URL: http://arxiv.org/abs/2210.08975v1
- Date: Fri, 30 Sep 2022 21:01:05 GMT
- Title: Prioritizing emergency evacuations under compounding levels of
uncertainty
- Authors: Lisa J. Einstein, Robert J. Moss, Mykel J. Kochenderfer
- Abstract summary: We propose and analyze a decision support tool for pre-crisis training exercises for teams preparing for civilian evacuations.
We use different classes of Markov decision processes (MDPs) to capture compounding levels of uncertainty.
We show that accounting for the compounding levels of model uncertainty incurs added complexity without improvement in policy performance.
- Score: 34.71695000650056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Well-executed emergency evacuations can save lives and reduce suffering.
However, decision makers struggle to determine optimal evacuation policies
given the chaos, uncertainty, and value judgments inherent in emergency
evacuations. We propose and analyze a decision support tool for pre-crisis
training exercises for teams preparing for civilian evacuations and explore the
tool in the case of the 2021 U.S.-led evacuation from Afghanistan. We use
different classes of Markov decision processes (MDPs) to capture compounding
levels of uncertainty in (1) the priority category of who appears next at the
gate for evacuation, (2) the distribution of priority categories at the
population level, and (3) individuals' claimed priority category. We compare
the number of people evacuated by priority status under eight heuristic
policies. The optimized MDP policy achieves the best performance compared to
all heuristic baselines. We also show that accounting for the compounding
levels of model uncertainty incurs added complexity without improvement in
policy performance. Useful heuristics can be extracted from the optimized
policies to inform human decision makers. We open-source all tools to encourage
robust dialogue about the trade-offs, limitations, and potential of integrating
algorithms into high-stakes humanitarian decision-making.
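The paper's layered MDP formulation is not reproduced here, but the core idea of optimizing gate admissions under arrival uncertainty can be illustrated with a toy finite-horizon MDP solved by backward induction. All category names, rewards, and arrival probabilities below are hypothetical placeholders, not the paper's actual parameters:

```python
# Toy evacuation-admission MDP (illustrative only; not the paper's model).
# State: (time step, seats remaining). At each step one person arrives with a
# priority category drawn from a known distribution; the action is to admit
# (earn that category's reward, use a seat) or deny (no reward, keep the seat).
REWARD    = {"P1": 3.0, "P2": 2.0, "P3": 1.0}   # hypothetical priority rewards
ARRIVAL_P = {"P1": 0.2, "P2": 0.3, "P3": 0.5}   # hypothetical arrival distribution

def solve(horizon, capacity):
    """Backward induction: returns the value function V[(t, seats)] and the
    greedy policy[(t, seats, category)] in {"admit", "deny"}."""
    V = {}
    policy = {}
    for t in range(horizon, -1, -1):
        for seats in range(capacity + 1):
            if t == horizon:
                V[(t, seats)] = 0.0  # no reward after the horizon
                continue
            expected = 0.0
            for cat, p in ARRIVAL_P.items():
                deny = V[(t + 1, seats)]
                admit = (REWARD[cat] + V[(t + 1, seats - 1)]
                         if seats > 0 else float("-inf"))
                if admit >= deny:
                    policy[(t, seats, cat)] = "admit"
                    expected += p * admit
                else:
                    policy[(t, seats, cat)] = "deny"
                    expected += p * deny
            V[(t, seats)] = expected
    return V, policy

V, policy = solve(horizon=10, capacity=3)
print(policy[(0, 3, "P1")])  # → admit
```

Even this stripped-down sketch reproduces the qualitative behavior the abstract describes: the optimal policy admits high-priority arrivals while capacity lasts and may turn away low-priority arrivals early in the horizon to reserve seats.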
Related papers
- Optimal Transport-Assisted Risk-Sensitive Q-Learning [4.14360329494344]
This paper presents a risk-sensitive Q-learning algorithm that leverages optimal transport theory to enhance agent safety.
We validate the proposed algorithm in a Gridworld environment.
arXiv Detail & Related papers (2024-06-17T17:32:25Z)
- Matchings, Predictions and Counterfactual Harm in Refugee Resettlement Processes [15.140146403589952]
Data-driven algorithms match refugees to locations, using employment rate as a measure of utility.
We develop a post-processing algorithm that, given placement decisions made by a default policy on a pool of refugees, solves an inverse matching problem.
Under these modified predictions, the optimal matching policy that maximizes predicted utility on the pool is guaranteed to be not harmful.
arXiv Detail & Related papers (2024-05-24T19:51:01Z)
- Bayesian Safe Policy Learning with Chance Constrained Optimization: Application to Military Security Assessment during the Vietnam War [0.0]
We investigate whether it would have been possible to improve a security assessment algorithm employed during the Vietnam War.
This empirical application raises several methodological challenges that frequently arise in high-stakes algorithmic decision-making.
arXiv Detail & Related papers (2023-07-17T20:59:50Z)
- Enhancing Evacuation Planning through Multi-Agent Simulation and Artificial Intelligence: Understanding Human Behavior in Hazardous Environments [0.0]
The paper employs Artificial Intelligence (AI) techniques, specifically Multi-Agent Systems (MAS), to construct a simulation model for evacuation.
The primary objective of this paper is to enhance our comprehension of how individuals react and respond during such distressing situations.
arXiv Detail & Related papers (2023-06-11T08:13:42Z)
- Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism [91.52263068880484]
We study offline Reinforcement Learning with Human Feedback (RLHF).
We aim to learn the human's underlying reward and the MDP's optimal policy from a set of trajectories induced by human choices.
RLHF is challenging for multiple reasons: large state space but limited human feedback, the bounded rationality of human decisions, and the off-policy distribution shift.
arXiv Detail & Related papers (2023-05-29T01:18:39Z)
- Sequential Fair Resource Allocation under a Markov Decision Process Framework [9.440900386313213]
We study the sequential decision-making problem of allocating a limited resource to agents that reveal their demands on arrival over a finite horizon.
We propose a new algorithm, SAFFE, that makes fair allocations with respect to the entire demands revealed over the horizon.
We show that SAFFE leads to more fair and efficient allocations and achieves close-to-optimal performance in settings with dense arrivals.
arXiv Detail & Related papers (2023-01-10T02:34:00Z)
- Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z)
- Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning.
Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z)
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or more logged datasets.
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
- Accelerating Deep Reinforcement Learning With the Aid of Partial Model: Energy-Efficient Predictive Video Streaming [97.75330397207742]
Predictive power allocation is conceived for energy-efficient video streaming over mobile networks using deep reinforcement learning.
To handle the continuous state and action spaces, we resort to the deep deterministic policy gradient (DDPG) algorithm.
Our simulation results show that the proposed policies converge to the optimal policy that is derived based on perfect large-scale channel prediction.
arXiv Detail & Related papers (2020-03-21T17:36:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.