Related papers: Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing

Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing

URL: http://arxiv.org/abs/2405.13205v2
Date: Sat, 8 Jun 2024 18:08:09 GMT
Title: Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing
Authors: Amutheezan Sivagnanam, Ava Pettet, Hunter Lee, Ayan Mukhopadhyay, Abhishek Dubey, Aron Laszka,
Abstract summary: An emergency responder management (ERM) system dispatches responders when it receives requests for medical aid. ERM systems can proactively reposition responders between predesignated waiting locations to cover any gaps. The state-of-the-art approach in proactive repositioning is a hierarchical approach based on spatial decomposition and online Monte Carlo tree search. We introduce a novel reinforcement learning (RL) approach, based on the same hierarchical decomposition, but replacing online search with learning.
Score: 8.293120269016834
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: An emergency responder management (ERM) system dispatches responders, such as ambulances, when it receives requests for medical aid. ERM systems can also proactively reposition responders between predesignated waiting locations to cover any gaps that arise due to the prior dispatch of responders or significant changes in the distribution of anticipated requests. Optimal repositioning is computationally challenging due to the exponential number of ways to allocate responders between locations and the uncertainty in future requests. The state-of-the-art approach in proactive repositioning is a hierarchical approach based on spatial decomposition and online Monte Carlo tree search, which may require minutes of computation for each decision in a domain where seconds can save lives. We address the issue of long decision times by introducing a novel reinforcement learning (RL) approach, based on the same hierarchical decomposition, but replacing online search with learning. To address the computational challenges posed by large, variable-dimensional, and discrete state and action spaces, we propose: (1) actor-critic based agents that incorporate transformers to handle variable-dimensional states and actions, (2) projections to fixed-dimensional observations to handle complex states, and (3) combinatorial techniques to map continuous actions to discrete allocations. We evaluate our approach using real-world data from two U.S. cities, Nashville, TN and Seattle, WA. Our experiments show that compared to the state of the art, our approach reduces computation time per decision by three orders of magnitude, while also slightly reducing average ambulance response time by 5 seconds.

Related papers

Emergency Response Inference Mapping (ERIMap): A Bayesian network-based method for dynamic observation processing [1.0998375857698495]
In emergencies, high stake decisions often have to be made under time pressure and strain. Currently, there is a lack of systematic approaches for information processing and situation assessment. We present a Bayesian network-based method called ERIMap that is tailored to the complex information-scape during emergencies.
arXiv Detail & Related papers (2024-03-11T13:40:46Z)
Auxiliary Tasks Benefit 3D Skeleton-based Human Motion Prediction [106.06256351200068]
This paper introduces a model learning framework with auxiliary tasks. In our auxiliary tasks, partial body joints' coordinates are corrupted by either masking or adding noise. We propose a novel auxiliary-adapted transformer, which can handle incomplete, corrupted motion data.
arXiv Detail & Related papers (2023-08-17T12:26:11Z)
Implicit neural representation for change detection [15.741202788959075]
Most commonly used approaches to detecting changes in point clouds are based on supervised methods. We propose an unsupervised approach that comprises two components: Implicit Neural Representation (INR) for continuous shape reconstruction and a Gaussian Mixture Model for categorising changes. We apply our method to a benchmark dataset comprising simulated LiDAR point clouds for urban sprawling.
arXiv Detail & Related papers (2023-07-28T09:26:00Z)
FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving the classification capacity for the multivariate time series classification task. It exhibits three aspects of merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strength of both transformers and convolutional networks, and (3) tacking the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
NeurAR: Neural Uncertainty for Autonomous 3D Reconstruction [64.36535692191343]
Implicit neural representations have shown compelling results in offline 3D reconstruction and also recently demonstrated the potential for online SLAM systems. This paper addresses two key challenges: 1) seeking a criterion to measure the quality of the candidate viewpoints for the view planning based on the new representations, and 2) learning the criterion from data that can generalize to different scenes instead of hand-crafting one. Our method demonstrates significant improvements on various metrics for the rendered image quality and the geometry quality of the reconstructed 3D models when compared with variants using TSDF or reconstruction without view planning.
arXiv Detail & Related papers (2022-07-22T10:05:36Z)
Action Quality Assessment with Temporal Parsing Transformer [84.1272079121699]
Action Quality Assessment (AQA) is important for action understanding and resolving the task poses unique challenges due to subtle visual differences. We propose a temporal parsing transformer to decompose the holistic feature into temporal part-level representations. Our proposed method outperforms prior work on three public AQA benchmarks by a considerable margin.
arXiv Detail & Related papers (2022-07-19T13:29:05Z)
Continuous-Time and Multi-Level Graph Representation Learning for Origin-Destination Demand Prediction [52.0977259978343]
This paper proposes a Continuous-time and Multi-level dynamic graph representation learning method for Origin-Destination demand prediction (CMOD) The state vectors keep historical transaction information and are continuously updated according to the most recently happened transactions. Experiments are conducted on two real-world datasets from Beijing Subway and New York Taxi, and the results demonstrate the superiority of our model against the state-of-the-art approaches.
arXiv Detail & Related papers (2022-06-30T03:37:50Z)
Dynamic Graph Learning Based on Hierarchical Memory for Origin-Destination Demand Prediction [12.72319550363076]
This paper provides a dynamic graph representation learning framework for OD demands prediction. In particular, a hierarchical memory updater is first proposed to maintain a time-aware representation for each node. Atemporal propagation mechanism is provided to aggregate representations of neighbor nodes along a randomtemporal route. An objective function is designed to derive the future OD demands according to the most recent node.
arXiv Detail & Related papers (2022-05-29T07:52:35Z)
An Online Approach to Solve the Dynamic Vehicle Routing Problem with Stochastic Trip Requests for Paratransit Services [5.649212162857776]
We propose a fully online approach to solve the dynamic vehicle routing problem (DVRP) It is difficult to batch paratransit requests together as they are temporally sparse. We use Monte Carlo tree search to evaluate actions for any given state.
arXiv Detail & Related papers (2022-03-28T22:15:52Z)
Averaging Spatio-temporal Signals using Optimal Transport and Soft Alignments [110.79706180350507]
We show that our proposed loss can be used to define temporal-temporal baryechecenters as Fr'teche means duality. Experiments on handwritten letters and brain imaging data confirm our theoretical findings.
arXiv Detail & Related papers (2022-03-11T09:46:22Z)
H-TD2: Hybrid Temporal Difference Learning for Adaptive Urban Taxi Dispatch [9.35511513240868]
H-TD2 is a model-free, adaptive decision-making algorithm to coordinate a large fleet of automated taxis in a dynamic urban environment. We derive a regret bound and design the trigger condition between the two behaviors to explicitly control the trade-off between computational complexity and the individual taxi policy's bounded sub-optimality. Unlike recent reinforcement learning dispatch methods, this policy estimation is adaptive and robust to out-of-training domain events.
arXiv Detail & Related papers (2021-05-05T15:42:31Z)
On Algorithmic Decision Procedures in Emergency Response Systems in Smart and Connected Communities [21.22596396400625]
Emergency Response Management (ERM) is a critical problem faced by communities across the globe. We argue that the crucial period of planning for ERM systems is not post-incident, but between incidents. We propose two partially decentralized multi-agent planning algorithms that utilizes and exploit the structure of the dispatch problem.
arXiv Detail & Related papers (2020-01-21T07:04:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.