H2-MARL: Multi-Agent Reinforcement Learning for Pareto Optimality in Hospital Capacity Strain and Human Mobility during Epidemic
- URL: http://arxiv.org/abs/2503.10907v1
- Date: Thu, 13 Mar 2025 21:40:07 GMT
- Title: H2-MARL: Multi-Agent Reinforcement Learning for Pareto Optimality in Hospital Capacity Strain and Human Mobility during Epidemic
- Authors: Xueting Luo, Hao Deng, Jihong Yang, Yao Shen, Huanhuan Guo, Zhiyuan Sun, Mingqing Liu, Jiming Wei, Shengjie Zhao
- Abstract summary: We develop a township-level infection model with online-updatable parameters to simulate disease transmission. We construct a township-level human mobility dataset containing over one billion records from four representative cities of varying scales. Experiments demonstrate that H2-MARL has the optimal dual-objective trade-off capability, which can minimize hospital capacity strain while minimizing human mobility restriction loss.
- Score: 10.359719487924108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The necessity of achieving an effective balance between minimizing the losses associated with restricting human mobility and ensuring hospital capacity has gained significant attention in the aftermath of COVID-19. Reinforcement learning (RL)-based strategies for human mobility management have recently advanced in addressing the dynamic evolution of cities and epidemics; however, they still face challenges in achieving coordinated control at the township level and adapting to cities of varying scales. To address the above issues, we propose a multi-agent RL approach that achieves Pareto optimality in managing hospital capacity and human mobility (H2-MARL), applicable across cities of different scales. We first develop a township-level infection model with online-updatable parameters to simulate disease transmission and construct a city-wide dynamic spatiotemporal epidemic simulator. On this basis, H2-MARL is designed to treat each division as an agent, with a trade-off dual-objective reward function formulated and an experience replay buffer enriched with expert knowledge built. To evaluate the effectiveness of the model, we construct a township-level human mobility dataset containing over one billion records from four representative cities of varying scales. Extensive experiments demonstrate that H2-MARL has the optimal dual-objective trade-off capability, which can minimize hospital capacity strain while minimizing human mobility restriction loss. Meanwhile, the applicability of the proposed model to epidemic control in cities of varying scales is verified, which showcases its feasibility and versatility in practical applications.
Related papers
- Towards Autonomous Micromobility through Scalable Urban Simulation [52.749987132021324]
Current micromobility depends mostly on human manual operation (in-person or remote control).
In this work, we present a scalable urban simulation solution to advance autonomous micromobility.
arXiv Detail & Related papers (2025-05-01T17:52:29Z) - Collaborative Imputation of Urban Time Series through Cross-city Meta-learning [54.438991949772145]
We propose a novel collaborative imputation paradigm leveraging meta-learned implicit neural representations (INRs). We then introduce a cross-city collaborative learning scheme through model-agnostic meta learning. Experiments on a diverse urban dataset from 20 global cities demonstrate our model's superior imputation performance and generalizability.
arXiv Detail & Related papers (2025-01-20T07:12:40Z) - ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction [6.0588503913405045]
We propose a robust approach to predict human mobility patterns called ST-MoE-BERT.
Our methodology integrates the Mixture-of-Experts architecture with the BERT model to capture complex mobility dynamics.
We demonstrate the effectiveness of the proposed model on GEO-BLEU and DTW, comparing it to several state-of-the-art methods.
arXiv Detail & Related papers (2024-10-18T00:32:18Z) - COLA: Cross-city Mobility Transformer for Human Trajectory Simulation [44.157114416533915]
We develop a Cross-city mObiLity trAnsformer (COLA) with a dedicated model-agnostic transfer framework.
COLA divides the Transformer into the private modules for city-specific characteristics and the shared modules for city-universal mobility patterns.
Comparisons against our implemented cross-city baselines demonstrate COLA's superiority and effectiveness.
arXiv Detail & Related papers (2024-03-04T07:45:29Z) - One-stop Training of Multiple Capacity Models [74.87789190840527]
We propose a novel one-stop training framework to jointly train high-capacity and low-capacity models.
Unlike knowledge distillation, where multiple capacity models are trained from scratch separately, our approach integrates supervisions from different capacity models simultaneously.
arXiv Detail & Related papers (2023-05-23T13:44:09Z) - STORM-GAN: Spatio-Temporal Meta-GAN for Cross-City Estimation of Human Mobility Responses to COVID-19 [17.611056163940404]
We make the first attempt to tackle the cross-city human mobility estimation problem through a deep meta-generative framework.
We propose a Spatio-Temporal Meta-Generative Adversarial Network (STORM-GAN) model that estimates dynamic human mobility responses.
We show that the proposed approach can greatly improve estimation performance and outperform baselines.
arXiv Detail & Related papers (2023-01-20T15:55:41Z) - Continuous Trajectory Generation Based on Two-Stage GAN [50.55181727145379]
We propose a novel two-stage generative adversarial framework to generate the continuous trajectory on the road network.
Specifically, we build the generator under the human mobility hypothesis of the A* algorithm to learn the human mobility behavior.
For the discriminator, we combine the sequential reward with the mobility yaw reward to enhance the effectiveness of the generator.
arXiv Detail & Related papers (2023-01-16T09:54:02Z) - Safety-compliant Generative Adversarial Networks for Human Trajectory Forecasting [95.82600221180415]
Human trajectory forecasting in crowds presents the challenges of modelling social interactions and outputting collision-free multimodal distributions.
We introduce SGANv2, an improved safety-compliant SGAN architecture equipped with motion-temporal interaction modelling and a transformer-based discriminator design.
arXiv Detail & Related papers (2022-09-25T15:18:56Z) - Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis [47.29528724322795]
Multimodal Sentiment Analysis (MSA) has attracted increasing attention recently.
Despite significant progress, there are still two major challenges on the way towards robust MSA.
We propose a generic and unified framework to address them, named Efficient Multimodal Transformer with Dual-Level Feature Restoration (EMT-DLFR).
arXiv Detail & Related papers (2022-08-16T08:02:30Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - Policy-Aware Mobility Model Explains the Growth of COVID-19 in Cities [36.94575237918065]
Predictions must take into account non-pharmaceutical interventions to slow the spread of coronavirus.
We show that by incorporating intra-city mobility and policy adoption into a novel metapopulation SEIR model, we can accurately predict complex COVID-19 growth patterns in U.S. cities.
arXiv Detail & Related papers (2021-02-21T07:39:17Z) - Reinforced Contact Tracing and Epidemic Intervention [8.141401074784406]
We develop an Individual-based Reinforcement Learning Epidemic Control Agent (IDRLECA) to search for smart epidemic control strategies.
IDRLECA can suppress infections at a very low level and retain more than 95% of human mobility.
arXiv Detail & Related papers (2021-02-04T08:31:48Z) - Reinforced Epidemic Control: Saving Both Lives and Economy [14.008719195238774]
We propose a solution for the life-or-economy dilemma that does not require private data.
We bypass the private-data requirement by suppressing epidemic transmission through a dynamic control on inter-regional mobility.
We develop DUal-objective Reinforcement-Learning Epidemic Control Agent (DURLECA) to search mobility-control policies.
arXiv Detail & Related papers (2020-08-04T00:44:54Z)