HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent Pathfinding
- URL: http://arxiv.org/abs/2402.15546v1
- Date: Fri, 23 Feb 2024 13:01:13 GMT
- Title: HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent Pathfinding
- Authors: Huijie Tang, Federico Berto, Zihan Ma, Chuanbo Hua, Kyuree Ahn,
Jinkyoo Park
- Abstract summary: Heuristics-Informed Multi-Agent Pathfinding (HiMAP) is a scalable approach that employs imitation learning with heuristic guidance in a decentralized manner.
- Score: 16.36594480478895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale multi-agent pathfinding (MAPF) presents significant challenges in
several areas. As systems grow in complexity with a multitude of autonomous
agents operating simultaneously, efficient and collision-free coordination
becomes paramount. Traditional algorithms often fall short in scalability,
especially in intricate scenarios. Reinforcement Learning (RL) has shown
potential to address the intricacies of MAPF; however, it also tends to
struggle with scalability, demands intricate implementations and lengthy
training, and often exhibits unstable convergence, limiting its practical
application. In this paper, we introduce Heuristics-Informed Multi-Agent
Pathfinding (HiMAP), a novel scalable approach that employs imitation learning
with heuristic guidance in a decentralized manner. We train on small-scale
instances using a heuristic policy as a teacher that maps each single
agent's observation to an action probability distribution. During
pathfinding, we adopt several inference techniques to improve performance. With
a simple training scheme and implementation, HiMAP demonstrates competitive
results in success rate and scalability among imitation-learning-only MAPF
methods, demonstrating the potential of imitation learning equipped with
inference techniques.
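As a rough illustration of this training scheme, the sketch below trains a small policy network to match a heuristic teacher's per-agent action distribution via a KL loss. The observation layout, network sizes, and five-action grid space are assumptions for the example, not the authors' implementation.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical layout for the example: each agent sees a flattened local
# field-of-view patch; the action space is {stay, up, down, left, right}.
OBS_DIM, N_ACTIONS = 9 * 9 * 3, 5

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, N_ACTIONS),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

def imitation_step(obs_batch, teacher_dist):
    """One imitation step: match the heuristic teacher's per-agent action
    distribution with a KL-divergence loss (cross-entropy also works)."""
    log_probs = F.log_softmax(policy(obs_batch), dim=-1)
    loss = F.kl_div(log_probs, teacher_dist, reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Dummy batch standing in for small-scale training instances.
obs = torch.randn(64, OBS_DIM)
teacher = F.softmax(torch.randn(64, N_ACTIONS), dim=-1)
print(imitation_step(obs, teacher))
```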
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
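A minimal sketch of the step-wise credit-assignment idea, assuming a plain REINFORCE-style objective; StepAgent's actual implicit-reward and inverse-RL machinery is not reproduced here.
```python
import torch

def stepwise_policy_gradient(logps, step_rewards, gamma=0.99):
    """Illustrative step-wise REINFORCE loss: each action's log-probability
    is weighted by the return from its own step onward, so credit is
    assigned per step rather than once per episode."""
    returns, g = [], 0.0
    for r in reversed(step_rewards):        # backward pass over the episode
        g = r + gamma * g
        returns.append(g)
    returns = torch.tensor(list(reversed(returns)))
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    return -(torch.stack(logps) * returns).sum()

# Toy episode: three actions, reward arrives only at the last step.
logps = [torch.log(torch.tensor(0.5, requires_grad=True)) for _ in range(3)]
print(stepwise_policy_gradient(logps, [0.0, 0.0, 1.0]))
```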
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale [46.35418789518417]
Multi-agent pathfinding is a challenging computational problem that typically requires finding collision-free paths for multiple agents in a shared environment.
We have created a foundation model for the MAPF problems called MAPF-GPT.
Using imitation learning, we have trained a policy on a set of sub-optimal expert trajectories; the policy can generate actions under partial observability.
We show that MAPF-GPT notably outperforms the current best-performing learnable-MAPF solvers on a diverse range of problem instances.
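This imitation-learning recipe reduces to supervised classification over expert actions. A toy behavior-cloning step is sketched below; the tokenization, model, and sizes are placeholders standing in for MAPF-GPT's transformer.
```python
import torch
import torch.nn as nn

N_TOKENS, N_ACTIONS, CTX = 64, 5, 32     # hypothetical vocabulary and context

model = nn.Sequential(nn.Embedding(N_TOKENS, 128), nn.Flatten(),
                      nn.Linear(128 * CTX, N_ACTIONS))
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

def bc_step(obs_tokens, expert_actions):
    """Supervised step on (partial observation, expert action) pairs."""
    loss = loss_fn(model(obs_tokens), expert_actions)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

obs = torch.randint(0, N_TOKENS, (16, CTX))   # tokenized local views
acts = torch.randint(0, N_ACTIONS, (16,))     # sub-optimal expert labels
print(bc_step(obs, acts))
```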
arXiv Detail & Related papers (2024-08-29T12:55:10Z) - Enabling Multi-Agent Transfer Reinforcement Learning via Scenario Independent Representation [0.7366405857677227]
Multi-Agent Reinforcement Learning (MARL) algorithms are widely adopted in tackling complex tasks that require collaboration and competition among agents.
We introduce a novel framework that enables transfer learning for MARL through unifying various state spaces into fixed-size inputs.
We show significant enhancements in multi-agent learning performance using maneuvering skills learned from other scenarios compared to agents learning from scratch.
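One simple way to realize such a unification (the paper's exact mapping may differ) is to pad or truncate every scenario's observation into a single fixed-size vector, so one policy network can be reused across scenarios.
```python
import numpy as np

def to_fixed_size(obs_vec, target_dim=128):
    """Illustrative unification of variable-size scenario observations:
    truncate if too long, zero-pad if too short."""
    v = np.asarray(obs_vec, dtype=np.float32).ravel()[:target_dim]
    out = np.zeros(target_dim, dtype=np.float32)
    out[:v.size] = v
    return out

print(to_fixed_size([1.0, 2.0, 3.0]).shape)   # (128,) regardless of input size
```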
arXiv Detail & Related papers (2024-02-13T02:48:18Z) - Traj-MAE: Masked Autoencoders for Trajectory Prediction [69.7885837428344]
Trajectory prediction has been a crucial task in building a reliable autonomous driving system by anticipating possible dangers.
We propose an efficient masked autoencoder for trajectory prediction (Traj-MAE) that better represents the complicated behaviors of agents in the driving environment.
Our experimental results in both multi-agent and single-agent settings demonstrate that Traj-MAE achieves competitive results with state-of-the-art methods.
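A toy masked-autoencoding step for trajectories is sketched below, assuming 2-D waypoints and an MLP standing in for Traj-MAE's actual encoder-decoder: random waypoints are masked out and the network is trained to reconstruct them.
```python
import torch
import torch.nn as nn

T, D = 20, 2                                   # 20 (x, y) waypoints
net = nn.Sequential(nn.Linear(T * D, 256), nn.ReLU(), nn.Linear(256, T * D))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def mae_step(traj, mask_ratio=0.5):
    """Mask random waypoints and reconstruct them; the loss is computed
    only on the masked positions, as in masked autoencoding."""
    mask = (torch.rand(traj.shape[0], T, 1) < mask_ratio).float()
    masked = traj * (1 - mask)                 # zero out masked waypoints
    recon = net(masked.flatten(1)).view_as(traj)
    loss = ((recon - traj) ** 2 * mask).sum() / mask.sum().clamp(min=1)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

print(mae_step(torch.randn(8, T, D)))
```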
arXiv Detail & Related papers (2023-03-12T16:23:27Z) - Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm to verify that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where the learning dynamics are not known.
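The sampling-based check can be illustrated on an axis-aligned box: draw points on its boundary and test that the learning dynamics never point outward there. This is a simplified stand-in for the paper's algorithms, not their implementation.
```python
import numpy as np

def points_inward(f, low, high, n_samples=1000, rng=np.random.default_rng(0)):
    """Sample points on the boundary of the candidate box [low, high]^d and
    test that the dynamics f never point outward. Passing only supports,
    not proves, that the box is a trapping region."""
    d = low.size
    for _ in range(n_samples):
        x = rng.uniform(low, high)
        k = rng.integers(d)                 # pick a coordinate / face pair
        side = rng.integers(2)
        x[k] = high[k] if side else low[k]  # project the point onto that face
        v = f(x)
        # An outward component on the chosen face means trajectories can escape.
        if (side and v[k] > 0) or (not side and v[k] < 0):
            return False
    return True

# Toy dynamics contracting toward the origin: any box around 0 traps it.
print(points_inward(lambda x: -x, np.array([-1.0, -1.0]), np.array([1.0, 1.0])))
```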
arXiv Detail & Related papers (2023-02-27T14:47:52Z) - Multi-Agent Path Finding with Prioritized Communication Learning [44.89255851944412]
We propose a PrIoritized COmmunication learning method (PICO), which incorporates the implicit planning priorities into the communication topology.
PICO achieves significantly higher success rates and lower collision rates than state-of-the-art learning-based planners in large-scale MAPF tasks.
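A hypothetical priority-based topology rule in this spirit (not PICO's learned mechanism): each agent receives messages only from higher-priority agents within its communication radius, yielding a sparse directed graph.
```python
import numpy as np

def priority_comm_topology(positions, priorities, radius=3.0):
    """Build directed edges (j -> i): agent i listens to agent j only if
    j is nearby and has strictly higher priority."""
    n = len(positions)
    edges = []
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            near = np.linalg.norm(np.subtract(positions[i], positions[j])) <= radius
            if near and priorities[j] > priorities[i]:
                edges.append((j, i))
    return edges

pos = [(0, 0), (1, 1), (5, 5)]
print(priority_comm_topology(pos, priorities=[0, 2, 1]))   # [(1, 0)]
```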
arXiv Detail & Related papers (2022-02-08T04:04:19Z) - Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
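A toy rendering of value decomposition with local rewards: per-agent utilities are trained on their own agents' TD targets rather than one shared global reward. Network sizes and the decomposition rule are placeholders, not LOMAQ's.
```python
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, N_ACTIONS = 4, 10, 5
q_nets = nn.ModuleList(
    nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
    for _ in range(N_AGENTS)
)

def local_td_loss(obs, actions, local_rewards, next_obs, gamma=0.99):
    """Sum of per-agent TD losses; each utility regresses on its own
    agent's local reward instead of a single global signal."""
    loss = 0.0
    for i, q in enumerate(q_nets):
        q_sa = q(obs[:, i]).gather(1, actions[:, i:i+1]).squeeze(1)
        with torch.no_grad():
            target = local_rewards[:, i] + gamma * q(next_obs[:, i]).max(1).values
        loss = loss + ((q_sa - target) ** 2).mean()
    return loss

B = 8
loss = local_td_loss(torch.randn(B, N_AGENTS, OBS_DIM),
                     torch.randint(0, N_ACTIONS, (B, N_AGENTS)),
                     torch.randn(B, N_AGENTS),
                     torch.randn(B, N_AGENTS, OBS_DIM))
print(loss.item())
```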
arXiv Detail & Related papers (2021-09-22T10:08:15Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
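The mean-field idea can be illustrated by summarizing a large population with its empirical state distribution, which keeps the input size constant as the number of agents grows; this is a simplified example, not the paper's parametrization.
```python
import numpy as np

def mean_field(agent_states, n_bins=10):
    """Replace the joint state of many agents by the empirical distribution
    over a discretized state space (here, states in [0, 1])."""
    hist, _ = np.histogram(agent_states, bins=n_bins, range=(0.0, 1.0))
    return hist / max(len(agent_states), 1)

print(mean_field(np.random.default_rng(0).random(1000)))   # sums to 1
```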
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analysis on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
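A much-simplified HMC-flavored loop for generating a sequence of adversarial perturbations; the paper's accumulated-momentum and acceptance steps are omitted, and the step sizes, bound eps, and toy loss are assumptions for the sketch.
```python
import torch

def hmc_adversarial_steps(x, loss_fn, eps=0.01, n_leapfrog=10, step=0.01):
    """Leapfrog-style updates that ascend the adversarial loss, yielding a
    sequence of bounded perturbed inputs rather than a single example."""
    delta = torch.zeros_like(x, requires_grad=True)
    momentum = torch.randn_like(x)
    samples = []
    for _ in range(n_leapfrog):
        loss = loss_fn(x + delta)
        grad, = torch.autograd.grad(loss, delta)
        momentum = momentum + 0.5 * step * grad      # half-step on momentum
        delta = (delta + step * momentum).clamp(-eps, eps) \
                    .detach().requires_grad_(True)   # bounded position step
        samples.append((x + delta).detach())
    return samples

# Toy target: push a point away from the origin (stands in for a model loss).
out = hmc_adversarial_steps(torch.zeros(3), lambda z: (z ** 2).sum())
print(len(out))
```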
arXiv Detail & Related papers (2020-10-15T16:07:26Z) - MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments [30.407700996710023]
This paper proposes MAPPER, a decentralized multi-agent path planning method based on evolutionary reinforcement learning for partially observable settings.
We decompose the long-range navigation task into many easier sub-tasks under the guidance of a global planner.
Our approach models dynamic obstacles' behavior with an image-based representation and trains a policy in mixed dynamic environments without a homogeneity assumption.
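The decomposition step can be illustrated by slicing a global planner's path into short-horizon sub-goals that the local learned policy reaches one at a time; this is an illustrative helper, not MAPPER's code.
```python
def subgoals_from_global_path(path, horizon=5):
    """Split a long-range path (e.g., from an A* global planner) into
    sub-goals spaced `horizon` steps apart, always keeping the final goal."""
    goals = path[horizon::horizon]
    if not goals or goals[-1] != path[-1]:
        goals.append(path[-1])
    return goals

# A toy straight-line grid path from (0, 0) to (0, 12).
path = [(0, i) for i in range(13)]
print(subgoals_from_global_path(path))       # [(0, 5), (0, 10), (0, 12)]
```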
arXiv Detail & Related papers (2020-07-30T20:14:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.