Multi-Agent Path Finding with Prioritized Communication Learning
- URL: http://arxiv.org/abs/2202.03634v2
- Date: Thu, 10 Feb 2022 06:12:18 GMT
- Title: Multi-Agent Path Finding with Prioritized Communication Learning
- Authors: Wenhao Li, Hongjun Chen, Bo Jin, Wenzhe Tan, Hongyuan Zha, Xiangfeng
Wang
- Abstract summary: We propose a PrIoritized COmmunication learning method (PICO), which incorporates the implicit planning priorities into the communication topology.
PICO achieves significantly better success and collision rates in large-scale MAPF tasks than state-of-the-art learning-based planners.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent pathfinding (MAPF) has been widely used to solve large-scale
real-world problems, e.g., automated warehouses. Learning-based, fully
decentralized frameworks have been introduced to alleviate real-time constraints
and simultaneously pursue an optimal planning policy. However, existing methods
may generate significantly more vertex conflicts (or collisions), leading to a
lower success rate or a longer makespan. In this paper, we propose a PrIoritized
COmmunication learning method (PICO), which incorporates the implicit
planning priorities into the communication topology within the decentralized
multi-agent reinforcement learning framework. Combined with classic coupled
planners, the implicit priority learning module forms a dynamic communication
topology, which also provides an effective collision-avoidance mechanism. PICO
achieves significantly better success and collision rates in large-scale MAPF
tasks than state-of-the-art learning-based planners.
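The core idea of a priority-shaped communication topology can be illustrated with a minimal sketch. This is our own illustration, not the paper's code: the `priorities` array and the "listen only to strictly higher-priority neighbors" rule are assumptions; PICO actually learns priorities implicitly inside the MARL policy.

```python
import numpy as np

def priority_comm_mask(priorities: np.ndarray, adjacency: np.ndarray) -> np.ndarray:
    """Return mask[i, j] = 1 if agent i listens to neighbor j.

    Each agent attends only to adjacent agents with strictly higher
    priority, yielding a directed, acyclic communication topology in
    which high-priority agents plan first and others plan around them.
    """
    n = len(priorities)
    mask = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j] and priorities[j] > priorities[i]:
                mask[i, j] = 1
    return mask

# Example: three mutually adjacent agents with priorities 0.9 > 0.5 > 0.1.
adj = np.ones((3, 3), dtype=int) - np.eye(3, dtype=int)
mask = priority_comm_mask(np.array([0.9, 0.5, 0.1]), adj)
# Agent 0 (highest priority) listens to no one; agent 2 listens to both.
```

Because the priorities induce a total order, the resulting directed graph is acyclic, which is what makes a sequential, conflict-avoiding planning order possible.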
Related papers
- Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback [58.049113055986375]
We develop a single-stage approach named Alignment with Integrated Human Feedback (AIHF) to train reward models and the policy.
The proposed approach admits a suite of efficient algorithms, which can easily reduce to, and leverage, popular alignment algorithms.
We demonstrate the efficiency of the proposed solutions with extensive experiments involving alignment problems in LLMs and robotic control problems in MuJoCo.
arXiv Detail & Related papers (2024-06-11T01:20:53Z) - Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding [18.06081009550052]
Multi-Agent Reinforcement Learning (MARL) based Multi-Agent Path Finding (MAPF) has recently gained attention due to its efficiency and scalability.
Several MARL-MAPF methods choose to use communication to enrich the information one agent can perceive.
We propose a new method, Ensembling Prioritized Hybrid Policies (EPH).
arXiv Detail & Related papers (2024-03-12T11:47:12Z) - HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent
Pathfinding [16.36594480478895]
Heuristics-Informed Multi-Agent Pathfinding (HiMAP)
arXiv Detail & Related papers (2024-02-23T13:01:13Z) - Imitation Learning based Alternative Multi-Agent Proximal Policy
Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance [15.498559530889839]
In this paper, we put forward an imitation-learning-based Alternative Multi-Agent Proximal Policy Optimization (IA-MAPPO) algorithm to execute the pursuit-avoidance task in a well-formed swarm.
We utilize imitation learning to decentralize the formation controller, so as to reduce the communication overheads and enhance the scalability.
The simulation results validate the effectiveness of IA-MAPPO, and extensive ablation experiments further show performance comparable to a centralized solution with a significant decrease in communication overhead.
arXiv Detail & Related papers (2023-11-06T06:58:16Z) - Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment [12.122881147337505]
We propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning.
In our approach, agents communicate only with the centralized planner to make decentralized decisions online.
We conduct multi-step value convergence in multi-agent reinforcement learning to enhance the training efficiency.
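Multi-step value convergence typically means bootstrapping value targets over n steps rather than one. A minimal sketch of the standard n-step return, as our own illustration of the general technique rather than that paper's implementation:

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99, n=3):
    """Compute the n-step return:

        G_t = sum_{k=0}^{n-1} gamma^k * r_{t+k} + gamma^n * V(s_{t+n})

    `rewards` holds the first n rewards after time t; `bootstrap_value`
    is the critic's estimate V(s_{t+n}).
    """
    steps = rewards[:n]
    g = 0.0
    for k, r in enumerate(steps):
        g += (gamma ** k) * r          # discounted observed rewards
    g += (gamma ** len(steps)) * bootstrap_value  # bootstrapped tail
    return g

# With gamma=0.5, rewards [1, 1, 1], and bootstrap value 4:
# 1 + 0.5 + 0.25 + 0.125 * 4 = 2.25
```

Propagating reward information n steps per update, instead of one, is what speeds up value convergence and hence training efficiency.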
arXiv Detail & Related papers (2023-10-25T14:21:22Z) - Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via
Planning and Learning [46.354187895184154]
The Multi-agent Pathfinding (MAPF) problem asks for a set of conflict-free paths for a set of agents confined to a graph.
In this work, we investigate the decentralized MAPF setting, in which a central controller that possesses all information on the agents' locations and goals is absent.
We focus on the practically important lifelong variant of MAPF, which involves continuously assigning new goals to agents upon arrival at their previous ones.
arXiv Detail & Related papers (2023-10-02T13:51:32Z) - Accelerating Federated Edge Learning via Optimized Probabilistic Device
Scheduling [57.271494741212166]
This paper formulates and solves the communication time minimization problem.
It is found that the optimized policy gradually turns its priority from suppressing the remaining communication rounds to reducing per-round latency as the training process evolves.
The effectiveness of the proposed scheme is demonstrated via a use case on collaborative 3D object detection in autonomous driving.
arXiv Detail & Related papers (2021-07-24T11:39:17Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.