Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via
Planning and Learning
- URL: http://arxiv.org/abs/2310.01207v1
- Date: Mon, 2 Oct 2023 13:51:32 GMT
- Title: Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via
Planning and Learning
- Authors: Alexey Skrynnik, Anton Andreychuk, Maria Nesterova, Konstantin
Yakovlev, Aleksandr Panov
- Abstract summary: The Multi-agent Pathfinding (MAPF) problem asks to find a set of conflict-free paths for a set of agents confined to a graph.
In this work, we investigate the decentralized MAPF setting, in which the central controller that possesses all the information on the agents' locations and goals is absent.
We focus on the practically important lifelong variant of MAPF, which involves continuously assigning new goals to the agents upon arrival at the previous ones.
- Score: 46.354187895184154
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Multi-agent Pathfinding (MAPF) problem asks to find a set of
conflict-free paths for a set of agents confined to a graph and is typically
solved in a centralized fashion.
Conversely, in this work, we investigate the decentralized MAPF setting, in
which the central controller that possesses all the information on the agents'
locations and goals is absent and the agents have to sequentially decide their
actions on their own, without access to the full state of the environment. We
focus on the practically important lifelong variant of MAPF, which involves
continuously assigning new goals to the agents upon arrival at the previous
ones. To address
this complex problem, we propose a method that integrates two complementary
approaches: planning with heuristic search and reinforcement learning through
policy optimization. Planning is utilized to construct and re-plan individual
paths. We enhance our planning algorithm with a dedicated technique tailored to
avoid congestion and increase the throughput of the system. We employ
reinforcement learning to discover the collision avoidance policies that
effectively guide the agents along the paths. The policy is implemented as a
neural network and is effectively trained without any reward-shaping or
external guidance.
We evaluate our method on a wide range of setups, comparing it to
state-of-the-art solvers. The results show that our method consistently
outperforms the learnable competitors, achieving higher throughput and better
generalization to maps that were unseen at the training stage.
Moreover, our solver outperforms a rule-based one in terms of throughput and is
an order of magnitude faster than a state-of-the-art search-based solver.
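To make the two components concrete, below is a minimal, illustrative Python sketch of such a decentralized plan-and-follow loop: each agent plans its individual path with A* while ignoring the other agents, follows it step by step as long as a collision-avoidance rule permits, and receives a new goal upon reaching the current one. All names (`a_star`, `follow_policy`, `lifelong_step`) and representations here are assumptions for illustration only; in the paper the follower is a trained neural policy and the planner additionally avoids congestion, both of which this sketch replaces with trivial stand-ins.

```python
# Illustrative sketch (not the authors' code): decentralized lifelong MAPF via
# individual A* planning plus a stand-in for a learned collision-avoidance policy.
import heapq

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # 4-connected grid moves

def a_star(free_cells, start, goal):
    """Plan an individual path for one agent, ignoring all other agents."""
    def h(c):  # Manhattan-distance heuristic
        return abs(c[0] - goal[0]) + abs(c[1] - goal[1])
    open_heap = [(h(start), 0, start, None)]
    parent, best_g = {}, {start: 0}
    while open_heap:
        _, g, cell, par = heapq.heappop(open_heap)
        if cell in parent:
            continue  # already expanded with a better cost
        parent[cell] = par
        if cell == goal:  # reconstruct the start -> goal path
            path = [cell]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for dx, dy in MOVES:
            nxt = (cell[0] + dx, cell[1] + dy)
            if nxt in free_cells and g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(open_heap, (g + 1 + h(nxt), g + 1, nxt, cell))
    return None  # goal unreachable

def follow_policy(next_cell, occupied):
    """Stand-in for the learned follower: advance if the next cell is free,
    otherwise wait. The paper trains a neural network for this decision."""
    return next_cell if next_cell not in occupied else None

def lifelong_step(free_cells, agents, sample_goal):
    """One synchronized timestep for all agents (dicts with pos/goal/path)."""
    occupied = {a["pos"] for a in agents}
    for a in agents:
        if a["pos"] == a["goal"]:  # lifelong MAPF: assign a new goal on arrival
            a["goal"], a["path"] = sample_goal(), None
        if not a["path"]:  # (re-)plan the individual path when needed
            a["path"] = a_star(free_cells, a["pos"], a["goal"])
        if a["path"] and len(a["path"]) > 1:
            step = follow_policy(a["path"][1], occupied)
            if step is not None:  # follow the plan; otherwise wait in place
                occupied.discard(a["pos"])
                a["pos"], a["path"] = step, a["path"][1:]
                occupied.add(step)
```

A full implementation would replace `follow_policy` with the trained network and add the congestion-avoidance cost to the planner, which are the paper's actual contributions.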
Related papers
- Decentralized Monte Carlo Tree Search for Partially Observable
Multi-agent Pathfinding [49.730902939565986]
The Multi-Agent Pathfinding problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
arXiv Detail & Related papers (2023-12-26T06:57:22Z)
- Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment [12.122881147337505]
We propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning.
In our approach, agents communicate only with the centralized planner to make decentralized decisions online.
We conduct multi-step value convergence in multi-agent reinforcement learning to enhance training efficiency.
arXiv Detail & Related papers (2023-10-25T14:21:22Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-agent Path Planning in Continuous Spaces [20.389416558418382]
We propose a novel concept of roadmaps called cooperative timed roadmaps (CTRMs).
CTRMs enable each agent to focus on its important locations around potential solution paths in a way that considers the behavior of other agents to avoid inter-agent collisions.
We developed a machine-learning approach that learns a generative model from a collection of relevant problem instances and plausible solutions.
arXiv Detail & Related papers (2022-01-24T05:43:59Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve distant goal-reaching tasks by using search at training time to automatically generate intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints (a minimal sketch of this alternation appears after this list).
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- Subdimensional Expansion Using Attention-Based Learning For Multi-Agent Path Finding [9.2127262112464]
Multi-Agent Path Finding (MAPF) is the problem of finding conflict-free paths for multiple agents from their respective start to goal locations.
We develop a novel multi-agent planner called LM* by integrating this learning-based single-agent planner with M*.
Our results show that for both "seen" and "unseen" maps, in comparison with M*, LM* has fewer conflicts to be resolved and thus, runs faster and enjoys higher success rates.
arXiv Detail & Related papers (2021-09-29T20:01:04Z)
- Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments [62.997667081978825]
We consider the problem of multi-agent navigation in partially observable grid environments.
We suggest utilizing a reinforcement learning approach in which the agents first learn policies that map observations to actions and then follow these policies to reach their goals.
arXiv Detail & Related papers (2021-08-13T09:44:47Z)
- Distributed Heuristic Multi-Agent Path Finding with Communication [7.854890646114447]
Multi-Agent Path Finding (MAPF) is essential to large-scale robotic systems.
Recent methods have applied reinforcement learning (RL) to learn decentralized policies in partially observable environments.
This paper combines communication with deep Q-learning to provide a novel learning-based method for MAPF.
arXiv Detail & Related papers (2021-06-21T18:50:58Z)
- Compilation-based Solvers for Multi-Agent Path Finding: a Survey, Discussion, and Future Opportunities [7.766921168069532]
We present the lessons learned from past developments and current trends in the topic and discuss its wider impact.
Two major approaches to optimal MAPF solving include (1) dedicated search-based methods, which solve MAPF directly, and (2) compilation-based methods, which reduce a MAPF instance to an instance in a different, well-established formalism.
arXiv Detail & Related papers (2021-04-23T20:13:12Z)
- Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous [66.6895109554163]
Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
arXiv Detail & Related papers (2020-03-15T19:49:20Z)
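For the C-Planning entry above, which spells out an E/M alternation, here is a minimal illustrative sketch under strong assumptions: the environment is a plain adjacency-dict graph, the E-step uses breadth-first search in place of that paper's graph search over visited states, and the M-step is a counting stub where the original work performs goal-conditioned policy optimization. None of the names below come from that paper.

```python
# Illustrative sketch of the E/M alternation summarized above (assumed names;
# not the C-Planning authors' code).
from collections import deque

def e_step(graph, start, goal):
    """E-step: plan a sequence of intermediate waypoints via graph search (BFS)."""
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        if node == goal:  # reconstruct the path and drop the start state
            path = [node]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1][1:]
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return []  # goal unreachable

def m_step(policy, pairs):
    """M-step stub: 'fit' a goal-conditioned policy to reach each waypoint from
    its predecessor. A real implementation would run policy optimization here."""
    for state, waypoint in pairs:
        policy[(state, waypoint)] = policy.get((state, waypoint), 0) + 1

def c_planning_sketch(graph, tasks, iterations=10):
    """Alternate planning (E-step) and policy learning (M-step) over the tasks."""
    policy = {}
    for _ in range(iterations):
        batch = []
        for start, goal in tasks:
            prev = start
            for w in e_step(graph, start, goal):  # curriculum of subgoals
                batch.append((prev, w))
                prev = w
        m_step(policy, batch)
    return policy
```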