Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via
Planning and Learning
- URL: http://arxiv.org/abs/2310.01207v1
- Date: Mon, 2 Oct 2023 13:51:32 GMT
- Title: Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via
Planning and Learning
- Authors: Alexey Skrynnik, Anton Andreychuk, Maria Nesterova, Konstantin
Yakovlev, Aleksandr Panov
- Abstract summary: The Multi-agent Pathfinding (MAPF) problem asks to find a set of conflict-free paths for a set of agents confined to a graph.
In this work, we investigate the decentralized MAPF setting, in which the central controller that possesses all the information on the agents' locations and goals is absent.
We focus on the practically important lifelong variant of MAPF, which involves continuously assigning new goals to the agents upon arrival at the previous ones.
- Score: 46.354187895184154
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Multi-agent Pathfinding (MAPF) problem asks to find a set of
conflict-free paths for a set of agents confined to a graph and is typically
solved in a centralized fashion.
Conversely, in this work, we investigate the decentralized MAPF setting, in
which the central controller that possesses all the information on the agents'
locations and goals is absent and the agents have to sequentially decide their
actions on their own, without access to the full state of the environment. We
focus on the practically important lifelong variant of MAPF, which involves
continuously assigning new goals to the agents upon arrival at the previous
ones. To address
this complex problem, we propose a method that integrates two complementary
approaches: planning with heuristic search and reinforcement learning through
policy optimization. Planning is utilized to construct and re-plan individual
paths. We enhance our planning algorithm with a dedicated technique tailored to
avoid congestion and increase the throughput of the system. We employ
reinforcement learning to discover the collision avoidance policies that
effectively guide the agents along the paths. The policy is implemented as a
neural network and is effectively trained without any reward-shaping or
external guidance.
We evaluate our method on a wide range of setups, comparing it to
state-of-the-art solvers. The results show that our method consistently
outperforms the learnable competitors, achieving higher throughput and better
generalization to maps that were unseen at the training stage.
Moreover, our solver outperforms a rule-based one in terms of throughput and is
an order of magnitude faster than a state-of-the-art search-based solver.
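To make the two components concrete, below is a minimal, illustrative Python sketch of such a decentralized plan-and-follow loop: each agent plans its individual path with A* while ignoring the other agents, follows it step by step as long as a collision-avoidance rule permits, and receives a new goal upon reaching the current one. All names (`a_star`, `follow_policy`, `lifelong_step`) and representations here are assumptions for illustration only; in the paper the follower is a trained neural policy and the planner additionally avoids congestion, both of which this sketch replaces with trivial stand-ins.

```python
# Illustrative sketch (not the authors' code): decentralized lifelong MAPF via
# individual A* planning plus a stand-in for a learned collision-avoidance policy.
import heapq

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # 4-connected grid moves

def a_star(free_cells, start, goal):
    """Plan an individual path for one agent, ignoring all other agents."""
    def h(c):  # Manhattan-distance heuristic
        return abs(c[0] - goal[0]) + abs(c[1] - goal[1])
    open_heap = [(h(start), 0, start, None)]
    parent, best_g = {}, {start: 0}
    while open_heap:
        _, g, cell, par = heapq.heappop(open_heap)
        if cell in parent:
            continue  # already expanded with a better cost
        parent[cell] = par
        if cell == goal:  # reconstruct the start -> goal path
            path = [cell]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for dx, dy in MOVES:
            nxt = (cell[0] + dx, cell[1] + dy)
            if nxt in free_cells and g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(open_heap, (g + 1 + h(nxt), g + 1, nxt, cell))
    return None  # goal unreachable

def follow_policy(next_cell, occupied):
    """Stand-in for the learned follower: advance if the next cell is free,
    otherwise wait. The paper trains a neural network for this decision."""
    return next_cell if next_cell not in occupied else None

def lifelong_step(free_cells, agents, sample_goal):
    """One synchronized timestep for all agents (dicts with pos/goal/path)."""
    occupied = {a["pos"] for a in agents}
    for a in agents:
        if a["pos"] == a["goal"]:  # lifelong MAPF: assign a new goal on arrival
            a["goal"], a["path"] = sample_goal(), None
        if not a["path"]:  # (re-)plan the individual path when needed
            a["path"] = a_star(free_cells, a["pos"], a["goal"])
        if a["path"] and len(a["path"]) > 1:
            step = follow_policy(a["path"][1], occupied)
            if step is not None:  # follow the plan; otherwise wait in place
                occupied.discard(a["pos"])
                a["pos"], a["path"] = step, a["path"][1:]
                occupied.add(step)
```

A full implementation would replace `follow_policy` with the trained network and add the congestion-avoidance cost to the planner, which are the paper's actual contributions.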
Related papers
- Decentralized Monte Carlo Tree Search for Partially Observable
Multi-agent Pathfinding [49.730902939565986]
The Multi-Agent Pathfinding problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
arXiv Detail & Related papers (2023-12-26T06:57:22Z)
- Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment [12.122881147337505]
We propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning.
In our approach, agents communicate only with the centralized planner to make decentralized decisions online.
We conduct multi-step value convergence in multi-agent reinforcement learning to enhance training efficiency.
arXiv Detail & Related papers (2023-10-25T14:21:22Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-agent Path Planning in Continuous Spaces [20.389416558418382]
We propose a novel concept of roadmaps called cooperative timed roadmaps (CTRMs).
CTRMs enable each agent to focus on its important locations around potential solution paths in a way that considers the behavior of other agents to avoid inter-agent collisions.
We developed a machine-learning approach that learns a generative model from a collection of relevant problem instances and plausible solutions.
arXiv Detail & Related papers (2022-01-24T05:43:59Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve distant goal-reaching tasks by using search at training time to automatically generate intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints (a minimal sketch of this alternation appears after this list).
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- Subdimensional Expansion Using Attention-Based Learning For Multi-Agent Path Finding [9.2127262112464]
Multi-Agent Path Finding (MAPF) is the problem of finding conflict-free paths for multiple agents from their respective start to goal locations.
We develop a novel multi-agent planner called LM* by integrating this learning-based single-agent planner with M*.
Our results show that for both "seen" and "unseen" maps, in comparison with M*, LM* has fewer conflicts to be resolved and thus, runs faster and enjoys higher success rates.
arXiv Detail & Related papers (2021-09-29T20:01:04Z)
- Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments [62.997667081978825]
We consider the problem of multi-agent navigation in partially observable grid environments.
We suggest utilizing a reinforcement learning approach in which the agents first learn policies that map observations to actions and then follow these policies to reach their goals.
arXiv Detail & Related papers (2021-08-13T09:44:47Z)
- Distributed Heuristic Multi-Agent Path Finding with Communication [7.854890646114447]
Multi-Agent Path Finding (MAPF) is essential to large-scale robotic systems.
Recent methods have applied reinforcement learning (RL) to learn decentralized policies in partially observable environments.
This paper combines communication with deep Q-learning to provide a novel learning-based method for MAPF.
arXiv Detail & Related papers (2021-06-21T18:50:58Z)
- Compilation-based Solvers for Multi-Agent Path Finding: a Survey, Discussion, and Future Opportunities [7.766921168069532]
We present the lessons learned from past developments and current trends in the topic and discuss its wider impact.
Two major approaches to optimal MAPF solving include (1) dedicated search-based methods, which solve MAPF directly, and (2) compilation-based methods, which reduce a MAPF instance to an instance in a different, well-established formalism.
arXiv Detail & Related papers (2021-04-23T20:13:12Z)
- Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous [66.6895109554163]
Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
arXiv Detail & Related papers (2020-03-15T19:49:20Z)
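For the C-Planning entry above, which spells out an E/M alternation, here is a minimal illustrative sketch under strong assumptions: the environment is a plain adjacency-dict graph, the E-step uses breadth-first search in place of that paper's graph search over visited states, and the M-step is a counting stub where the original work performs goal-conditioned policy optimization. None of the names below come from that paper.

```python
# Illustrative sketch of the E/M alternation summarized above (assumed names;
# not the C-Planning authors' code).
from collections import deque

def e_step(graph, start, goal):
    """E-step: plan a sequence of intermediate waypoints via graph search (BFS)."""
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        if node == goal:  # reconstruct the path and drop the start state
            path = [node]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1][1:]
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return []  # goal unreachable

def m_step(policy, pairs):
    """M-step stub: 'fit' a goal-conditioned policy to reach each waypoint from
    its predecessor. A real implementation would run policy optimization here."""
    for state, waypoint in pairs:
        policy[(state, waypoint)] = policy.get((state, waypoint), 0) + 1

def c_planning_sketch(graph, tasks, iterations=10):
    """Alternate planning (E-step) and policy learning (M-step) over the tasks."""
    policy = {}
    for _ in range(iterations):
        batch = []
        for start, goal in tasks:
            prev = start
            for w in e_step(graph, start, goal):  # curriculum of subgoals
                batch.append((prev, w))
                prev = w
        m_step(policy, batch)
    return policy
```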