Distributed Heuristic Multi-Agent Path Finding with Communication
- URL: http://arxiv.org/abs/2106.11365v1
- Date: Mon, 21 Jun 2021 18:50:58 GMT
- Title: Distributed Heuristic Multi-Agent Path Finding with Communication
- Authors: Ziyuan Ma, Yudong Luo, Hang Ma
- Abstract summary: Multi-Agent Path Finding (MAPF) is essential to large-scale robotic systems.
Recent methods have applied reinforcement learning (RL) to learn decentralized polices in partially observable environments.
This paper combines communication with deep Q-learning to provide a novel learning based method for MAPF.
- Score: 7.854890646114447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-Agent Path Finding (MAPF) is essential to large-scale robotic systems.
Recent methods have applied reinforcement learning (RL) to learn decentralized
polices in partially observable environments. A fundamental challenge of
obtaining collision-free policy is that agents need to learn cooperation to
handle congested situations. This paper combines communication with deep
Q-learning to provide a novel learning based method for MAPF, where agents
achieve cooperation via graph convolution. To guide RL algorithm on
long-horizon goal-oriented tasks, we embed the potential choices of shortest
paths from single source as heuristic guidance instead of using a specific path
as in most existing works. Our method treats each agent independently and
trains the model from a single agent's perspective. The final trained policy is
applied to each agent for decentralized execution. The whole system is
distributed during training and is trained under a curriculum learning
strategy. Empirical evaluation in obstacle-rich environment indicates the high
success rate with low average step of our method.
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Multi-Agent Target Assignment and Path Finding for Intelligent Warehouse: A Cooperative Multi-Agent Deep Reinforcement Learning Perspective [6.148164795916424]
Multi-agent target assignment and path planning (TAPF) are two key problems in intelligent warehouse.
We propose a method to simultaneously solve target assignment and path planning from a perspective of cooperative multi-agent deep reinforcement learning (RL)
Experimental results show that our method performs well in various task settings.
arXiv Detail & Related papers (2024-08-25T07:32:58Z) - Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via
Planning and Learning [46.354187895184154]
Multi-agent Pathfinding (MAPF) problem generally asks to find a set of conflict-free paths for a set of agents confined to a graph.
In this work, we investigate the decentralized MAPF setting, when the central controller that posses all the information on the agents' locations and goals is absent.
We focus on the practically important lifelong variant of MAPF, which involves continuously assigning new goals to the agents upon arrival to the previous ones.
arXiv Detail & Related papers (2023-10-02T13:51:32Z) - MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
Diffusion model (DM) recently achieved huge success in various scenarios including offline reinforcement learning.
We propose MADiff, a novel generative multi-agent learning framework to tackle this problem.
Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z) - Learning From Good Trajectories in Offline Multi-Agent Reinforcement
Learning [98.07495732562654]
offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
One agent learned by offline MARL often inherits this random policy, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z) - Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during execution phase while imaginary rollouts can be used for training, removing the necessity to interact with the environment.
arXiv Detail & Related papers (2022-05-25T08:35:00Z) - Relative Distributed Formation and Obstacle Avoidance with Multi-agent
Reinforcement Learning [20.401609420707867]
We propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL)
Our method achieves better performance regarding formation error, formation convergence rate and on-par success rate of obstacle avoidance compared with baselines.
arXiv Detail & Related papers (2021-11-14T13:02:45Z) - Decentralized Cooperative Multi-Agent Reinforcement Learning with
Exploration [35.75029940279768]
We study multi-agent reinforcement learning in the most basic cooperative setting -- Markov teams.
We propose an algorithm in which each agent independently runs a stage-based V-learning style algorithm.
We show that the agents can learn an $epsilon$-approximate Nash equilibrium policy in at most $proptowidetildeO (1/epsilon4)$ episodes.
arXiv Detail & Related papers (2021-10-12T02:45:12Z) - Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as MAML or Dif-MAML.
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes unpractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent setting.
Our framework can achieve scalability and stability for large-scale environment and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.