Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding
- URL: http://arxiv.org/abs/2410.21415v1
- Date: Mon, 28 Oct 2024 18:13:15 GMT
- Title: Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding
- Authors: He Jiang, Yutong Wang, Rishi Veerapaneni, Tanishq Duhan, Guillaume Sartoretti, Jiaoyang Li,
- Abstract summary: Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals.
Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations.
This work proposes an imitation-learning-based LMAPF solver that introduces a novel communication module and systematic single-step collision resolution and global guidance techniques.
- Score: 20.289593818360938
- License:
- Abstract: Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals, necessitating frequent re-planning to accommodate these dynamic changes. Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations. However, it is still challenging for them to match the performance of the best search-based algorithms, especially in large-scale settings. This work proposes an imitation-learning-based LMAPF solver that introduces a novel communication module and systematic single-step collision resolution and global guidance techniques. Our proposed solver, Scalable Imitation Learning for LMAPF (SILLM), inherits the fast reasoning speed of learning-based methods and the high solution quality of search-based methods with the help of modern GPUs. Across six large-scale maps with up to 10,000 agents and varying obstacle structures, SILLM surpasses the best learning- and search-based baselines, achieving average throughput improvements of 137.7% and 16.0%, respectively. Furthermore, SILLM also beats the winning solution of the 2023 League of Robot Runners, an international LMAPF competition sponsored by Amazon Robotics. Finally, we validated SILLM with 10 real robots and 100 virtual robots in a mockup warehouse environment.
Related papers
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z) - Reinforcement Learning for Long-Horizon Interactive LLM Agents [56.9860859585028]
Interactive digital agents (IDAs) leverage APIs of stateful digital environments to perform tasks in response to user requests.
We present a reinforcement learning (RL) approach that trains IDAs directly in their target environments.
We derive LOOP, a data- and memory-efficient variant of proximal policy optimization.
arXiv Detail & Related papers (2025-02-03T18:35:42Z) - Multi-Agent Motion Planning For Differential Drive Robots Through Stationary State Search [5.9176395108304805]
Multi-Agent Motion Planning (MAMP) finds various applications in fields such as traffic management, airport operations, and warehouse automation.
This paper introduces a three-level framework called MASS to address these challenges.
MASS combines MAPF-based methods with our proposed stationary state search planner to generate high-quality kinodynamically-feasible plans.
arXiv Detail & Related papers (2024-12-17T22:17:42Z) - MALT: Improving Reasoning with Multi-Agent LLM Training [64.13803241218886]
We present a first step toward "Multi-agent LLM training" (MALT) on reasoning problems.
Our approach employs a sequential multi-agent setup with heterogeneous LLMs assigned specialized roles.
We evaluate our approach across MATH, GSM8k, and CQA, where MALT on Llama 3.1 8B models achieves relative improvements of 14.14%, 7.12%, and 9.40% respectively.
arXiv Detail & Related papers (2024-12-02T19:30:36Z) - Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge [93.4434417387526]
We propose Open Vocabulary Mobile Manipulation as a key benchmark task for robotics.
We organized a NeurIPS 2023 competition featuring both simulation and real-world components to evaluate solutions to this task.
We detail the results and methodologies used, both in simulation and real-world settings.
arXiv Detail & Related papers (2024-07-09T15:15:01Z) - Scaling Lifelong Multi-Agent Path Finding to More Realistic Settings: Research Challenges and Opportunities [44.292720085661585]
We present our winning approach to the 2023 League of Robot Runners LMAPF competition.
The first challenge is to search for high-quality LMAPF solutions within a limited planning time.
The second challenge is to alleviate myopic congestion and the effect of behaviors in LMAPF algorithms.
The third challenge is to bridge the gaps between the LMAPF models used in the literature and real-world applications.
arXiv Detail & Related papers (2024-04-24T19:37:18Z) - Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time
Multi-Robot Cooperative Exploration [16.681164058779146]
We consider the problem of cooperative exploration where multiple robots need to cooperatively explore an unknown region as fast as possible.
Existing MARL-based methods adopt action-making steps as the metric for exploration efficiency by assuming all the agents are acting in a fully synchronous manner.
We propose an asynchronous MARL solution, Asynchronous Coordination Explorer (ACE), to tackle this real-world challenge.
arXiv Detail & Related papers (2023-01-09T14:53:38Z) - Graph-Based Multi-Robot Path Finding and Planning [3.4260993997836753]
Planning collision-free paths for multiple robots is important for real-world multi-robot systems.
Recent advances have resulted in MAPF algorithms that can compute collision-free paths for hundreds of robots.
arXiv Detail & Related papers (2022-06-22T18:47:00Z) - Reinforcement Learning for Branch-and-Bound Optimisation using
Retrospective Trajectories [72.15369769265398]
Machine learning has emerged as a promising paradigm for branching.
We propose retro branching; a simple yet effective approach to RL for branching.
We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables.
arXiv Detail & Related papers (2022-05-28T06:08:07Z) - Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the episode' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z) - Compilation-based Solvers for Multi-Agent Path Finding: a Survey,
Discussion, and Future Opportunities [7.766921168069532]
We show the lessons learned from past developments and current trends in the topic and discuss its wider impact.
Two major approaches to optimal MAPF solving include (1) dedicated search-based methods, which solve MAPF directly, and (2) compilation-based methods that reduce a MAPF instance to an instance in a different well established formalism.
arXiv Detail & Related papers (2021-04-23T20:13:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.