Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding
- URL: http://arxiv.org/abs/2410.21415v1
- Date: Mon, 28 Oct 2024 18:13:15 GMT
- Title: Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding
- Authors: He Jiang, Yutong Wang, Rishi Veerapaneni, Tanishq Duhan, Guillaume Sartoretti, Jiaoyang Li,
- Abstract summary: Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals.
Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations.
This work proposes an imitation-learning-based LMAPF solver that introduces a novel communication module and systematic single-step collision resolution and global guidance techniques.
- Score: 20.289593818360938
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals, necessitating frequent re-planning to accommodate these dynamic changes. Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations. However, it is still challenging for them to match the performance of the best search-based algorithms, especially in large-scale settings. This work proposes an imitation-learning-based LMAPF solver that introduces a novel communication module and systematic single-step collision resolution and global guidance techniques. Our proposed solver, Scalable Imitation Learning for LMAPF (SILLM), inherits the fast reasoning speed of learning-based methods and the high solution quality of search-based methods with the help of modern GPUs. Across six large-scale maps with up to 10,000 agents and varying obstacle structures, SILLM surpasses the best learning- and search-based baselines, achieving average throughput improvements of 137.7% and 16.0%, respectively. Furthermore, SILLM also beats the winning solution of the 2023 League of Robot Runners, an international LMAPF competition sponsored by Amazon Robotics. Finally, we validated SILLM with 10 real robots and 100 virtual robots in a mockup warehouse environment.
Related papers
- LOP: Learning Optimal Pruning for Efficient On-Demand MLLMs Scaling [52.1366057696919]
LOP is an efficient neural pruning framework that learns optimal pruning strategies from the target pruning constraint.<n>LOP approach trains autoregressive neural networks (NNs) to directly predict layer-wise pruning strategies adaptive to the target pruning constraint.<n> Experimental results show that LOP outperforms state-of-the-art pruning methods in various metrics while achieving up to three orders of magnitude speedup.
arXiv Detail & Related papers (2025-06-15T12:14:16Z) - Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding [19.93293239540926]
Multi-Agent Path Finding (MAPF) is a fundamental problem in artificial intelligence and robotics.<n>This survey bridges the long-standing divide between classical algorithmic approaches and emerging learning-based methods in MAPF research.
arXiv Detail & Related papers (2025-05-25T16:28:06Z) - RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks [17.17370365888357]
Multi-Agent Path Finding (MAPF) is crucial for applications ranging from aerial swarms to warehouse automation.
We have developed the first centralized learning-based policy for MAPF problem called RAILGUN.
By leveraging a CNN-based architecture, RAILGUN can generalize across different maps and handle any number of agents.
arXiv Detail & Related papers (2025-03-04T20:35:20Z) - Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z) - Reinforcement Learning for Long-Horizon Interactive LLM Agents [56.9860859585028]
Interactive digital agents (IDAs) leverage APIs of stateful digital environments to perform tasks in response to user requests.
We present a reinforcement learning (RL) approach that trains IDAs directly in their target environments.
We derive LOOP, a data- and memory-efficient variant of proximal policy optimization.
arXiv Detail & Related papers (2025-02-03T18:35:42Z) - Multi-Agent Motion Planning For Differential Drive Robots Through Stationary State Search [5.9176395108304805]
Multi-Agent Motion Planning (MAMP) finds various applications in fields such as traffic management, airport operations, and warehouse automation.
This paper introduces a three-level framework called MASS to address these challenges.
MASS combines MAPF-based methods with our proposed stationary state search planner to generate high-quality kinodynamically-feasible plans.
arXiv Detail & Related papers (2024-12-17T22:17:42Z) - FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning [74.25049012472502]
FLaRe is a large-scale Reinforcement Learning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques.
Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance on previously demonstrated and on entirely novel tasks and embodiments.
arXiv Detail & Related papers (2024-09-25T03:15:17Z) - Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge [93.4434417387526]
We propose Open Vocabulary Mobile Manipulation as a key benchmark task for robotics.
We organized a NeurIPS 2023 competition featuring both simulation and real-world components to evaluate solutions to this task.
We detail the results and methodologies used, both in simulation and real-world settings.
arXiv Detail & Related papers (2024-07-09T15:15:01Z) - LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning [91.95362946266577]
Path planning is a fundamental scientific problem in robotics and autonomous navigation.<n>Traditional algorithms like A* and its variants are capable of ensuring path validity but suffer from significant computational and memory inefficiencies as the state space grows.<n>We propose a new LLM based route planning method that synergistically combines the precise pathfinding capabilities of A* with the global reasoning capability of LLMs.<n>This hybrid approach aims to enhance pathfinding efficiency in terms of time and space complexity while maintaining the integrity of path validity, especially in large-scale scenarios.
arXiv Detail & Related papers (2024-06-20T01:24:30Z) - MGCBS: An Optimal and Efficient Algorithm for Solving Multi-Goal Multi-Agent Path Finding Problem [5.580214316179672]
The MG-MAPF problem requires each agent to visit pre-assigned multiple goal points at least once without conflict.
We propose the Multi-Goal Conflict-Based Search (MGCBS), which is based on Decoupling the goal Safe interval visiting order search and the Single-agent pathfinding.
Our method can consistently obtain optimal results and execute up to 7 times faster than the state-of-the-art method in our evaluation.
arXiv Detail & Related papers (2024-04-30T12:49:54Z) - Scaling Lifelong Multi-Agent Path Finding to More Realistic Settings: Research Challenges and Opportunities [44.292720085661585]
We present our winning approach to the 2023 League of Robot Runners LMAPF competition.
The first challenge is to search for high-quality LMAPF solutions within a limited planning time.
The second challenge is to alleviate myopic congestion and the effect of behaviors in LMAPF algorithms.
The third challenge is to bridge the gaps between the LMAPF models used in the literature and real-world applications.
arXiv Detail & Related papers (2024-04-24T19:37:18Z) - Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large
Neighborhood Search [30.364955687049292]
State-of-the-art anytime MAPF is based on Large Neighborhood Search (LNS)
We propose Bandit-based Adaptive LArge Neighborhood search Combined with Exploration (BALANCE)
We empirically demonstrate cost improvements of at least 50% compared to state-of-the-art anytime MAPF in large-scale scenarios.
arXiv Detail & Related papers (2023-12-28T01:24:42Z) - Multi-Robot Path Planning Combining Heuristics and Multi-Agent
Reinforcement Learning [0.0]
In the movement process, robots need to avoid collisions with other moving robots while minimizing their travel distance.
Previous methods for this problem either continuously replan paths using search methods to avoid conflicts or choose appropriate collision avoidance strategies based on learning approaches.
We propose a path planning method, MAPPOHR, which combines a search, empirical rules, and multi-agent reinforcement learning.
arXiv Detail & Related papers (2023-06-02T05:07:37Z) - Train Hard, Fight Easy: Robust Meta Reinforcement Learning [78.16589993684698]
A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients.
Standard MRL methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty.
In this work, we define a robust MRL objective with a controlled level.
The data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML)
arXiv Detail & Related papers (2023-01-26T14:54:39Z) - Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time
Multi-Robot Cooperative Exploration [16.681164058779146]
We consider the problem of cooperative exploration where multiple robots need to cooperatively explore an unknown region as fast as possible.
Existing MARL-based methods adopt action-making steps as the metric for exploration efficiency by assuming all the agents are acting in a fully synchronous manner.
We propose an asynchronous MARL solution, Asynchronous Coordination Explorer (ACE), to tackle this real-world challenge.
arXiv Detail & Related papers (2023-01-09T14:53:38Z) - Graph-Based Multi-Robot Path Finding and Planning [3.4260993997836753]
Planning collision-free paths for multiple robots is important for real-world multi-robot systems.
Recent advances have resulted in MAPF algorithms that can compute collision-free paths for hundreds of robots.
arXiv Detail & Related papers (2022-06-22T18:47:00Z) - Reinforcement Learning for Branch-and-Bound Optimisation using
Retrospective Trajectories [72.15369769265398]
Machine learning has emerged as a promising paradigm for branching.
We propose retro branching; a simple yet effective approach to RL for branching.
We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables.
arXiv Detail & Related papers (2022-05-28T06:08:07Z) - Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the episode' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z) - Compilation-based Solvers for Multi-Agent Path Finding: a Survey,
Discussion, and Future Opportunities [7.766921168069532]
We show the lessons learned from past developments and current trends in the topic and discuss its wider impact.
Two major approaches to optimal MAPF solving include (1) dedicated search-based methods, which solve MAPF directly, and (2) compilation-based methods that reduce a MAPF instance to an instance in a different well established formalism.
arXiv Detail & Related papers (2021-04-23T20:13:12Z) - POMP: Pomcp-based Online Motion Planning for active visual search in
indoor environments [89.43830036483901]
We focus on the problem of learning an optimal policy for Active Visual Search (AVS) of objects in known indoor environments with an online setup.
Our POMP method uses as input the current pose of an agent and a RGB-D frame.
We validate our method on the publicly available AVD benchmark, achieving an average success rate of 0.76 with an average path length of 17.1.
arXiv Detail & Related papers (2020-09-17T08:23:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.