Multi-Robot Path Planning Combining Heuristics and Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2306.01270v1
- Date: Fri, 2 Jun 2023 05:07:37 GMT
- Title: Multi-Robot Path Planning Combining Heuristics and Multi-Agent
Reinforcement Learning
- Authors: Shaoming Peng
- Abstract summary: In the movement process, robots need to avoid collisions with other moving robots while minimizing their travel distance.
Previous methods for this problem either continuously replan paths using search methods to avoid conflicts or choose appropriate collision avoidance strategies based on learning approaches.
We propose a path planning method, MAPPOHR, which combines heuristic search, empirical rules, and multi-agent reinforcement learning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multi-robot path finding in dynamic environments is a highly challenging
classic problem. In the movement process, robots need to avoid collisions with
other moving robots while minimizing their travel distance. Previous methods
for this problem either continuously replan paths using heuristic search
methods to avoid conflicts or choose appropriate collision avoidance strategies
based on learning approaches. The former may result in long travel distances
due to frequent replanning, while the latter may have low learning efficiency
due to poor sample exploration and utilization, which leads to high training
costs for the model. To address these issues, we propose a path planning method,
MAPPOHR, which combines heuristic search, empirical rules, and multi-agent
reinforcement learning. The method consists of two layers: a real-time planner
based on the multi-agent reinforcement learning algorithm, MAPPO, which embeds
empirical rules in the action output layer and reward functions, and a
heuristic search planner used to create a global guiding path. During movement,
the heuristic search planner replans new paths based on the instructions of the
real-time planner. We tested our method in 10 different conflict scenarios. The
experiments show that the planning performance of MAPPOHR is better than that
of existing learning and heuristic methods. Due to the utilization of empirical
knowledge and heuristic search, the learning efficiency of MAPPOHR is higher
than that of existing learning methods.
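The two-layer idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a 4-connected grid world, uses A* with a Manhattan heuristic as a stand-in for the heuristic search planner, and models "empirical rules in the action output layer" as simple action masking applied to policy logits. The names `ACTIONS`, `astar`, and `masked_action_probs` are hypothetical and chosen only for illustration.

```python
import heapq
import math

# Hypothetical action set for a 4-connected grid: stay, up, down, left, right.
ACTIONS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def astar(grid, start, goal):
    """Return a shortest path as a list of (row, col) cells, or None.

    grid[r][c] == 1 marks a blocked cell; the heuristic is Manhattan distance.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if g > best_g[cell]:
            continue  # stale heap entry, a cheaper route was already found
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        for dr, dc in ACTIONS[1:]:
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            ng = g + 1
            if ng < best_g.get(nxt, math.inf):
                best_g[nxt] = ng
                came_from[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None


def masked_action_probs(logits, grid, pos, occupied):
    """Softmax over ACTIONS with rule-violating moves given probability 0.

    Assumed rule: a move is masked if it leaves the grid, enters a blocked
    cell, or enters a cell occupied by another robot. Staying is always allowed.
    """
    rows, cols = len(grid), len(grid[0])
    allowed = []
    for dr, dc in ACTIONS:
        r, c = pos[0] + dr, pos[1] + dc
        if (dr, dc) == (0, 0):
            allowed.append(True)
        else:
            allowed.append(0 <= r < rows and 0 <= c < cols
                           and grid[r][c] == 0 and (r, c) not in occupied)
    top = max(l for l, ok in zip(logits, allowed) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
guide = astar(grid, (0, 0), (2, 0))          # global guiding path
probs = masked_action_probs([0.0] * 5, grid, (0, 1), occupied=set())
```

In this sketch, replanning on the real-time planner's instruction would amount to calling `astar` again from the robot's current cell, for example after marking a contested cell as blocked. How MAPPOHR actually triggers replanning and encodes its rules is described in the paper, not here.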
Related papers
- LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning [91.95362946266577]
Path planning is a fundamental scientific problem in robotics and autonomous navigation.
Traditional algorithms like A* and its variants are capable of ensuring path validity but suffer from significant computational and memory inefficiencies as the state space grows.
We propose a new LLM-based route planning method that synergistically combines the precise pathfinding capabilities of A* with the global reasoning capability of LLMs.
This hybrid approach aims to enhance pathfinding efficiency in terms of time and space complexity while maintaining the integrity of path validity, especially in large-scale scenarios.
arXiv Detail & Related papers (2024-06-20T01:24:30Z)
- Diffusion-Reinforcement Learning Hierarchical Motion Planning in Adversarial Multi-agent Games [6.532258098619471]
We focus on a motion planning task for an evasive target in a partially observable multi-agent adversarial pursuit-evasion game (PEG).
These pursuit-evasion problems are relevant to various applications, such as search and rescue operations and surveillance robots.
We propose a hierarchical architecture that integrates a high-level diffusion model to plan global paths responsive to environment data.
arXiv Detail & Related papers (2024-03-16T03:53:55Z)
- Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents [49.85633804913796]
We present an exploration-based trajectory optimization approach, referred to as ETO.
This learning method is designed to enhance the performance of open LLM agents.
Our experiments on three complex tasks demonstrate that ETO consistently surpasses baseline performance by a large margin.
arXiv Detail & Related papers (2024-03-04T21:50:29Z)
- Evolutionary Swarm Robotics: Dynamic Subgoal-Based Path Formation and
Task Allocation for Exploration and Navigation in Unknown Environments [0.0]
The paper presents a method called sub-goal-based path formation, which establishes a path between two different locations by exploiting visually connected sub-goals.
The paper tackles the problem of inter-robot collisions (traffic) among a large number of robots engaged in path formation, which negatively impact the performance of the sub-goal-based method.
A task allocation strategy is proposed, leveraging local communication protocols and light signal-based communication.
arXiv Detail & Related papers (2023-12-27T15:13:56Z)
- Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via
Planning and Learning [46.354187895184154]
The Multi-agent Pathfinding (MAPF) problem generally asks for a set of conflict-free paths for a set of agents confined to a graph.
In this work, we investigate the decentralized MAPF setting, in which the central controller that possesses all the information on the agents' locations and goals is absent.
We focus on the practically important lifelong variant of MAPF, which involves continuously assigning new goals to the agents upon arrival at the previous ones.
arXiv Detail & Related papers (2023-10-02T13:51:32Z)
- AI planning in the imagination: High-level planning on learned abstract
search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- E2R: a Hierarchical-Learning inspired Novelty-Search method to generate diverse repertoires of grasping trajectories [0.0]
We introduce a new NS-based method that can generate large datasets of grasping trajectories in a platform-agnostic manner.
Inspired by the hierarchical learning paradigm, our method decouples approach and prehension to make the behavioral space smoother.
Experiments conducted on 3 different robot-gripper setups and on several standard objects show that our method outperforms the state of the art.
arXiv Detail & Related papers (2022-10-14T15:13:10Z)
- Meta Navigator: Search for a Good Adaptation Policy for Few-shot
Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data.
Research literature on few-shot learning exhibits great diversity, while different algorithms often excel at different few-shot learning scenarios.
We present Meta Navigator, a framework that attempts to solve the limitation in few-shot learning by seeking a higher-level strategy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z)
- Autonomous UAV Exploration of Dynamic Environments via Incremental
Sampling and Probabilistic Roadmap [0.3867363075280543]
We propose a novel dynamic exploration planner (DEP) for exploring unknown environments using incremental sampling and a Probabilistic Roadmap (PRM).
Our method safely explores dynamic environments and outperforms the benchmark planners in terms of exploration time, path length, and computational time.
arXiv Detail & Related papers (2020-10-14T22:52:37Z)
- Planning to Explore via Self-Supervised World Models [120.31359262226758]
Plan2Explore is a self-supervised reinforcement learning agent.
We present a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z)
- Flexible and Efficient Long-Range Planning Through Curious Exploration [13.260508939271764]
We show that the Curious Sample Planner can efficiently discover temporally-extended plans for solving a wide range of physically realistic 3D tasks.
In contrast, standard planning and learning methods often fail to solve these tasks at all or do so only with a huge and highly variable number of training samples.
arXiv Detail & Related papers (2020-04-22T21:47:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.