MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement
Learning in Mixed Dynamic Environments
- URL: http://arxiv.org/abs/2007.15724v1
- Date: Thu, 30 Jul 2020 20:14:42 GMT
- Title: MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement
Learning in Mixed Dynamic Environments
- Authors: Zuxin Liu, Baiming Chen, Hongyi Zhou, Guru Koushik, Martial Hebert,
Ding Zhao
- Abstract summary: This paper proposes a decentralized partially observable multi-agent path planning with evolutionary reinforcement learning (MAPPER) method.
We decompose the long-range navigation task into many easier sub-tasks under the guidance of a global planner.
Our approach models dynamic obstacles' behavior with an image-based representation and trains a policy in mixed dynamic environments without homogeneity assumption.
- Score: 30.407700996710023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent navigation in dynamic environments is of great industrial value
when deploying a large scale fleet of robot to real-world applications. This
paper proposes a decentralized partially observable multi-agent path planning
with evolutionary reinforcement learning (MAPPER) method to learn an effective
local planning policy in mixed dynamic environments. Reinforcement
learning-based methods usually suffer performance degradation on long-horizon
tasks with goal-conditioned sparse rewards, so we decompose the long-range
navigation task into many easier sub-tasks under the guidance of a global
planner, which increases agents' performance in large environments. Moreover,
most existing multi-agent planning approaches assume either perfect information
of the surrounding environment or homogeneity of nearby dynamic agents, which
may not hold in practice. Our approach models dynamic obstacles' behavior with
an image-based representation and trains a policy in mixed dynamic environments
without homogeneity assumption. To ensure multi-agent training stability and
performance, we propose an evolutionary training approach that can be easily
scaled to large and complex environments. Experiments show that MAPPER is able
to achieve higher success rates and more stable performance when exposed to a
large number of non-cooperative dynamic obstacles compared with traditional
reaction-based planner LRA* and the state-of-the-art learning-based method.
Related papers
- Importance Sampling-Guided Meta-Training for Intelligent Agents in Highly Interactive Environments [43.144056801987595]
This study introduces a novel training framework that integrates guided meta RL with importance sampling (IS) to optimize training distributions.
By estimating a naturalistic distribution from real-world datasets, the framework ensures a balanced focus across common and extreme driving scenarios.
arXiv Detail & Related papers (2024-07-22T17:57:12Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent
Pathfinding [16.36594480478895]
Heuristics-Informed Multi-Agent Pathfinding (HiMAP)
Heuristics-Informed Multi-Agent Pathfinding (HiMAP)
arXiv Detail & Related papers (2024-02-23T13:01:13Z) - Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation [50.01551945190676]
Social robot navigation can be helpful in various contexts of daily life but requires safe human-robot interactions and efficient trajectory planning.
We propose a systematic relational reasoning approach with explicit inference of the underlying dynamically evolving relational structures.
We demonstrate its effectiveness for multi-agent trajectory prediction and social robot navigation.
arXiv Detail & Related papers (2024-01-22T18:58:22Z) - AI planning in the imagination: High-level planning on learned abstract
search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z) - Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parametrizing policies for reinforcement learning in high-dimensional continuous action spaces.
We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories.
We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
arXiv Detail & Related papers (2023-07-20T09:05:46Z) - Learning Control Admissibility Models with Graph Neural Networks for
Multi-Agent Navigation [9.05607520128194]
Control admissibility models (CAMs) can be easily composed and used for online inference for an arbitrary number of agents.
We show that the CAM models can be trained in environments with only a few agents and be easily composed for deployment in dense environments with hundreds of agents, achieving better performance than state-of-the-art methods.
arXiv Detail & Related papers (2022-10-17T19:20:58Z) - Exploration via Planning for Information about the Optimal Trajectory [67.33886176127578]
We develop a method that allows us to plan for exploration while taking the task and the current knowledge into account.
We demonstrate that our method learns strong policies with 2x fewer samples than strong exploration baselines.
arXiv Detail & Related papers (2022-10-06T20:28:55Z) - Hierarchical Reinforcement Learning with Opponent Modeling for
Distributed Multi-agent Cooperation [13.670618752160594]
Deep reinforcement learning (DRL) provides a promising approach for multi-agent cooperation through the interaction of the agents and environments.
Traditional DRL solutions suffer from the high dimensions of multiple agents with continuous action space during policy search.
We propose a hierarchical reinforcement learning approach with high-level decision-making and low-level individual control for efficient policy search.
arXiv Detail & Related papers (2022-06-25T19:09:29Z) - Locality Matters: A Scalable Value Decomposition Approach for
Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.