Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time
Multi-Robot Cooperative Exploration
- URL: http://arxiv.org/abs/2301.03398v2
- Date: Tue, 11 Apr 2023 07:02:39 GMT
- Title: Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time
Multi-Robot Cooperative Exploration
- Authors: Chao Yu, Xinyi Yang, Jiaxuan Gao, Jiayu Chen, Yunfei Li, Jijia Liu,
Yunfei Xiang, Ruixin Huang, Huazhong Yang, Yi Wu, Yu Wang
- Abstract summary: We consider the problem of cooperative exploration where multiple robots need to cooperatively explore an unknown region as fast as possible.
Existing MARL-based methods use action-making steps as the metric for exploration efficiency, assuming that all agents act in a fully synchronous manner.
We propose an asynchronous MARL solution, Asynchronous Coordination Explorer (ACE), to tackle this real-world challenge.
- Score: 16.681164058779146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of cooperative exploration where multiple robots need
to cooperatively explore an unknown region as fast as possible. Multi-agent
reinforcement learning (MARL) has recently become a trending paradigm for
solving this challenge. However, existing MARL-based methods adopt
action-making steps as the metric for exploration efficiency, assuming that
all agents act in a fully synchronous manner: i.e., every agent produces an
action simultaneously and every action is executed instantaneously at each
time step. Despite its mathematical simplicity, such a synchronous MARL
formulation can be problematic for real-world robotic applications: in
practice, different robots typically take slightly different wall-clock times
to accomplish an atomic action, and some may even periodically get lost due to
hardware issues. Simply waiting for every robot to be ready for its next
action can be particularly time-inefficient. Therefore, we propose an
asynchronous MARL solution, Asynchronous Coordination Explorer (ACE), to tackle
this real-world challenge. We first extend a classical MARL algorithm,
multi-agent PPO (MAPPO), to the asynchronous setting and additionally apply
action-delay randomization so that the learned policy generalizes better to
varying action delays in the real world. Moreover, each navigation agent is
represented as a team-size-invariant CNN-based policy, which greatly benefits
real-robot deployment by handling possible robot loss and allows
bandwidth-efficient inter-agent communication through low-dimensional CNN
features. We first validate our approach in a grid-based scenario. Both
simulation and real-robot results show that ACE reduces actual exploration
time by over 10% compared with classical approaches. We also apply our
framework to a high-fidelity vision-based environment, Habitat, achieving a 28%
improvement in exploration efficiency.
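To make the asynchronous formulation concrete, here is a minimal sketch (in
Python; the class and all names are illustrative, not the authors' code) of a
stepping loop in which each agent's atomic action takes a randomized number of
wall-clock ticks, so the policy is queried only for agents that have finished
executing:

    import random

    class AsyncExplorationLoop:
        """Sketch of asynchronous stepping with action-delay randomization."""

        def __init__(self, policy, num_agents, max_delay=3):
            self.policy = policy                # callable: observation -> action
            self.num_agents = num_agents
            self.max_delay = max_delay          # upper bound on the randomized delay
            self.remaining = [0] * num_agents   # ticks until each agent is ready

        def tick(self, observations):
            """Advance one wall-clock tick; issue new actions only to ready agents."""
            new_actions = {}
            for i in range(self.num_agents):
                if self.remaining[i] > 0:       # agent i is still executing
                    self.remaining[i] -= 1
                    continue
                new_actions[i] = self.policy(observations[i])
                # Action-delay randomization: sample how many ticks this atomic
                # action takes, so training covers varying real-world delays.
                self.remaining[i] = random.randint(1, self.max_delay)
            return new_actions

The team-size-invariant policy can be sketched in the same hedged spirit
(assuming PyTorch and purely illustrative layer sizes): each robot's map is
encoded by a shared CNN into a low-dimensional feature, and teammates'
features are max-pooled, so the action head sees a fixed-size input no matter
how many robots remain alive:

    import torch
    import torch.nn as nn

    class TeamSizeInvariantPolicy(nn.Module):
        """Sketch: pool teammates' CNN features so team size never changes shapes."""

        def __init__(self, in_channels=4, feat_dim=64, num_actions=5):
            super().__init__()
            self.encoder = nn.Sequential(   # shared CNN over each robot's local map
                nn.Conv2d(in_channels, 16, 3, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, feat_dim),
            )
            self.head = nn.Linear(2 * feat_dim, num_actions)

        def forward(self, own_map, teammate_maps):
            # own_map: (1, C, H, W); teammate_maps: (K, C, H, W), K >= 1 teammates
            own = self.encoder(own_map)             # (1, feat_dim)
            msgs = self.encoder(teammate_maps)      # (K, feat_dim) low-dim "messages"
            pooled = msgs.max(dim=0, keepdim=True).values  # invariant to K
            return self.head(torch.cat([own, pooled], dim=-1))

Because the pooled message has a fixed dimension, a robot dropping out simply
shrinks K without changing the network, which matches the paper's motivation
for tolerating robot loss and for low-bandwidth communication of CNN features.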
Related papers
- MAexp: A Generic Platform for RL-based Multi-Agent Exploration [5.672198570643586]
Existing platforms suffer from sampling inefficiency and a lack of diversity in their Multi-Agent Reinforcement Learning (MARL) algorithms.
We propose MAexp, a generic platform for multi-agent exploration that integrates a broad range of state-of-the-art MARL algorithms and representative scenarios.
arXiv Detail & Related papers (2024-04-19T12:00:10Z)
- Attention Graph for Multi-Robot Social Navigation with Deep Reinforcement Learning [0.0]
We present MultiSoc, a new method for learning multi-agent socially aware navigation strategies using deep reinforcement learning (RL).
Inspired by recent works on multi-agent deep RL, our method leverages a graph-based representation of agent interactions, combining the positions and fields of view of entities (pedestrians and agents).
Our method learns faster than single-agent deep RL social-navigation techniques and enables efficient implicit multi-agent coordination in challenging crowd navigation with multiple heterogeneous humans.
arXiv Detail & Related papers (2024-01-31T15:24:13Z)
- ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency [65.28061634546577]
Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem.
In this paper, we propose bidirectional action-dependent Q-learning (ACE).
ACE outperforms the state-of-the-art algorithms on Google Research Football and StarCraft Multi-Agent Challenge.
arXiv Detail & Related papers (2022-11-29T10:22:55Z)
- Multi-robot Social-aware Cooperative Planning in Pedestrian Environments Using Multi-agent Reinforcement Learning [2.7716102039510564]
We propose a novel multi-robot social-aware efficient cooperative planner based on off-policy multi-agent reinforcement learning (MARL).
We adopt a temporal-spatial graph (TSG)-based social encoder to better extract the importance of the social relation between each robot and the pedestrians in its field of view (FOV).
arXiv Detail & Related papers (2022-11-29T03:38:47Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve challenging simulated tasks, such as humanoid locomotion and stand-up, with unprecedented sample efficiency.
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
- From Multi-agent to Multi-robot: A Scalable Training and Evaluation Platform for Multi-robot Reinforcement Learning [12.74238738538799]
Multi-agent reinforcement learning (MARL) has been gaining extensive attention from academia and industry in the past few decades.
It remains unknown how these methods perform in real-world scenarios, especially in multi-robot systems.
This paper introduces a scalable emulation platform for multi-robot reinforcement learning (MRRL) called SMART to meet this need.
arXiv Detail & Related papers (2022-06-20T06:36:45Z)
- Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704]
We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent.
We propose a novel episodic memory, LeGEM, for model-free MARL algorithms.
We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2022-05-27T02:21:04Z)
- Intelligent Trajectory Design for RIS-NOMA aided Multi-robot Communications [59.34642007625687]
The goal is to maximize the sum rate over whole trajectories for the multi-robot system by jointly optimizing the robots' trajectories and NOMA decoding orders.
An integrated machine learning (ML) scheme is proposed, which combines a long short-term memory (LSTM)-autoregressive integrated moving average (ARIMA) model with a dueling double deep Q-network (D$3$QN) algorithm.
arXiv Detail & Related papers (2022-05-03T17:14:47Z)
- SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots [112.2491765424719]
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use stochastic model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics and consider uncertainty during obstacle avoidance with chance constraints.
Recurrent neural networks are used to provide a quick estimate of the future state uncertainty considered in the SMPC finite-time-horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
arXiv Detail & Related papers (2021-08-03T02:56:21Z)
- Loosely Synchronized Search for Multi-agent Path Finding with Asynchronous Actions [10.354181009277623]
Multi-agent path finding (MAPF) determines an ensemble of collision-free paths for multiple agents between their respective start and goal locations.
This article presents a natural generalization of MAPF with asynchronous actions where agents do not necessarily start and stop concurrently.
arXiv Detail & Related papers (2021-03-08T02:34:17Z)
- ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation [99.2543521972137]
ReLMoGen is a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals.
Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments.
ReLMoGen shows outstanding transferability between different motion generators at test time, indicating a great potential to transfer to real robots.
arXiv Detail & Related papers (2020-08-18T08:05:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.