Robot Navigation in a Crowd by Integrating Deep Reinforcement Learning
and Online Planning
- URL: http://arxiv.org/abs/2102.13265v1
- Date: Fri, 26 Feb 2021 02:17:13 GMT
- Authors: Zhiqian Zhou, Pengming Zhu, Zhiwen Zeng, Junhao Xiao, Huimin Lu,
Zongtan Zhou
- Abstract summary: Navigating along time-efficient and collision-free paths in a crowd remains an open and challenging problem for mobile robots.
Deep reinforcement learning is a promising solution to this problem.
We propose a graph-based deep reinforcement learning method, SG-DQN.
Our model can help the robot better understand the crowd and achieve a high success rate of more than 0.99 in the crowd navigation task.
- Score: 8.211771115758381
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Navigating along time-efficient and collision-free paths in a crowd
remains an open and challenging problem for mobile robots. The main challenge
comes from the complex and sophisticated interaction mechanism, which requires
the robot to understand the crowd and perform proactive and foresighted
behaviors. Deep reinforcement learning is a promising solution to this problem.
However, most previous learning methods incur a tremendous computational
burden. To address these problems, we propose a graph-based deep reinforcement
learning method, SG-DQN, that (i) introduces a social attention mechanism to
extract an efficient graph representation for the crowd-robot state; (ii)
directly evaluates the coarse Q-values of the raw state with a learned dueling
deep Q-network (DQN); and then (iii) refines the coarse Q-values via online
planning on possible future trajectories. The experimental results indicate
that our model can help the robot better understand the crowd and achieve a
high success rate of more than 0.99 in the crowd navigation task. Compared
against previous state-of-the-art algorithms, our algorithm achieves an
equivalent, if not better, performance while requiring less than half of the
computational cost.
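Steps (ii) and (iii) of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The dueling aggregation shown is the standard formula Q(s, a) = V(s) + A(s, a) - mean(A); the refinement step is an assumed depth-limited lookahead that expands only the best few actions and averages the coarse estimate with a one-step backup. The names `dueling_q_values`, `refine_q_values`, `toy_step`, and `toy_coarse_q`, the 50/50 blend, and the toy corridor environment are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical type aliases for this sketch (not from the paper).
CoarseQ = Callable[[Any], Sequence[float]]            # state -> Q-value per action
Step = Callable[[Any, int], Tuple[Any, float, bool]]  # (state, action) -> (next_state, reward, done)


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def refine_q_values(state: Any, coarse_q: CoarseQ, step: Step,
                    depth: int, width: int, gamma: float = 0.9) -> List[float]:
    """Refine coarse Q-values with a depth-limited lookahead.

    Assumption: only the `width` best actions (by coarse Q-value) are expanded,
    and each expanded action's value is the average of its coarse estimate and
    the backed-up target r + gamma * max_a' Q_refined(s', a').
    """
    q = list(coarse_q(state))
    if depth == 0:
        return q
    # Indices of the most promising actions according to the coarse estimate.
    top = sorted(range(len(q)), key=lambda a: q[a], reverse=True)[:width]
    refined = list(q)
    for a in top:
        next_state, reward, done = step(state, a)
        if done:
            target = reward
        else:
            future = refine_q_values(next_state, coarse_q, step, depth - 1, width, gamma)
            target = reward + gamma * max(future)
        refined[a] = 0.5 * q[a] + 0.5 * target  # assumed blend, for illustration only
    return refined


# Toy corridor (hypothetical): the state is an integer position, action 0 stays,
# action 1 moves forward; reaching position 2 ends the episode with reward 1.
def toy_step(state: int, action: int) -> Tuple[int, float, bool]:
    next_state = state + action
    done = next_state >= 2
    return next_state, (1.0 if done else 0.0), done


def toy_coarse_q(state: int) -> List[float]:
    return [0.0, 0.0]  # deliberately uninformative coarse estimate
```

With the uninformative `toy_coarse_q`, calling `refine_q_values(0, toy_coarse_q, toy_step, depth=2, width=2, gamma=0.5)` ranks moving forward above staying, because the lookahead discovers the goal reward two steps ahead. Restricting expansion to `width` actions is one way such planning can be kept cheap; how SG-DQN itself prunes and combines values is not specified in this abstract.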
Related papers
- Multi-Objective Algorithms for Learning Open-Ended Robotic Problems [1.0124625066746598]
Quadrupedal locomotion is a complex, open-ended problem vital to expanding autonomous vehicle reach.
Traditional reinforcement learning approaches often fall short due to training instability and sample inefficiency.
We propose a novel method leveraging multi-objective evolutionary algorithms as an automatic curriculum learning mechanism.
arXiv Detail & Related papers (2024-11-11T16:26:42Z)
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning algorithm for a robot to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- On-Robot Bayesian Reinforcement Learning for POMDPs [16.667924736270415]
This paper advances Bayesian reinforcement learning for robotics by proposing a specialized framework for physical systems.
We capture this knowledge in a factored representation, then demonstrate the posterior factorizes in a similar shape, and ultimately formalize the model in a Bayesian framework.
We then introduce a sample-based online solution method, based on Monte-Carlo tree search and particle filtering, specialized to solve the resulting model.
arXiv Detail & Related papers (2023-07-22T01:16:29Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- TransPath: Learning Heuristics For Grid-Based Pathfinding via Transformers [64.88759709443819]
We suggest learning the instance-dependent proxies that are supposed to notably increase the efficiency of the search.
The first proxy we suggest learning is the correction factor, i.e. the ratio between the instance-independent cost-to-go estimate and the perfect one.
The second proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path.
arXiv Detail & Related papers (2022-12-22T14:26:11Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve some challenging simulated tasks, such as humanoid locomotion and stand-up, with unprecedented sample efficiency.
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
- Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
arXiv Detail & Related papers (2022-04-15T09:56:50Z)
- Centralizing State-Values in Dueling Networks for Multi-Robot Reinforcement Learning Mapless Navigation [87.85646257351212]
We study the problem of multi-robot mapless navigation in the popular Centralized Training and Decentralized Execution (CTDE) paradigm.
This problem is challenging when each robot considers its path without explicitly sharing observations with other robots.
We propose a novel architecture for CTDE that uses a centralized state-value network to compute a joint state-value.
arXiv Detail & Related papers (2021-12-16T16:47:00Z)
- How to reduce computation time while sparing performance during robot navigation? A neuro-inspired architecture for autonomous shifting between model-based and model-free learning [1.3854111346209868]
We present a novel arbitration mechanism between learning systems that explicitly measures performance and cost.
We find that the robot can adapt to environment changes by switching between learning systems so as to maintain a high performance.
When the task is stable, the robot also autonomously shifts to the least costly system, which leads to a drastic reduction in computation cost while keeping a high performance.
arXiv Detail & Related papers (2020-04-30T11:29:16Z)
- Leveraging Rationales to Improve Human Task Performance [15.785125079811902]
Given that a computational system's performance exceeds that of its human user, can explainable AI capabilities be leveraged to improve the performance of the human?
We introduce the Rationale-Generating Algorithm, an automated technique for generating rationales for utility-based computational methods.
Results show that our approach produces rationales that lead to statistically significant improvement in human task performance.
arXiv Detail & Related papers (2020-02-11T04:51:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.