Accelerated Sim-to-Real Deep Reinforcement Learning: Learning Collision
Avoidance from Human Player
- URL: http://arxiv.org/abs/2102.10711v2
- Date: Tue, 23 Feb 2021 02:44:21 GMT
- Authors: Hanlin Niu, Ze Ji, Farshad Arvin, Barry Lennox, Hujun Yin, and Joaquin
Carrasco
- Abstract summary: This paper presents a sensor-level mapless collision avoidance algorithm for use in mobile robots.
An efficient training strategy is proposed to allow a robot to learn from both human experience data and self-exploratory data.
A game format simulation framework is designed to allow the human player to tele-operate the mobile robot to a goal.
- Score: 5.960346570280513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a sensor-level mapless collision avoidance
algorithm for mobile robots that maps raw sensor data to linear and angular
velocities, enabling navigation in an unknown environment without a map. An efficient training
strategy is proposed to allow a robot to learn from both human experience data
and self-exploratory data. A game-format simulation framework is designed to
allow the human player to tele-operate the mobile robot to a goal, and the
human's actions are also scored using the reward function. Both human-player
data and self-play data are sampled using the prioritized experience replay
algorithm.
The proposed algorithm and training strategy have been evaluated in two
different experimental configurations: Environment 1, a simulated cluttered
environment, and Environment 2, a simulated corridor
environment, to investigate the performance. It was demonstrated that the
proposed method achieved the same level of reward using only 16% of the
training steps required by the standard Deep Deterministic Policy Gradient
(DDPG) method in Environment 1 and 20% of that in Environment 2. In the
evaluation of 20 random missions, the proposed method achieved zero collisions
after less than 2 h and 2.5 h of training time in the two Gazebo environments,
respectively. The method also generated smoother trajectories than DDPG. The
proposed method has also been implemented on a real robot in a real-world
environment for performance evaluation. We confirm that the model trained in
simulation can be applied directly to the real-world scenario without further
fine-tuning, demonstrating greater robustness than DDPG. The video and code are
available at:
https://youtu.be/BmwxevgsdGc
https://github.com/hanlinniu/turtlebot3_ddpg_collision_avoidance
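The training strategy above can be illustrated with a minimal sketch: human tele-operation transitions and self-exploratory transitions share one prioritized replay buffer, so surprising (high-TD-error) experiences are replayed more often. The class, the `is_human` flag, and the priority settings below are illustrative assumptions, not the authors' released implementation.

```python
import random

# Sketch of a prioritized replay buffer holding both human-demonstration
# and self-play transitions (names and parameters are illustrative).
class PrioritizedReplayBuffer:
    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priority shapes sampling
        self.transitions = []       # (state, action, reward, next_state, done, is_human)
        self.priorities = []

    def add(self, transition, td_error=1.0):
        # New transitions are prioritized by TD error, so informative
        # experiences (often the human demonstrations early in training)
        # are sampled more frequently.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.transitions)),
                              weights=probs, k=batch_size)
        return [self.transitions[i] for i in idxs], idxs

# Usage: human and self-exploratory data share one buffer; the DDPG
# learner would sample mixed batches from it at every update step.
buf = PrioritizedReplayBuffer(capacity=1000)
buf.add(("s0", "a0", 1.0, "s1", False, True), td_error=2.0)   # human transition
buf.add(("s1", "a1", -0.1, "s2", False, False), td_error=0.5) # self-play transition
batch, _ = buf.sample(4)
```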
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Robust Visual Sim-to-Real Transfer for Robotic Manipulation [79.66851068682779]
Learning visuomotor policies in simulation is much safer and cheaper than in the real world.
However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots.
One common approach to bridging the visual sim-to-real domain gap is domain randomization (DR).
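Domain randomization can be sketched in a few lines: each training episode draws simulator parameters from broad ranges so the policy does not overfit to one rendering or physics configuration. The parameter names and ranges below are invented for illustration, not taken from this paper.

```python
import random

# Illustrative domain randomization (DR): sample a fresh set of simulator
# parameters at the start of every episode. Names/ranges are placeholders.
def randomize_sim_params(rng=random):
    return {
        "light_intensity": rng.uniform(0.3, 1.5),   # rendering brightness
        "camera_pitch_deg": rng.uniform(-5.0, 5.0), # small camera perturbation
        "floor_texture_id": rng.randrange(10),      # one of N textures
        "friction": rng.uniform(0.6, 1.2),          # physics perturbation
    }

# A training loop would reset the environment with a fresh draw each
# episode, e.g. env.reset(**randomize_sim_params()).
params = randomize_sim_params()
```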
arXiv Detail & Related papers (2023-07-28T05:47:24Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning [4.260312058817663]
We build upon the Reset-Free QD (RF-QD) algorithm to learn controllers directly on a physical robot.
This method uses a dynamics model, learned from interactions between the robot and the environment, to predict the robot's behaviour.
RF-QD also includes a recovery policy that returns the robot to a safe zone when it has walked outside of it, allowing continuous learning.
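The recovery idea can be sketched simply: whenever the robot's estimated position leaves a predefined safe zone, a recovery policy overrides the learned controller and drives it back, so learning never needs a manual reset. The circular zone and the straight-back recovery policy below are invented placeholders, not the RF-QD implementation.

```python
# Sketch of reset-free learning with a safe-zone recovery override.
SAFE_RADIUS = 2.0  # metres from the workspace centre (illustrative)

def inside_safe_zone(x, y):
    return (x * x + y * y) ** 0.5 <= SAFE_RADIUS

def recovery_action(x, y):
    # Head straight back toward the centre of the safe zone.
    norm = max((x * x + y * y) ** 0.5, 1e-9)
    return (-x / norm, -y / norm)

def select_action(x, y, learned_policy):
    if inside_safe_zone(x, y):
        return learned_policy(x, y)  # continue learning as normal
    return recovery_action(x, y)     # override with the recovery policy

# Outside the zone at (3, 0), the override points back toward the origin.
action = select_action(3.0, 0.0, lambda x, y: (0.0, 1.0))
```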
arXiv Detail & Related papers (2023-04-24T13:24:00Z)
- Obstacle Avoidance for Robotic Manipulator in Joint Space via Improved Proximal Policy Optimization [6.067589886362815]
In this paper, we train a deep neural network via an improved Proximal Policy Optimization (PPO) algorithm to map from task space to joint space for a 6-DoF manipulator.
Since training such a task on a real robot is time-consuming and strenuous, we develop a simulation environment to train the model.
Experimental results showed that using our method, the robot was capable of tracking a single target or reaching multiple targets in unstructured environments.
arXiv Detail & Related papers (2022-10-03T10:21:57Z)
- Off Environment Evaluation Using Convex Risk Minimization [0.0]
We propose a convex risk minimization algorithm to estimate the model mismatch between the simulator and the target domain.
We show that this estimator can be used along with the simulator to evaluate the performance of an RL agent in the target domain.
arXiv Detail & Related papers (2021-12-21T21:31:54Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
- Passing Through Narrow Gaps with Deep Reinforcement Learning [2.299414848492227]
In this paper we present a deep reinforcement learning method for autonomously navigating through small gaps.
We first learn a gap behaviour policy to get through small gaps, where contact between the robot and the gap may be required.
In simulation experiments, our approach achieves a 93% success rate when the gap behaviour is activated manually by an operator.
In real robot experiments, our approach achieves a success rate of 73% with manual activation, and 40% with autonomous behaviour selection.
arXiv Detail & Related papers (2021-03-06T00:10:41Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Reactive Long Horizon Task Execution via Visual Skill and Precondition Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation, and from 10% to 80% in the real world, compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
- On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach [7.488722678999039]
We present a map-less path planning algorithm based on Deep Reinforcement Learning (DRL) for mobile robots navigating in unknown environments.
The planner is trained using a reward function shaped based on the online knowledge of the map of the training environment.
The policy trained in the simulation environment can be directly and successfully transferred to the real robot.
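A map-knowledge-shaped reward of the kind described above might combine progress toward the goal with a penalty near mapped obstacles. The coefficients and thresholds below are illustrative assumptions, not the terms from this paper's reward function.

```python
# Hedged sketch of reward shaping using online map knowledge:
# reward = progress toward goal - penalty for approaching mapped obstacles.
# All weights and thresholds here are invented for illustration.
def shaped_reward(dist_to_goal, prev_dist_to_goal, dist_to_nearest_obstacle,
                  collision, reached):
    if collision:
        return -100.0                                  # terminal penalty
    if reached:
        return 100.0                                   # terminal bonus
    progress = prev_dist_to_goal - dist_to_goal        # positive when approaching
    obstacle_penalty = max(0.0, 0.5 - dist_to_nearest_obstacle)  # only when close
    return 10.0 * progress - 5.0 * obstacle_penalty

# Moving 0.2 m closer to the goal with no obstacle nearby yields ~2.0.
r = shaped_reward(1.8, 2.0, 1.0, collision=False, reached=False)
```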
arXiv Detail & Related papers (2020-02-10T22:00:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.