On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach
- URL: http://arxiv.org/abs/2002.04109v1
- Date: Mon, 10 Feb 2020 22:00:16 GMT
- Title: On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach
- Authors: Nicolò Botteghi, Beril Sirmacek, Khaled A. A. Mustafa, Mannes Poel and Stefano Stramigioli
- Abstract summary: We present a map-less path planning algorithm based on Deep Reinforcement Learning (DRL) for mobile robots navigating in an unknown environment.
The planner is trained using a reward function shaped with online knowledge of the map of the training environment.
The policy trained in the simulation environment can be directly and successfully transferred to the real robot.
- Score: 7.488722678999039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a map-less path planning algorithm based on Deep Reinforcement Learning (DRL) for mobile robots navigating in an unknown environment, relying only on 40-dimensional raw laser data and odometry information. The planner is trained using a reward function shaped with online knowledge of the map of the training environment, obtained using a grid-based Rao-Blackwellized particle filter, in an attempt to enhance the obstacle awareness of the agent. The agent is trained in a complex simulated environment and evaluated in two unseen ones. We show that the policy trained using the introduced reward function not only outperforms standard reward functions in terms of convergence speed, reducing the number of iteration steps by 36.9% and lowering the number of collision samples, but also drastically improves the behaviour of the agent in unseen environments, by 23% in a simpler workspace and by 45% in a more cluttered one. Furthermore, the policy trained in the simulation environment can be directly and successfully transferred to the real robot. A video of our experiments can be found at: https://youtu.be/UEV7W6e6ZqI
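A rough sketch of how such a map-based shaping term could look in code is given below. All names, weights, and thresholds are hypothetical illustrations rather than the paper's actual values; the idea is simply that the occupancy grid estimated online (here assumed to come from the Rao-Blackwellized particle filter) supplies a proximity-to-obstacle penalty on top of the usual goal-progress, arrival, and collision terms.

```python
import numpy as np

# Hypothetical shaped reward for map-less navigation. The occupancy grid
# (values in [0, 1], 1 = occupied) is assumed to be estimated online,
# e.g. by a grid-based Rao-Blackwellized particle filter, and expressed
# in the same metric frame as robot_xy and goal_xy (index-to-metric
# mapping simplified for illustration).
def shaped_reward(robot_xy, goal_xy, prev_goal_dist, occupancy_grid,
                  resolution=0.05, collision=False, goal_radius=0.3):
    if collision:
        return -100.0                       # large penalty on collision
    goal_dist = np.linalg.norm(goal_xy - robot_xy)
    if goal_dist < goal_radius:
        return 100.0                        # large bonus on reaching the goal
    # Dense term: progress towards the goal since the previous step.
    progress = prev_goal_dist - goal_dist
    # Shaping term: distance to the nearest occupied cell in the online map,
    # meant to raise the agent's obstacle awareness during training.
    occupied = np.argwhere(occupancy_grid > 0.5) * resolution
    if len(occupied):
        obstacle_dist = np.min(np.linalg.norm(occupied - robot_xy, axis=1))
    else:
        obstacle_dist = np.inf
    obstacle_penalty = max(0.0, 1.0 - obstacle_dist)  # active within 1 m
    return 10.0 * progress - 5.0 * obstacle_penalty
```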
Related papers
- FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning [74.25049012472502]
FLaRe is a large-scale Reinforcement Learning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques.
Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance on previously demonstrated and on entirely novel tasks and embodiments.
arXiv Detail & Related papers (2024-09-25T03:15:17Z)

- Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks [93.38375271826202]
We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks.
We first build a simulator by integrating Gaussian splatting with quadrotor flight dynamics, and then train robust navigation policies using Liquid neural networks.
In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, programming of expert demonstration training data, and the task understanding capabilities of Liquid networks.
arXiv Detail & Related papers (2024-06-21T13:48:37Z)

- Deep Reinforcement Learning with Dynamic Graphs for Adaptive Informative Path Planning [22.48658555542736]
A key task in robotic data acquisition is planning paths through an initially unknown environment to collect observations.
We propose a novel deep reinforcement learning approach for adaptively replanning robot paths to map targets of interest in unknown 3D environments.
arXiv Detail & Related papers (2024-02-07T14:24:41Z)

- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insight is to utilize offline reinforcement learning techniques to enable efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)

- VAPOR: Legged Robot Navigation in Outdoor Vegetation Using Offline Reinforcement Learning [53.13393315664145]
We present VAPOR, a novel method for autonomous legged robot navigation in unstructured, densely vegetated outdoor environments.
Our method trains a novel RL policy using an actor-critic network and arbitrary data collected in real outdoor vegetation.
We observe that VAPOR's actions improve success rates by up to 40%, decrease the average current consumption by up to 2.9%, and decrease the normalized trajectory length by up to 11.2%.
arXiv Detail & Related papers (2023-09-14T16:21:27Z)

- XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees [55.9643422180256]
We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments.
Our approach uses a deep reinforcement learning-based expert policy that is trained using a sim2real paradigm.
We highlight the benefits of our algorithm in simulated environments and when navigating a Clearpath Jackal robot among moving pedestrians.
arXiv Detail & Related papers (2021-04-22T01:33:10Z)

- Accelerated Sim-to-Real Deep Reinforcement Learning: Learning Collision Avoidance from Human Player [5.960346570280513]
This paper presents a sensor-level mapless collision avoidance algorithm for use in mobile robots.
An efficient training strategy is proposed to allow a robot to learn from both human experience data and self-exploratory data.
A game-format simulation framework is designed to allow the human player to tele-operate the mobile robot to a goal.
arXiv Detail & Related papers (2021-02-21T23:27:34Z)
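One way such a mixed-experience strategy could be implemented is sketched below: a replay buffer that draws each training batch partly from human tele-operation transitions and partly from the robot's own exploratory transitions. The class name, buffer sizes, and 50/50 split are hypothetical, not taken from the paper.

```python
import random
from collections import deque

# Hypothetical replay buffer mixing human-demonstration and self-exploration
# data; each sampled batch contains a fixed fraction of human transitions.
class MixedReplayBuffer:
    def __init__(self, capacity=100_000, demo_fraction=0.5):
        self.demo = []                            # fixed human-player data
        self.self_play = deque(maxlen=capacity)   # robot's own experience
        self.demo_fraction = demo_fraction

    def add_demo(self, transition):
        self.demo.append(transition)

    def add_self(self, transition):
        self.self_play.append(transition)

    def sample(self, batch_size):
        n_demo = min(int(batch_size * self.demo_fraction), len(self.demo))
        batch = random.sample(self.demo, n_demo)
        n_self = min(batch_size - n_demo, len(self.self_play))
        batch += random.sample(list(self.self_play), n_self)
        return batch
```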

- An A* Curriculum Approach to Reinforcement Learning for RGBD Indoor Robot Navigation [6.660458629649825]
Recently released photo-realistic simulators such as Habitat allow for the training of networks that output control actions directly from perception.
Our paper addresses the heavy data requirements of such end-to-end training by separating the training of the perception and control neural networks and increasing the path complexity gradually.
arXiv Detail & Related papers (2021-01-05T20:35:14Z)

- Visual Navigation in Real-World Indoor Environments Using End-to-End Deep Reinforcement Learning [2.7071541526963805]
We propose a novel approach that enables a direct deployment of the trained policy on real robots.
The policy is fine-tuned on images collected from real-world environments.
In 30 navigation experiments, the robot reached a 0.3-meter neighborhood of the goal in more than 86.7% of cases.
arXiv Detail & Related papers (2020-10-21T11:22:30Z)

- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
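A common pattern for this kind of combination, sketched below purely as an illustration (the gating rule, threshold, and function names are our own assumptions, not the paper's), is to follow a model-based controller while the perception estimate is confident and hand control to a learned policy where it is not.

```python
import numpy as np

# Hypothetical uncertainty-gated hybrid controller: trust the model-based
# plan when the state estimate is confident, otherwise defer to the
# learned RL policy, which can act despite perception inaccuracies.
def select_action(state_estimate, state_std, model_based_ctrl, rl_policy,
                  uncertainty_threshold=0.05):
    if np.max(state_std) < uncertainty_threshold:
        # Perception is reliable: execute the model-based controller.
        return model_based_ctrl(state_estimate)
    # Perception is unreliable: fall back on the learned policy.
    return rl_policy(state_estimate)
```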

- Reward Engineering for Object Pick and Place Training [3.4806267677524896]
We have used the Pick and Place environment provided by OpenAI's Gym to engineer rewards.
In the default configuration of the OpenAI baseline and environment, the reward function is calculated using the distance between the target location and the robot end-effector.
We were also able to introduce certain user desired trajectories in the learnt policies.
arXiv Detail & Related papers (2020-01-11T20:13:28Z)
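A minimal sketch of such a distance-based reward is shown below, together with the sparse variant commonly used in the Gym robotics goal environments; the 5 cm success threshold is illustrative.

```python
import numpy as np

# Sketch of distance-based rewards for a pick-and-place task.
def dense_reward(achieved, target):
    # Negative Euclidean distance: values closer to 0 are better.
    return -np.linalg.norm(achieved - target)

def sparse_reward(achieved, target, threshold=0.05):
    # 0 on success, -1 otherwise, as in the Gym robotics goal environments.
    return 0.0 if np.linalg.norm(achieved - target) < threshold else -1.0
```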