Generalization in Deep Reinforcement Learning for Robotic Navigation by
Reward Shaping
- URL: http://arxiv.org/abs/2209.14271v2
- Date: Sat, 26 Aug 2023 14:07:48 GMT
- Title: Generalization in Deep Reinforcement Learning for Robotic Navigation by
Reward Shaping
- Authors: Victor R. F. Miranda, Armando A. Neto, Gustavo M. Freitas, Leonardo A.
Mozelli
- Abstract summary: We study the application of DRL algorithms in the context of local navigation problems.
Collision avoidance policies based on DRL present some advantages, but they are quite susceptible to local minima.
We propose a novel reward function that incorporates map information gained in the training stage, increasing the agent's capacity to deliberate about the best course of action.
- Score: 0.1588748438612071
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we study the application of DRL algorithms in the context of
local navigation problems, in which a robot moves towards a goal location in
unknown and cluttered workspaces equipped only with limited-range exteroceptive
sensors, such as LiDAR. Collision avoidance policies based on DRL offer some
advantages, but they are quite susceptible to local minima, because their
capacity to learn suitable actions is limited to the sensor range. Since most robots
perform tasks in unstructured environments, it is of great interest to seek
generalized local navigation policies capable of avoiding local minima,
especially in untrained scenarios. To do so, we propose a novel reward function
that incorporates map information gained in the training stage, increasing the
agent's capacity to deliberate about the best course of action. We also use
the Soft Actor-Critic (SAC) algorithm to train our ANN, which proves more
effective than other approaches in the state-of-the-art literature. A set of
sim-to-sim and sim-to-real experiments shows that our proposed reward combined
with SAC outperforms the compared methods in terms of local-minima and
collision avoidance.
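To make the idea concrete, here is a minimal sketch of how map information gathered during training could enter a shaped reward. It assumes a potential-based progress term over a geodesic (obstacle-aware) distance-to-goal field precomputed on the training map; the function names and weights are illustrative, not the paper's exact formulation.

    import numpy as np

    def shaped_reward(dist_field, prev_cell, cell, collided, reached_goal,
                      w_progress=1.0, r_goal=100.0, r_collision=-100.0):
        # dist_field[i, j]: geodesic (obstacle-aware) distance from grid cell
        # (i, j) to the goal, precomputed on the training map, e.g. with a
        # Dijkstra/wavefront pass.
        if collided:
            return r_collision
        if reached_goal:
            return r_goal
        # Potential-based shaping term: reward progress along the field.
        phi_prev = dist_field[prev_cell]
        phi_curr = dist_field[cell]
        return w_progress * (phi_prev - phi_curr)

Because the distance field respects obstacles, decreasing it cannot trap the agent behind a wall the way a plain Euclidean distance-to-goal term can, which is the intuition for why map knowledge helps against local minima.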
Related papers
- When to Localize? A Risk-Constrained Reinforcement Learning Approach [13.853127103435012]
In some scenarios, a robot needs to localize selectively because obtaining observations is expensive.
RiskRL is a risk-constrained Reinforcement Learning framework that addresses this problem.
arXiv Detail & Related papers (2024-11-05T03:54:00Z)
- Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications [11.812602599752294]
We consider robots with unknown dynamics operating in environments with unknown structure.
Our goal is to synthesize a control policy that maximizes the probability of satisfying an automaton-encoded task.
We propose a novel DRL algorithm that learns control policies at a notably faster rate than similar methods.
arXiv Detail & Related papers (2023-11-28T18:59:58Z)
- CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration [72.24964965882783]
Confidence-Controlled Exploration (CCE) is designed to enhance the training sample efficiency of reinforcement learning algorithms for sparse reward settings such as robot navigation.
CCE is based on a novel relationship we provide between gradient estimation and policy entropy.
We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization.
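The abstract does not spell out the rule, so the following is only one plausible reading of confidence-controlled exploration: treat policy entropy as an inverse confidence signal and adapt the rollout length between fixed bounds. Every name and the scaling rule below are assumptions for illustration, not CCE's published mechanism.

    import numpy as np

    def adaptive_traj_len(action_log_probs, min_len=64, max_len=1024):
        # Monte Carlo entropy estimate from recent action log-probabilities.
        entropy = -np.mean(action_log_probs)
        # Squash entropy into (0, 1): low entropy -> high confidence.
        confidence = 1.0 / (1.0 + np.exp(entropy))
        # Confident policies get longer rollouts; uncertain ones get shorter.
        return int(min_len + confidence * (max_len - min_len))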
arXiv Detail & Related papers (2023-06-09T18:45:15Z)
- Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network.
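A minimal sketch of temporally-correlated latent noise, using an Ornstein-Uhlenbeck-style update on a perturbation added to the policy's hidden features; the architecture and coefficients are illustrative assumptions rather than Lattice's exact design.

    import torch
    import torch.nn as nn

    class LatentNoisePolicy(nn.Module):
        """Policy that perturbs its latent state with temporally-correlated
        noise, so consecutive actions explore in a coherent direction."""

        def __init__(self, obs_dim, act_dim, hidden=64, theta=0.15, sigma=0.2):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
            self.head = nn.Linear(hidden, act_dim)
            self.theta, self.sigma = theta, sigma
            self.register_buffer("noise", torch.zeros(hidden))

        def forward(self, obs):
            # OU-style update: the noise decays toward zero but stays
            # correlated across timesteps, unlike i.i.d. Gaussian action noise.
            self.noise = (1 - self.theta) * self.noise \
                         + self.sigma * torch.randn_like(self.noise)
            latent = self.encoder(obs) + self.noise
            return self.head(latent)

Because the perturbation persists across steps, exploration drifts in a coherent direction instead of jittering, which is the point of correlating the noise in time.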
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
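The composition described above is simple enough to sketch directly; the critic modules here are placeholders (any state-action networks), not the paper's implementation.

    import torch

    def multiplicative_value(safety_critic, reward_critic, state, action):
        # Safety critic: probability that (state, action) remains
        # constraint-free; reward critic: estimate of constraint-free return.
        p_safe = torch.sigmoid(safety_critic(state, action))
        q_reward = reward_critic(state, action)
        # The violation probability discounts the reward estimate.
        return p_safe * q_reward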
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- DDPEN: Trajectory Optimisation With Sub Goal Generation Model [70.36888514074022]
In this paper, we introduce a novel Differential Dynamic Programming with Escape Network (DDPEN).
We propose to utilize a deep model that takes as input a map of the environment, in the form of a costmap, together with the desired position.
The model produces possible future directions that lead towards the goal while avoiding local minima, and it can run in real-time conditions.
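A rough sketch of the kind of model described: a small CNN over the costmap, concatenated with the desired position, scoring a fixed set of candidate directions. The layer sizes and the eight-direction discretization are assumptions for illustration.

    import torch
    import torch.nn as nn

    class SubGoalDirectionNet(nn.Module):
        """Hypothetical escape-network sketch: costmap + desired position in,
        scores over 8 candidate directions out. Picking the best-scoring
        direction yields a sub-goal that steers the optimizer out of local
        minima."""

        def __init__(self, map_size=64, n_dirs=8):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
            feat = self._conv_out(map_size)
            self.fc = nn.Sequential(nn.Linear(feat + 2, 128), nn.ReLU(),
                                    nn.Linear(128, n_dirs))

        def _conv_out(self, s):
            with torch.no_grad():
                return self.conv(torch.zeros(1, 1, s, s)).shape[1]

        def forward(self, costmap, goal_xy):
            # costmap: (B, 1, H, W); goal_xy: (B, 2) in map coordinates.
            x = torch.cat([self.conv(costmap), goal_xy], dim=1)
            return self.fc(x)  # direction logits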
arXiv Detail & Related papers (2023-01-18T11:02:06Z)
- Verifying Learning-Based Robotic Navigation Systems [61.01217374879221]
We show how modern verification engines can be used for effective model selection.
Specifically, we use verification to detect and rule out policies that may demonstrate suboptimal behavior.
Our work is the first to demonstrate the use of verification backends for recognizing suboptimal DRL policies in real-world robots.
arXiv Detail & Related papers (2022-05-26T17:56:43Z)
- MADE: Exploration via Maximizing Deviation from Explored Regions [48.49228309729319]
In online reinforcement learning (RL), efficient exploration remains challenging in high-dimensional environments with sparse rewards.
We propose a new exploration approach via maximizing the deviation of the occupancy of the next policy from the explored regions.
Our approach significantly improves sample efficiency over state-of-the-art methods.
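As a toy approximation of that objective for tabular states, an intrinsic bonus can favor states whose occupancy deviates from the already-explored region; MADE derives its bonus from the occupancy-deviation objective itself, so the count-based form below is only an illustrative stand-in.

    from collections import defaultdict

    class DeviationBonus:
        """Toy intrinsic bonus: rarely visited states (far from the explored
        occupancy) get a large bonus, heavily visited ones almost none."""

        def __init__(self, scale=1.0):
            self.counts = defaultdict(int)
            self.scale = scale

        def __call__(self, state):
            self.counts[state] += 1
            # Count-based stand-in for deviation from explored occupancy.
            return self.scale / (self.counts[state] ** 0.5)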
arXiv Detail & Related papers (2021-06-18T17:57:00Z)
- Rule-Based Reinforcement Learning for Efficient Robot Navigation with Space Reduction [8.279526727422288]
In this paper, we focus on efficient navigation with the reinforcement learning (RL) technique.
We employ a reduction rule to shrink the trajectory, which in turn effectively reduces the redundant exploration space.
Experiments conducted on real robot navigation problems in hex-grid environments demonstrate that RuRL can achieve improved navigation performance.
arXiv Detail & Related papers (2021-04-15T07:40:27Z)
- On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach [7.488722678999039]
We present a map-less path planning algorithm based on Deep Reinforcement Learning (DRL) for mobile robots navigating in an unknown environment.
The planner is trained using a reward function shaped based on the online knowledge of the map of the training environment.
The policy trained in the simulation environment can be directly and successfully transferred to the real robot.
arXiv Detail & Related papers (2020-02-10T22:00:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.