Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with
Reward Shaping
- URL: http://arxiv.org/abs/2003.12863v2
- Date: Fri, 10 Apr 2020 00:56:57 GMT
- Title: Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with
Reward Shaping
- Authors: Daniel Zhang, Colleen P. Bailey
- Abstract summary: We propose revised Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization algorithms with an improved reward shaping technique.
We compare the performances between the original DDPG and PPO with the revised version of both on simulations with a real mobile robot and demonstrate that the proposed algorithms achieve better results.
- Score: 7.132368785057316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the obstacle avoidance and navigation problem
in the robotic control area. For solving such a problem, we propose revised
Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization
algorithms with an improved reward shaping technique. We compare the
performances between the original DDPG and PPO with the revised version of both
on simulations with a real mobile robot and demonstrate that the proposed
algorithms achieve better results.
Related papers
- Research on reinforcement learning based warehouse robot navigation algorithm in complex warehouse layout [13.945240113332352]
This paper proposes a new method of Proximal Policy Optimization (PPO) and Dijkstra's algorithm, Proximal policy-Dijkstra (PP-D)
PP-D method realizes efficient strategy learning and real-time decision making through PPO, and uses Dijkstra algorithm to plan the global optimal path.
arXiv Detail & Related papers (2024-11-09T09:44:03Z) - Deep Reinforcement Learning for Online Optimal Execution Strategies [49.1574468325115]
This paper tackles the challenge of learning non-Markovian optimal execution strategies in dynamic financial markets.
We introduce a novel actor-critic algorithm based on Deep Deterministic Policy Gradient (DDPG)
We show that our algorithm successfully approximates the optimal execution strategy.
arXiv Detail & Related papers (2024-10-17T12:38:08Z) - Autonomous Navigation of Unmanned Vehicle Through Deep Reinforcement Learning [1.3725832537448668]
The paper details the model of a Ackermann robot and the structure and application of the DDPG algorithm.
The results demonstrate that the DDPG algorithm outperforms traditional Deep Q-Network (DQN) and Double Deep Q-Network (DDQN) algorithms in path planning tasks.
arXiv Detail & Related papers (2024-07-18T05:18:59Z) - An Improved Artificial Fish Swarm Algorithm for Solving the Problem of
Investigation Path Planning [8.725702964289479]
We propose a chaotic artificial fish swarm algorithm based on multiple population differential evolution (DE-CAFSA)
We incorporate adaptive field of view and step size adjustments, replace random behavior with the 2-opt operation, and introduce chaos theory and sub-optimal solutions.
Experimental results demonstrate that DE-CAFSA outperforms other algorithms on various public datasets of different sizes.
arXiv Detail & Related papers (2023-10-20T09:35:51Z) - Reinforcement Learning for Robot Navigation with Adaptive Forward
Simulation Time (AFST) in a Semi-Markov Model [20.91419349793292]
We propose the first DRL-based navigation method modeled by a semi-Markov decision process (SMDP) with continuous action space, named Adaptive Forward Time Simulation (AFST) to overcome this problem.
arXiv Detail & Related papers (2021-08-13T10:30:25Z) - Generative Actor-Critic: An Off-policy Algorithm Using the Push-forward
Model [24.030426634281643]
In continuous control tasks, widely used policies with Gaussian distributions results in ineffective exploration of environments.
We propose a density-free off-policy algorithm, Generative Actor-Critic, using the push-forward model to increase the expressiveness of policies.
We show that push-forward policies possess desirable features, such as multi-modality, which can improve the efficiency of exploration and performance of algorithms obviously.
arXiv Detail & Related papers (2021-05-08T16:29:20Z) - XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision
Trees [55.9643422180256]
We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments.
Our approach uses deep reinforcement learning-based expert policy that is trained using a sim2real paradigm.
We highlight the benefits of our algorithm in simulated environments and navigating a Clearpath Jackal robot among moving pedestrians.
arXiv Detail & Related papers (2021-04-22T01:33:10Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (D PDP) to combine the strengths of learned neurals with those of dynamic programming algorithms.
D PDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
arXiv Detail & Related papers (2021-02-23T15:33:57Z) - Logistic Q-Learning [87.00813469969167]
We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs.
The main feature of our algorithm is a convex loss function for policy evaluation that serves as a theoretically sound alternative to the widely used squared Bellman error.
arXiv Detail & Related papers (2020-10-21T17:14:31Z) - Implementation Matters in Deep Policy Gradients: A Case Study on PPO and
TRPO [90.90009491366273]
We study the roots of algorithmic progress in deep policy gradient algorithms through a case study on two popular algorithms.
Specifically, we investigate the consequences of "code-level optimizations:"
Our results show that they (a) are responsible for most of PPO's gain in cumulative reward over TRPO, and (b) fundamentally change how RL methods function.
arXiv Detail & Related papers (2020-05-25T16:24:59Z) - Optimization-driven Deep Reinforcement Learning for Robust Beamforming
in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.