An Improved Algorithm of Robot Path Planning in Complex Environment
Based on Double DQN
- URL: http://arxiv.org/abs/2107.11245v1
- Date: Fri, 23 Jul 2021 14:03:04 GMT
- Title: An Improved Algorithm of Robot Path Planning in Complex Environment
Based on Double DQN
- Authors: Fei Zhang, Chaochen Gu, and Feng Yang
- Abstract summary: This paper proposes an improved Double DQN (DDQN) that addresses the problem by drawing on A* and Rapidly-Exploring Random Tree (RRT).
The simulation experimental results validate the efficiency of the improved DDQN.
- Score: 4.161177874372099
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: According to our experiments, Deep Q Network (DQN) has several limitations when applied to path planning in environments containing many dilemmas: the reward function can be hard to model, and successful experience transitions are difficult to find in experience replay. In this context, this paper proposes an improved Double DQN (DDQN) that draws on A* and Rapidly-Exploring Random Tree (RRT) to address these problems. To enrich the experience stored in experience replay, the robot's initial position in each training round is redefined based on an RRT strategy. In addition, the reward for free positions is specially designed, following the definition of position cost in A*, to accelerate learning. Simulation results validate the efficiency of the improved DDQN: the robot successfully learns obstacle avoidance and optimal path planning in environments where standard DQN and DDQN fail.
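To make the two modifications concrete, here is a minimal sketch assuming a grid-world setting; the helper names (`astar_style_reward`, `rrt_start_states`) and all constants are illustrative, not the authors' implementation.

```python
import random
import numpy as np

def astar_style_reward(pos, goal, obstacles, collision_penalty=-1.0, goal_reward=1.0):
    """Shaped reward for free cells, inspired by the A* position cost.

    Free positions closer to the goal (smaller heuristic cost) receive a
    slightly higher reward, which is the kind of shaping the abstract
    attributes to the A* position-cost definition.
    """
    if tuple(pos) in obstacles:
        return collision_penalty
    if tuple(pos) == tuple(goal):
        return goal_reward
    h = np.linalg.norm(np.asarray(pos) - np.asarray(goal), ord=1)  # Manhattan heuristic
    return -0.01 * h  # small cost that shrinks toward the goal

def rrt_start_states(goal, grid_size, obstacles, n_nodes=200, step=1):
    """Grow a simple RRT rooted at the goal and return its nodes.

    Sampling the robot's initial position from these nodes at the start of
    each training round concentrates experience on states from which the
    goal is reachable, i.e. the RRT-based initialization idea.
    """
    nodes = [tuple(goal)]
    for _ in range(n_nodes):
        sample = (random.randrange(grid_size), random.randrange(grid_size))
        nearest = min(nodes, key=lambda n: abs(n[0] - sample[0]) + abs(n[1] - sample[1]))
        new = (int(nearest[0] + np.sign(sample[0] - nearest[0]) * step),
               int(nearest[1] + np.sign(sample[1] - nearest[1]) * step))
        if new not in obstacles:
            nodes.append(new)
    return nodes

# Per-episode usage: redefine the robot's initial position from the RRT nodes.
# start = random.choice(rrt_start_states(goal=(9, 9), grid_size=10, obstacles=set()))
```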
Related papers
- Back-stepping Experience Replay with Application to Model-free Reinforcement Learning for a Soft Snake Robot [15.005962159112002]
Back-stepping Experience Replay (BER) is compatible with arbitrary off-policy reinforcement learning algorithms.
We present an application of BER in a model-free RL approach for the locomotion and navigation of a soft snake robot.
arXiv Detail & Related papers (2024-01-21T02:17:16Z)
- M$^2$DQN: A Robust Method for Accelerating Deep Q-learning Network [6.689964384669018]
We propose a framework that uses the Max-Mean loss in Deep Q-Network (M$^2$DQN).
Instead of sampling one batch of experiences in the training step, we sample several batches from the experience replay and update the parameters such that the maximum TD-error of these batches is minimized.
We verify the effectiveness of this framework with one of the most widely used techniques, Double DQN (DDQN) in several gym games.
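As a rough illustration of the max-mean update described above, the sketch below assumes a PyTorch-style Double DQN setup; the `replay_buffer.sample` interface and the hyperparameters are placeholders, not the authors' code.

```python
import torch

def m2dqn_step(replay_buffer, online_net, target_net, optimizer,
               n_batches=4, batch_size=64, gamma=0.99):
    """One update following the max-mean idea: draw several batches and
    minimize the largest per-batch TD loss instead of a single batch's."""
    batch_losses = []
    for _ in range(n_batches):
        s, a, r, s_next, done = replay_buffer.sample(batch_size)  # placeholder API
        with torch.no_grad():
            # Double-DQN target: online net selects the action, target net evaluates it.
            a_next = online_net(s_next).argmax(dim=1, keepdim=True)
            q_next = target_net(s_next).gather(1, a_next).squeeze(1)
            target = r + gamma * (1.0 - done) * q_next
        q = online_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        batch_losses.append(torch.nn.functional.smooth_l1_loss(q, target))
    loss = torch.stack(batch_losses).max()  # optimize the worst (max-error) batch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```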
arXiv Detail & Related papers (2022-09-16T09:20:35Z)
- Relay Hindsight Experience Replay: Continual Reinforcement Learning for Robot Manipulation Tasks with Sparse Rewards [26.998587654269873]
We propose a novel model-free continual RL algorithm called Relay-HER (RHER).
The proposed method first decomposes and rearranges the original long-horizon task into new sub-tasks of incremental complexity.
The experimental results indicate significant improvements in the sample efficiency of RHER compared to vanilla HER on five typical robot manipulation tasks.
arXiv Detail & Related papers (2022-08-01T13:30:01Z)
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
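A minimal sketch of such residual-driven resampling is given below, assuming a NumPy setting where `residual_fn` returns per-point PDE residual magnitudes of the current PINN; the function names and the proportional-sampling rule are illustrative rather than the scheme proposed in the paper.

```python
import numpy as np

def resample_collocation(residual_fn, domain_lo, domain_hi,
                         n_candidates=10_000, n_points=1_000, rng=None):
    """Allocate more collocation points where the PDE residual is large.

    Candidate points are drawn uniformly over the domain, then kept with
    probability proportional to the current model's residual error, so
    regions with higher error receive more collocation points.
    """
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = np.asarray(domain_lo, float), np.asarray(domain_hi, float)
    candidates = rng.uniform(lo, hi, size=(n_candidates, lo.size))
    err = np.abs(residual_fn(candidates)) + 1e-12      # avoid zero probabilities
    probs = err / err.sum()
    idx = rng.choice(n_candidates, size=n_points, replace=False, p=probs)
    return candidates[idx]
```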
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
- Learning to Walk Autonomously via Reset-Free Quality-Diversity [73.08073762433376]
Quality-Diversity algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills.
Existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions.
This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments.
arXiv Detail & Related papers (2022-04-07T14:07:51Z)
- Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with lower energy consumption.
Combining SNNs with deep reinforcement learning (RL) provides a promising, energy-efficient approach to realistic control tasks.
arXiv Detail & Related papers (2022-01-21T16:42:11Z)
- Few-shot Quality-Diversity Optimization [50.337225556491774]
Quality-Diversity (QD) optimization has been shown to be an effective tool for dealing with deceptive minima and sparse rewards in Reinforcement Learning.
We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population, which when used to initialize QD methods in unseen environments, allows for few-shot adaptation.
Experiments carried out in both sparse and dense reward settings on robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations required for QD optimization in these environments.
arXiv Detail & Related papers (2021-09-14T17:12:20Z)
- RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering [55.280108297460636]
In open-domain question answering, dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers.
We propose an optimized training approach, called RocketQA, to improve dense passage retrieval.
We make three major technical contributions in RocketQA, namely cross-batch negatives, denoised hard negatives and data augmentation.
arXiv Detail & Related papers (2020-10-16T06:54:05Z)
- Hindsight Experience Replay with Kronecker Product Approximate Curvature [5.441932327359051]
Hindsight Experience Replay (HER) is an efficient algorithm for solving reinforcement learning tasks.
However, due to reduced sample efficiency and slower convergence, HER can fail to perform effectively.
Natural gradients address these challenges by improving the convergence of the model parameters.
Our proposed method addresses the above challenges with better sample efficiency, faster convergence, and an increased success rate.
arXiv Detail & Related papers (2020-10-09T20:25:14Z)
- Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient [51.880464915253924]
Deep Q-learning algorithms often suffer from poor gradient estimations with an excessive variance.
This paper introduces a stochastic recursive gradient framework for updating the gradient estimates in deep Q-learning, yielding a novel algorithm called SRG-DQN.
arXiv Detail & Related papers (2020-07-25T00:54:20Z)