Reinforcement Learning for Robot Navigation with Adaptive Forward
Simulation Time (AFST) in a Semi-Markov Model
- URL: http://arxiv.org/abs/2108.06161v4
- Date: Tue, 4 Jul 2023 12:43:55 GMT
- Title: Reinforcement Learning for Robot Navigation with Adaptive Forward
Simulation Time (AFST) in a Semi-Markov Model
- Authors: Yu'an Chen, Ruosong Ye, Ziyang Tao, Hongjian Liu, Guangda Chen, Jie
Peng, Jun Ma, Yu Zhang, Jianmin Ji and Yanyong Zhang
- Abstract summary: We propose the first DRL-based navigation method modeled by a semi-Markov decision process (SMDP) with continuous action space, named Adaptive Forward Simulation Time (AFST), to overcome this problem.
- Score: 20.91419349793292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep reinforcement learning (DRL) algorithms have proven effective in robot
navigation, especially in unknown environments, by directly mapping perception
inputs into robot control commands. However, most existing methods ignore the
local minimum problem in navigation and thereby cannot handle complex unknown
environments. In this paper, we propose the first DRL-based navigation method
modeled by a semi-Markov decision process (SMDP) with continuous action space,
named Adaptive Forward Simulation Time (AFST), to overcome this problem.
Specifically, we reduce the dimensions of the action space and improve the
distributed proximal policy optimization (DPPO) algorithm for the specified
SMDP problem by modifying its GAE to better estimate the policy gradient in
SMDPs. Experiments in various unknown environments demonstrate the
effectiveness of AFST.
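The abstract's key algorithmic change is adapting generalized advantage estimation (GAE) to an SMDP, where each action persists for a variable duration (the forward simulation time chosen by the policy), so the discount across a transition becomes gamma raised to that duration rather than a fixed gamma. The sketch below illustrates this duration-aware discounting under that assumption; the function name and exact formulation are illustrative, not the authors' code.

```python
def smdp_gae(rewards, values, durations, gamma=0.99, lam=0.95):
    """GAE with variable-duration (SMDP) transitions.

    Each step t carries a duration tau_t, so the discount applied
    across that transition is gamma**tau_t instead of a fixed gamma.
    Assumes the episode terminates after the last step (bootstrap
    value 0). Illustrative sketch, not the AFST implementation.
    """
    T = len(rewards)
    advantages = [0.0] * T
    gae = 0.0
    for t in reversed(range(T)):
        next_value = values[t + 1] if t + 1 < T else 0.0
        discount = gamma ** durations[t]      # duration-aware discount
        delta = rewards[t] + discount * next_value - values[t]
        gae = delta + discount * lam * gae    # propagate with the same discount
        advantages[t] = gae
    return advantages
```

With all durations equal to 1 this reduces to standard GAE, which is a quick sanity check for the modification.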
Related papers
- Deep-Sea A*+: An Advanced Path Planning Method Integrating Enhanced A* and Dynamic Window Approach for Autonomous Underwater Vehicles [1.3807821497779342]
Extreme conditions in the deep-sea environment pose significant challenges for underwater operations.
We propose an advanced path planning methodology that integrates an improved A* algorithm with the Dynamic Window Approach (DWA).
Our proposed method surpasses the traditional A* algorithm in terms of path smoothness, obstacle avoidance, and real-time performance.
arXiv Detail & Related papers (2024-10-22T07:29:05Z)
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU are naturally modeled as Multistage Problems (MSPs) but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach Two-Stage General Decision Rules (TS-GDR) to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using Deep Recurrent Neural Networks, named Two-Stage Deep Decision Rules (TS-LDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z)
- Guidance Design for Escape Flight Vehicle Using Evolution Strategy Enhanced Deep Reinforcement Learning [6.037202026682975]
We consider the scenario where the escape flight vehicle (EFV) generates guidance commands based on DRL and the pursuit flight vehicle (PFV) generates guidance commands based on the proportional navigation method.
For the EFV, the objective of the guidance design entails progressively maximizing the residual velocity, subject to the constraint imposed by the given evasion distance.
In the first step, we use the proximal policy optimization (PPO) algorithm to generate the guidance commands of the EFV.
In the second step, we propose to invoke the evolution strategy (ES) based algorithm, which uses the result of PPO as the
arXiv Detail & Related papers (2024-05-04T06:18:15Z)
- Variational Autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance [0.0]
Deep Reinforcement Learning (DRL) has emerged as a promising control framework.
Current DRL algorithms require disproportionately large computational resources to find near-optimal policies.
This paper presents a comprehensive exploration of our proposed approach in maritime control systems.
arXiv Detail & Related papers (2024-03-31T09:25:28Z)
- GP-guided MPPI for Efficient Navigation in Complex Unknown Cluttered Environments [2.982218441172364]
This study presents GP-MPPI, an online learning-based control strategy that integrates Model Predictive Path Integral (MPPI) control with a local perception model.
We validate the efficiency and robustness of our proposed control strategy through both simulated and real-world experiments of 2D autonomous navigation tasks.
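For context, the core MPPI step samples perturbed control sequences, scores them with a rollout cost, and returns an exponentially cost-weighted average. The sketch below is a minimal single-iteration version of that standard update; the function name, defaults, and 1-D control assumption are illustrative and not taken from the GP-MPPI paper.

```python
import math
import random

def mppi_update(u_nominal, rollout_cost, n_samples=64, sigma=0.5, lam=1.0):
    """One MPPI iteration over a control sequence (illustrative sketch).

    Samples Gaussian perturbations around the nominal controls, scores
    each perturbed sequence with rollout_cost, and blends them with
    weights exp(-cost / lam), so low-cost samples dominate.
    """
    H = len(u_nominal)
    noises, costs = [], []
    for _ in range(n_samples):
        eps = [random.gauss(0.0, sigma) for _ in range(H)]
        u = [u_nominal[t] + eps[t] for t in range(H)]
        noises.append(eps)
        costs.append(rollout_cost(u))
    c_min = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    w_sum = sum(weights)
    return [u_nominal[t]
            + sum(w * eps[t] for w, eps in zip(weights, noises)) / w_sum
            for t in range(H)]
```

Smaller `lam` concentrates the weights on the best-scoring samples; larger `lam` averages more broadly.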
arXiv Detail & Related papers (2023-07-08T17:33:20Z)
- DDPEN: Trajectory Optimisation With Sub Goal Generation Model [70.36888514074022]
In this paper, we present Differential Dynamic Programming with Escape Network (DDPEN), a novel trajectory optimisation method.
We propose to utilize a deep model that takes as input a map of the environment in the form of a costmap together with the desired position.
The model produces possible future directions that lead towards the goal while avoiding local minima, and is able to run in real-time conditions.
arXiv Detail & Related papers (2023-01-18T11:02:06Z)
- Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new policy gradient algorithm for TMDPs obtained by a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
- Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization [63.75188254377202]
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to discrepancy between source and target environments.
We propose State-Conservative Policy Optimization (SCPO), a novel model-free actor-critic algorithm that learns robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z)
- SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots [112.2491765424719]
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use stochastic model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics, and consider uncertainty during obstacle avoidance with chance constraints.
Recurrent neural networks are used to provide a quick estimate of future state uncertainty considered in the SMPC finite-time horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
arXiv Detail & Related papers (2021-08-03T02:56:21Z)
- Learning Sampling Policy for Faster Derivative Free Optimization [100.27518340593284]
We propose a new reinforcement learning based ZO algorithm (ZO-RL) that learns the sampling policy for generating the perturbations in ZO optimization, instead of using random sampling.
Our results show that ZO-RL can effectively reduce the variance of the ZO gradient by learning a sampling policy, and converges faster than existing ZO algorithms in different scenarios.
arXiv Detail & Related papers (2021-04-09T14:50:59Z)
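For context on what ZO-RL improves: standard zeroth-order (ZO) optimization estimates a gradient purely from function evaluations along randomly sampled directions. The sketch below shows that plain random-sampling baseline; ZO-RL's contribution is to draw the directions from a learned sampling policy instead, which is not shown here.

```python
import random

def zo_gradient(f, x, mu=1e-4, n_dirs=20):
    """Two-point zeroth-order gradient estimate with random Gaussian
    directions: grad ~ mean of ((f(x + mu*u) - f(x)) / mu) * u.

    This is the random-sampling baseline; replacing the random
    directions u with samples from a learned policy is the idea
    behind ZO-RL. Illustrative sketch, names are assumptions.
    """
    d = len(x)
    grad = [0.0] * d
    fx = f(x)
    for _ in range(n_dirs):
        u = [random.gauss(0.0, 1.0) for _ in range(d)]
        x_pert = [x[i] + mu * u[i] for i in range(d)]
        scale = (f(x_pert) - fx) / mu  # directional-derivative estimate
        for i in range(d):
            grad[i] += scale * u[i] / n_dirs
    return grad
```

The variance of this estimator grows with the dimension and the spread of the sampled directions, which is exactly the quantity a learned sampling policy tries to shrink.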
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.