Trajectory Planning for Autonomous Vehicles Using Hierarchical
Reinforcement Learning
- URL: http://arxiv.org/abs/2011.04752v1
- Date: Mon, 9 Nov 2020 20:49:54 GMT
- Title: Trajectory Planning for Autonomous Vehicles Using Hierarchical
Reinforcement Learning
- Authors: Kaleb Ben Naveed, Zhiqian Qiao and John M. Dolan
- Abstract summary: Planning safe trajectories under uncertain and dynamic conditions makes the autonomous driving problem significantly complex.
Current sampling-based methods such as Rapidly Exploring Random Trees (RRTs) are not ideal for this problem because of the high computational cost.
We propose a Hierarchical Reinforcement Learning structure combined with a Proportional-Integral-Derivative (PID) controller for trajectory planning.
- Score: 21.500697097095408
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Planning safe trajectories under uncertain and dynamic conditions makes the
autonomous driving problem significantly complex. Current sampling-based
methods such as Rapidly Exploring Random Trees (RRTs) are not ideal for this
problem because of the high computational cost. Supervised learning methods
such as Imitation Learning lack generalization and safety guarantees. To
address these problems and in order to ensure a robust framework, we propose a
Hierarchical Reinforcement Learning (HRL) structure combined with a
Proportional-Integral-Derivative (PID) controller for trajectory planning. HRL
helps divide the task of autonomous vehicle driving into sub-goals and enables
the network to learn policies for both high-level options and low-level
trajectory planner choices. The introduction of sub-goals decreases convergence
time and enables the policies learned to be reused for other scenarios. In
addition, the proposed planner is made robust by guaranteeing smooth
trajectories and by handling the noisy perception system of the ego-car. The
PID controller is used for tracking the waypoints, which ensures smooth
trajectories and reduces jerk. The problem of incomplete observations is
handled by using a Long Short-Term Memory (LSTM) layer in the network. Results
from the high-fidelity CARLA simulator indicate that the proposed method
reduces convergence time, generates smoother trajectories, and is able to
handle dynamic surroundings and noisy observations.
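The abstract describes the stack at a high level: a high-level policy (with an LSTM to aggregate noisy, incomplete observations) selects options, a low-level planner choice turns the option into waypoints, and a PID controller tracks those waypoints. The sketch below is a minimal illustration of that division of labor, not the authors' code: the option names, gains, and stub policies are invented.

```python
import numpy as np

class PID:
    """Discrete-time PID controller; in the paper a PID controller tracks
    the planned waypoints, which smooths trajectories and reduces jerk."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Hypothetical option set; the paper's actual sub-goal definitions may differ.
OPTIONS = ("follow_lane", "decelerate_to_stop", "change_lane")

def high_level_option(observation_history):
    """Stand-in for the learned high-level policy. In the paper this network
    contains an LSTM layer, so a history of noisy/incomplete observations is
    aggregated over time before an option is selected."""
    return OPTIONS[0]  # placeholder decision

def low_level_waypoints(option, pose, horizon=10, step=1.0):
    """Stand-in for the learned low-level planner choice: emit waypoints that
    realize the selected option (here simply continuing along the heading)."""
    x, y, heading = pose
    return [(x + step * (i + 1) * np.cos(heading),
             y + step * (i + 1) * np.sin(heading)) for i in range(horizon)]

pose = (0.0, 0.0, 0.0)  # x, y, heading of the ego-car
pid = PID(kp=1.2, ki=0.01, kd=0.3, dt=0.05)
option = high_level_option(observation_history=[])
for wx, wy in low_level_waypoints(option, pose):
    # Toy 1-D cross-track error; vehicle dynamics and pose updates omitted.
    steer = pid.step(wy - pose[1])
```

Because the waypoint tracking is classical rather than learned, trajectory smoothness does not rest on the learned policy alone, which is the robustness argument the abstract makes.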
Related papers
- TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning [61.33599727106222]
TeLL-Drive is a hybrid framework that integrates a Teacher LLM to guide an attention-based Student DRL policy.
A self-attention mechanism then fuses these strategies with the DRL agent's exploration, accelerating policy convergence and boosting robustness.
arXiv Detail & Related papers (2025-02-03T14:22:03Z)
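As a toy illustration of the fusion step described in the TeLL-Drive entry above (hypothetical: the paper places the attention inside the student network, and the exact mechanism is not specified in this summary), teacher and student action logits can be mixed with attention-style weights:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fuse_teacher_student(student_logits, teacher_logits,
                         query, key_student, key_teacher):
    """Hypothetical attention-style fusion: weight the two strategies by the
    similarity of their keys to a state-dependent query, then mix logits."""
    scores = np.array([query @ key_student,
                       query @ key_teacher]) / np.sqrt(len(query))
    w_student, w_teacher = softmax(scores)
    return w_student * student_logits + w_teacher * teacher_logits

rng = np.random.default_rng(0)
query = rng.normal(size=8)  # would come from the current driving state
fused = fuse_teacher_student(student_logits=rng.normal(size=4),
                             teacher_logits=rng.normal(size=4),
                             query=query,
                             key_student=rng.normal(size=8),
                             key_teacher=rng.normal(size=8))
action = int(np.argmax(softmax(fused)))  # exploration noise omitted
```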
- Monte Carlo Tree Search with Velocity Obstacles for safe and efficient motion planning in dynamic environments [49.30744329170107]
We propose a novel approach for optimal online motion planning with minimal information about dynamic obstacles.
The proposed methodology combines Monte Carlo Tree Search (MCTS), for online optimal planning via model simulations, with Velocity Obstacles (VO), for obstacle avoidance.
We show the superiority of our methodology with respect to state-of-the-art planners, including Non-linear Model Predictive Control (NMPC), in terms of collision rate, computational cost, and task performance.
arXiv Detail & Related papers (2025-01-16T16:45:08Z)
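A minimal sketch of the combination described in the MCTS entry above: a velocity-obstacle feasibility test prunes candidate velocities before the tree search expands them. This uses a sampled constant-velocity collision check rather than the analytic VO cone, and all obstacle parameters are illustrative:

```python
import numpy as np

def violates_velocity_obstacle(p_ego, v_cand, p_obs, v_obs, radius,
                               horizon=5.0, dt=0.1):
    """Return True if the candidate velocity brings the ego within `radius`
    of the obstacle inside `horizon` seconds, assuming both agents keep a
    constant velocity (a sampled approximation of the VO cone test)."""
    for t in np.arange(dt, horizon + dt, dt):
        if np.linalg.norm((p_ego + v_cand * t) - (p_obs + v_obs * t)) < radius:
            return True
    return False

# Prune candidate actions; only VO-feasible velocities would be expanded
# as children in the MCTS search tree.
p_ego = np.array([0.0, 0.0])
p_obs = np.array([5.0, 0.0])
v_obs = np.array([-1.0, 0.0])
candidates = [np.array([1.0, 0.0]), np.array([1.0, 1.0]), np.array([0.0, 1.0])]
safe = [v for v in candidates
        if not violates_velocity_obstacle(p_ego, v, p_obs, v_obs, radius=1.0)]
```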
- SCoTT: Wireless-Aware Path Planning with Vision Language Models and Strategic Chains-of-Thought [78.53885607559958]
A novel approach using vision language models (VLMs) is proposed for enabling path planning in complex wireless-aware environments.
To this end, insights from a digital twin with real-world wireless ray tracing data are explored.
Results show that SCoTT achieves very close average path gains compared to DP-WA* while at the same time yielding consistently shorter path lengths.
arXiv Detail & Related papers (2024-11-27T10:45:49Z)
- Residual Chain Prediction for Autonomous Driving Path Planning [5.139918355140954]
Residual Chain Loss dynamically adjusts the loss calculation process to enhance the temporal dependency and accuracy of predicted path points.
Our findings highlight the potential of Residual Chain Loss to revolutionize the planning component of autonomous driving systems.
arXiv Detail & Related papers (2024-04-08T11:43:40Z)
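The summary above does not pin down the loss; one plausible reading (an assumption, not necessarily the paper's formulation) is to chain predicted per-step residuals into absolute path points with a cumulative sum and penalize the chained points, so an early-step error also appears in every later term:

```python
import numpy as np

def residual_chain_loss(pred_residuals, gt_points, start):
    """Hypothetical residual-chain objective: accumulate predicted step
    residuals into absolute path points, then penalize the chained points,
    which couples each prediction to all subsequent timesteps."""
    chained = start + np.cumsum(pred_residuals, axis=0)  # (T, 2) path points
    return np.mean(np.sum((chained - gt_points) ** 2, axis=1))

gt = np.array([[1.0, 0.1], [2.0, 0.3], [3.0, 0.6]])     # ground-truth points
pred = np.array([[1.0, 0.1], [1.0, 0.2], [1.0, 0.2]])   # per-step residuals
loss = residual_chain_loss(pred, gt, start=np.zeros(2))
```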
- Integrating DeepRL with Robust Low-Level Control in Robotic Manipulators for Non-Repetitive Reaching Tasks [0.24578723416255746]
In robotics, contemporary strategies are learning-based, characterized by a complex black-box nature and a lack of interpretability.
We propose integrating a collision-free trajectory planner based on deep reinforcement learning (DRL) with a novel auto-tuning low-level control strategy.
arXiv Detail & Related papers (2024-02-04T15:54:03Z)
- Partial End-to-end Reinforcement Learning for Robustness Against Modelling Error in Autonomous Racing [0.0]
This paper addresses the issue of increasing the performance of reinforcement learning (RL) solutions for autonomous racing cars.
We propose a partial end-to-end algorithm that decouples the planning and control tasks.
By leveraging the robustness of a classical controller, our partial end-to-end driving algorithm exhibits better robustness towards model mismatches than standard end-to-end algorithms.
arXiv Detail & Related papers (2023-12-11T14:27:10Z)
- Tackling Real-World Autonomous Driving using Deep Reinforcement Learning [63.3756530844707]
In this work, we propose a model-free Deep Reinforcement Learning Planner training a neural network that predicts acceleration and steering angle.
In order to deploy the system on board the real self-driving car, we also develop a module represented by a tiny neural network.
arXiv Detail & Related papers (2022-07-05T16:33:20Z)
- Imitation Learning for Robust and Safe Real-time Motion Planning: A Contraction Theory Approach [9.35511513240868]
LAG-ROS is a real-time robust motion planning algorithm for safety-critical nonlinear systems perturbed by bounded disturbances.
LAG-ROS achieves higher control performance and task success rate, with faster execution speed for real-time computation.
arXiv Detail & Related papers (2021-02-25T03:47:15Z)
- A Safe Hierarchical Planning Framework for Complex Driving Scenarios based on Reinforcement Learning [23.007323699176467]
We propose a hierarchical behavior planning framework with a set of low-level safe controllers and a high-level reinforcement learning algorithm (H-CtRL) as a coordinator for the low-level controllers.
Safety is guaranteed by the low-level optimization/sampling-based controllers, while the high-level reinforcement learning algorithm makes H-CtRL an adaptive and efficient behavior planner.
The proposed H-CtRL is proved to be effective in various realistic simulation scenarios, with satisfactory performance in terms of both safety and efficiency.
arXiv Detail & Related papers (2021-01-17T20:45:42Z)
- Reinforcement Learning for Low-Thrust Trajectory Design of Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z)
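The entry above only says an open-source PPO implementation is adopted; a minimal sketch of that pattern using stable-baselines3 (an example library choice, not confirmed by the paper) with a placeholder Gym environment standing in for the low-thrust transfer dynamics:

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Placeholder environment; a real setup would wrap the spacecraft dynamics
# and disturbance model as a Gym-style environment.
env = gym.make("Pendulum-v1")

model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=10_000)  # train the guidance-and-control policy

# The trained network then serves as a closed-loop guidance law: query an
# action from the current (possibly perturbed) state.
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
```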
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
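A standard building block behind chance-constrained trajectory optimization is the conversion of a probabilistic constraint under Gaussian uncertainty into a deterministic, tightened one. The sketch below shows only this generic step, not the paper's full method, which also integrates dynamics learning and feedback control:

```python
import numpy as np
from scipy.stats import norm

def gaussian_chance_constraint_ok(a, b, mean, cov, eps=0.05):
    """Check Pr(a^T x <= b) >= 1 - eps for Gaussian x ~ N(mean, cov) via the
    standard tightening: a^T mu + z_{1-eps} * sqrt(a^T Sigma a) <= b."""
    z = norm.ppf(1.0 - eps)
    return a @ mean + z * np.sqrt(a @ cov @ a) <= b

# Example: require the (uncertain) position to stay left of x = 5
# with 95% confidence.
a = np.array([1.0, 0.0])
b = 5.0
mean = np.array([3.0, 0.0])
cov = np.diag([0.25, 0.25])
ok = gaussian_chance_constraint_ok(a, b, mean, cov)
```

The z-score term inflates the constraint by the uncertainty along its normal direction, so satisfying the tightened inequality implies the original chance constraint holds.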