Multi-Timescale Hierarchical Reinforcement Learning for Unified Behavior and Control of Autonomous Driving
- URL: http://arxiv.org/abs/2506.23771v1
- Date: Mon, 30 Jun 2025 12:17:42 GMT
- Title: Multi-Timescale Hierarchical Reinforcement Learning for Unified Behavior and Control of Autonomous Driving
- Authors: Guizhe Jin, Zhuoren Li, Bo Leng, Ran Yu, Lu Xiong
- Abstract summary: We propose a multi-timescale hierarchical reinforcement learning approach for autonomous driving. High- and low-level RL policies are unified-trained to produce long-timescale motion guidance and short-timescale control commands. Our approach significantly improves AD performance, effectively increasing driving efficiency, action consistency and safety.
- Score: 4.750705843012836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning (RL) is increasingly used in autonomous driving (AD) and shows clear advantages. However, most RL-based AD methods overlook policy structure design. An RL policy that only outputs short-timescale vehicle control commands results in fluctuating driving behavior due to fluctuations in network outputs, while one that only outputs long-timescale driving goals cannot achieve unified optimality of driving behavior and control. Therefore, we propose a multi-timescale hierarchical reinforcement learning approach. Our approach adopts a hierarchical policy structure, where high- and low-level RL policies are unified-trained to produce long-timescale motion guidance and short-timescale control commands, respectively. Therein, motion guidance is explicitly represented by hybrid actions to capture multimodal driving behaviors on structured roads and support incremental low-level extend-state updates. Additionally, a hierarchical safety mechanism is designed to ensure multi-timescale safety. Evaluation in simulator-based and HighD dataset-based highway multi-lane scenarios demonstrates that our approach significantly improves AD performance, effectively increasing driving efficiency, action consistency and safety.
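The two-timescale structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the policies below are hand-written stand-ins for learned networks, and every name and number (`Guidance`, `high_level_policy`, `low_level_policy`, `rollout`, `high_level_period`, the gains and limits) is a hypothetical choice made for illustration. It only shows a high-level policy emitting a hybrid action (discrete lane choice plus continuous target speed) at a slow rate, while a low-level policy emits control commands at every step.

```python
from dataclasses import dataclass


@dataclass
class Guidance:
    """Hybrid action: a discrete lane choice plus a continuous parameter."""
    target_lane: int
    target_speed: float


def high_level_policy(state):
    # Hypothetical rule standing in for a learned high-level policy:
    # move one lane over if the gap ahead is short and a lane is available.
    lane = state["lane"]
    if state["gap_ahead"] < 20.0 and lane < 2:
        lane += 1
    return Guidance(target_lane=lane, target_speed=30.0)


def low_level_policy(state, guidance):
    # Hypothetical proportional rule standing in for a learned low-level policy.
    steer = 0.1 * (guidance.target_lane - state["lane"])
    accel = 0.5 * (guidance.target_speed - state["speed"])
    accel = max(-3.0, min(2.0, accel))  # assumed acceleration limits
    return steer, accel


def rollout(state, steps=20, high_level_period=10, dt=0.1):
    """Run both policies; the high level acts once every `high_level_period` steps."""
    log = []
    high_level_decisions = 0
    guidance = None
    for t in range(steps):
        if t % high_level_period == 0:
            guidance = high_level_policy(state)  # long-timescale decision
            high_level_decisions += 1
        steer, accel = low_level_policy(state, guidance)  # short-timescale command
        # Toy longitudinal update only; lane and gap are held fixed here.
        state = dict(state, speed=state["speed"] + accel * dt)
        log.append((t, guidance, steer, accel))
    return log, high_level_decisions


if __name__ == "__main__":
    start = {"lane": 0, "speed": 25.0, "gap_ahead": 15.0}
    log, n_high = rollout(start)
    print(f"{len(log)} control steps, {n_high} high-level decisions")
```

Holding the guidance fixed between high-level decisions is one simple way to keep the low-level commands consistent over a longer horizon; the paper's joint training, extend-state updates and hierarchical safety mechanism are not modeled in this sketch.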
Related papers
- Action Space Reduction Strategies for Reinforcement Learning in Autonomous Driving [0.0]
Reinforcement Learning (RL) offers a promising framework for autonomous driving. Large and high-dimensional action spaces often used to support fine-grained control can impede training efficiency and increase exploration costs. We introduce and evaluate two novel structured action space modification strategies for RL in autonomous driving.
arXiv Detail & Related papers (2025-07-07T17:58:08Z)
- TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning [61.33599727106222]
TeLL-Drive is a hybrid framework that integrates a Teacher LLM to guide an attention-based Student DRL policy. A self-attention mechanism then fuses these strategies with the DRL agent's exploration, accelerating policy convergence and boosting robustness.
arXiv Detail & Related papers (2025-02-03T14:22:03Z)
- Hybrid Action Based Reinforcement Learning for Multi-Objective Compatible Autonomous Driving [9.39122455540358]
Reinforcement Learning (RL) has shown excellent performance in solving decision-making and control problems of autonomous driving. Driving is a multi-attribute problem, leading to challenges in achieving multi-objective compatibility for current RL methods. We propose a Multi-objective Ensemble-Critic reinforcement learning method with Hybrid Parametrized Action for multi-objective compatible autonomous driving.
arXiv Detail & Related papers (2025-01-14T13:10:13Z)
- From Imitation to Exploration: End-to-end Autonomous Driving based on World Model [24.578178308010912]
RAMBLE is an end-to-end world model-based RL method for driving decision-making. It can handle complex and dynamic traffic scenarios. It achieves state-of-the-art performance in route completion rate on the CARLA Leaderboard 1.0 and completes all 38 scenarios on the CARLA Leaderboard 2.0.
arXiv Detail & Related papers (2024-10-03T06:45:59Z)
- AD-H: Autonomous Driving with Hierarchical Agents [64.49185157446297]
We propose to connect high-level instructions and low-level control signals with mid-level language-driven commands.
We implement this idea through a hierarchical multi-agent driving system named AD-H.
arXiv Detail & Related papers (2024-06-05T17:25:46Z)
- Generalizing Cooperative Eco-driving via Multi-residual Task Learning [6.864745785996583]
Multi-residual Task Learning (MRTL) is a generic learning framework based on multi-task learning.
MRTL decomposes control into nominal components that are effectively solved by conventional control methods and residual terms.
We employ MRTL for fleet-level emission reduction in mixed traffic using autonomous vehicles as a means of system control.
arXiv Detail & Related papers (2024-03-07T05:25:34Z)
- Empowering Autonomous Driving with Large Language Models: A Safety Perspective [82.90376711290808]
This paper explores the integration of Large Language Models (LLMs) into Autonomous Driving systems.
LLMs are intelligent decision-makers in behavioral planning, augmented with a safety verifier shield for contextual safety learning.
We present two key studies in a simulated environment: an adaptive LLM-conditioned Model Predictive Control (MPC) and an LLM-enabled interactive behavior planning scheme with a state machine.
arXiv Detail & Related papers (2023-11-28T03:13:09Z)
- Bi-Level Optimization Augmented with Conditional Variational Autoencoder for Autonomous Driving in Dense Traffic [0.9281671380673306]
This paper presents a parameterized bi-level optimization that jointly computes the optimal behavioural decisions and the resulting trajectory.
Our approach runs in real-time using a custom GPU-accelerated batch optimizer, and a Variational Autoencoder learnt warm-start strategy.
Our approach outperforms state-of-the-art model predictive control and RL approaches in terms of collision rate while being competitive in driving efficiency.
arXiv Detail & Related papers (2022-12-05T12:56:42Z)
- Unified Automatic Control of Vehicular Systems with Reinforcement Learning [64.63619662693068]
This article contributes a streamlined methodology for vehicular microsimulation.
It discovers high performance control strategies with minimal manual design.
The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering.
arXiv Detail & Related papers (2022-07-30T16:23:45Z)
- Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.