Autonomous Platoon Control with Integrated Deep Reinforcement Learning
and Dynamic Programming
- URL: http://arxiv.org/abs/2206.07536v1
- Date: Wed, 15 Jun 2022 13:45:47 GMT
- Title: Autonomous Platoon Control with Integrated Deep Reinforcement Learning
and Dynamic Programming
- Authors: Tong Liu, Lei Lei, Kan Zheng, Kuan Zhang
- Abstract summary: It is more challenging to learn a stable and efficient car-following policy when there are multiple following vehicles in a platoon.
We adopt an integrated DRL and Dynamic Programming approach to learn autonomous platoon control policies.
We propose an algorithm, namely Finite-Horizon-DDPG with Sweeping through reduced state space using Stationary approximation (FH-DDPG-SS).
- Score: 12.661547303266252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning (DRL) is regarded as a potential method for
car-following control and has been mostly studied to support a single following
vehicle. However, it is more challenging to learn a stable and efficient
car-following policy when there are multiple following vehicles in a platoon,
especially with unpredictable leading vehicle behavior. In this context, we
adopt an integrated DRL and Dynamic Programming (DP) approach to learn
autonomous platoon control policies, which embeds the Deep Deterministic Policy
Gradient (DDPG) algorithm into a finite-horizon value iteration framework.
Although the DP framework can improve the stability and performance of DDPG, it
suffers from lower sampling and training efficiency. In this paper,
we propose an algorithm, namely Finite-Horizon-DDPG with Sweeping through
reduced state space using Stationary approximation (FH-DDPG-SS), which uses
three key ideas to overcome the above limitations, i.e., transferring network
weights backward in time, stationary policy approximation for earlier time
steps, and sweeping through reduced state space. In order to verify the
effectiveness of FH-DDPG-SS, simulation using real driving data is performed,
where the performance of FH-DDPG-SS is compared with those of the benchmark
algorithms. Finally, platoon safety and string stability for FH-DDPG-SS are
demonstrated.
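As a rough illustration of the training scheme the abstract describes, the sketch below wires per-step DDPG-style training into a backward finite-horizon loop with the three ideas named above. Every name here (StepNetworks, train_ddpg_step, stationary_from) is hypothetical and only stands in for the authors' implementation; the "training" is a placeholder.

```python
# Structural sketch only: per-step actor/critic pairs trained backward in time,
# with backward weight transfer, a stationary policy for early steps, and a
# reduced-state-space sweep flag. Not the paper's code.
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StepNetworks:
    """Toy stand-in for the actor/critic parameters of one time step."""
    actor: dict = field(default_factory=lambda: {"w": 0.0})
    critic: dict = field(default_factory=lambda: {"w": 0.0})


def train_ddpg_step(nets: StepNetworks, next_nets: Optional[StepNetworks],
                    reduced_state_space: bool) -> StepNetworks:
    """Placeholder for one-step DDPG training; in the real scheme the next
    step's critic acts as the terminal value for the current step."""
    nets.actor["w"] += 0.1    # pretend gradient update
    nets.critic["w"] += 0.1
    return nets


def fh_ddpg_ss(horizon: int, stationary_from: int) -> List[StepNetworks]:
    nets = [StepNetworks() for _ in range(horizon)]
    # Finite-horizon value iteration: train the final step first, move backward.
    for k in reversed(range(stationary_from, horizon)):
        if k < horizon - 1:
            # Idea 1: transfer the next step's weights backward as initialization.
            nets[k] = copy.deepcopy(nets[k + 1])
        # Idea 3: sample/sweep only a reduced state space while training this step.
        next_nets = nets[k + 1] if k < horizon - 1 else None
        nets[k] = train_ddpg_step(nets[k], next_nets, reduced_state_space=True)
    # Idea 2: approximate all earlier steps with a single stationary policy.
    for k in range(stationary_from):
        nets[k] = copy.deepcopy(nets[stationary_from])
    return nets


if __name__ == "__main__":
    policies = fh_ddpg_ss(horizon=10, stationary_from=4)
    print(len(policies), policies[0].actor)
```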
Related papers
- Tangled Program Graphs as an alternative to DRL-based control algorithms for UAVs [0.43695508295565777]
Deep reinforcement learning (DRL) is currently the most popular AI-based approach to autonomous vehicle control.
This approach has some significant drawbacks: high computational requirements and low explainability.
We propose to use Tangled Program Graphs (TPGs) as an alternative to DRL in control-related tasks.
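For readers unfamiliar with TPGs, the snippet below sketches only their basic decision step, assuming the standard formulation in which each learner in a team bids on the observed state and the highest bidder's action is taken; real TPGs chain teams into graphs and evolve the programs, none of which is shown here.

```python
# Minimal, illustrative TPG-style bidding step (not the authors' code).
import numpy as np

rng = np.random.default_rng(0)


class Learner:
    def __init__(self, n_features, action):
        self.coeffs = rng.normal(size=n_features)  # evolved program, here just linear
        self.action = action

    def bid(self, state):
        return float(self.coeffs @ state)


team = [Learner(4, a) for a in ("accelerate", "brake", "steer_left", "steer_right")]


def act(state):
    # The learner with the highest bid decides the action.
    return max(team, key=lambda lrn: lrn.bid(state)).action


print(act(np.array([0.2, -0.1, 0.5, 0.0])))
```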
arXiv Detail & Related papers (2024-11-08T14:20:29Z) - Autonomous Navigation of Unmanned Vehicle Through Deep Reinforcement Learning [1.3725832537448668]
The paper details the model of an Ackermann robot and the structure and application of the DDPG algorithm.
The results demonstrate that the DDPG algorithm outperforms traditional Deep Q-Network (DQN) and Double Deep Q-Network (DDQN) algorithms in path planning tasks.
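For context on the vehicle model mentioned above, here is a minimal kinematic bicycle approximation of Ackermann steering; the wheelbase and time step are arbitrary illustrative values, not taken from the paper.

```python
# Kinematic bicycle model: a common simplification of an Ackermann-steered robot.
import math


def ackermann_step(x, y, theta, v, steering, wheelbase=0.5, dt=0.05):
    """Advance the pose (x, y, heading) by one time step."""
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += (v / wheelbase) * math.tan(steering) * dt
    return x, y, theta


# Example: drive forward with a constant steering angle.
pose = (0.0, 0.0, 0.0)
for _ in range(100):
    pose = ackermann_step(*pose, v=1.0, steering=0.2)
print(pose)
```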
arXiv Detail & Related papers (2024-07-18T05:18:59Z) - Vehicles Control: Collision Avoidance using Federated Deep Reinforcement
Learning [3.8078589880662754]
This paper presents a comprehensive study on vehicle control for collision avoidance using Federated Deep Reinforcement Learning techniques.
Our main goal is to minimize travel delays and enhance the average speed of vehicles while prioritizing safety and preserving data privacy.
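The summary gives no implementation details, but federated training of per-vehicle policies typically relies on parameter averaging; the sketch below shows a plain FedAvg step over toy weight dictionaries as one possible reading, not the paper's protocol.

```python
# Federated averaging of per-vehicle policy weights (illustrative shapes/names).
import numpy as np


def federated_average(local_weights):
    """Element-wise average of a list of parameter dictionaries (FedAvg)."""
    keys = local_weights[0].keys()
    return {k: np.mean([w[k] for w in local_weights], axis=0) for k in keys}


# Three vehicles, each with a toy 2x2 policy matrix trained locally.
local_models = [{"policy": np.random.randn(2, 2)} for _ in range(3)]
global_weights = federated_average(local_models)
print(global_weights["policy"])
```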
arXiv Detail & Related papers (2023-08-04T14:26:19Z) - Training Efficient Controllers via Analytic Policy Gradient [44.0762454494769]
Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately.
Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power.
We propose an Analytic Policy Gradient (APG) method to tackle this problem.
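As a hedged illustration of the analytic-policy-gradient idea, differentiating a tracking loss through known dynamics into the policy parameters, the toy example below trains a linear policy on a point mass with PyTorch autograd; it is not the authors' controller, dynamics model, or training setup.

```python
# Toy APG-style training: backpropagate a tracking loss through a differentiable
# point-mass rollout into a linear policy.
import torch

policy = torch.nn.Linear(2, 1)                 # maps [position, velocity] -> force
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
dt, target = 0.1, torch.tensor([1.0])          # track position = 1.0

for _ in range(200):
    pos, vel = torch.zeros(1), torch.zeros(1)
    loss = torch.zeros(1)
    for _ in range(20):                        # differentiable rollout
        force = policy(torch.cat([pos, vel]))
        vel = vel + force * dt                 # point-mass dynamics
        pos = pos + vel * dt
        loss = loss + (pos - target).pow(2).sum()
    opt.zero_grad()
    loss.backward()                            # analytic gradient through the rollout
    opt.step()

print(float(pos), float(loss))
```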
arXiv Detail & Related papers (2022-09-26T22:04:35Z) - Dealing with Sparse Rewards in Continuous Control Robotics via
Heavy-Tailed Policies [64.2210390071609]
We present a novel Heavy-Tailed Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems.
We show consistent performance improvement across all tasks in terms of high average cumulative reward.
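One simple way to picture the heavy-tailed exploration idea is to swap a Gaussian action distribution for a Cauchy one, so that occasional large actions help escape flat, sparse-reward regions; the snippet below only contrasts the two samplers and does not reproduce the HT-PSG algorithm.

```python
# Gaussian vs. heavy-tailed (Cauchy) action sampling around the same mean/scale.
import numpy as np

rng = np.random.default_rng(0)


def gaussian_action(mean, scale):
    return mean + scale * rng.standard_normal()


def heavy_tailed_action(mean, scale):
    # Cauchy samples have unbounded variance, so large exploratory actions
    # appear far more often than under a Gaussian of the same scale.
    return mean + scale * rng.standard_cauchy()


gauss = [abs(gaussian_action(0.0, 0.1)) for _ in range(10000)]
heavy = [abs(heavy_tailed_action(0.0, 0.1)) for _ in range(10000)]
print("max |action|  gaussian:", max(gauss), " cauchy:", max(heavy))
```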
arXiv Detail & Related papers (2022-06-12T04:09:39Z) - Hybrid Car-Following Strategy based on Deep Deterministic Policy
Gradient and Cooperative Adaptive Cruise Control [7.016756906859412]
A hybrid car-following strategy based on deep deterministic policy gradient (DDPG) and cooperative adaptive cruise control (CACC) is proposed.
The proposed strategy guarantees the basic performance of car-following through CACC, while also making full use of DDPG's advantage in exploring complex environments.
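The exact switching logic is not given in the summary, so the sketch below shows one plausible hybrid: a linear CACC-style feedback law provides the baseline acceleration, and the learned DDPG action is used only when it stays close to that baseline. The gains and trust region are illustrative assumptions, not values from the paper.

```python
# One possible CACC + DDPG blending rule (illustrative gains and threshold).
def cacc_accel(spacing_err, rel_speed, kp=0.45, kd=0.25):
    """Linear CACC-style feedback on spacing error and relative speed."""
    return kp * spacing_err + kd * rel_speed


def hybrid_accel(spacing_err, rel_speed, ddpg_accel, trust_region=1.0):
    base = cacc_accel(spacing_err, rel_speed)
    # Fall back to pure CACC whenever the learned action strays too far from it.
    if abs(ddpg_accel - base) > trust_region:
        return base
    return ddpg_accel


print(hybrid_accel(spacing_err=2.0, rel_speed=-0.5, ddpg_accel=0.6))
```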
arXiv Detail & Related papers (2021-02-24T17:37:47Z) - Modular Deep Reinforcement Learning for Continuous Motion Planning with
Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDPs).
The novelty is to design an embedded product MDP (EP-MDP) between a limit-deterministic generalized Büchi automaton (LDGBA) and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for the model-free reinforcement learning (RL) only depend on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (DPDP) to combine the strengths of learned neural heuristics with those of dynamic programming algorithms.
DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
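A toy rendering of the DPDP idea follows: a beam-limited dynamic program over partial TSP tours that only expands the edges a neural heat-map scores highly. Here the heat-map is faked with random values, and the beam size and TOP_K cutoff are illustrative, not the paper's settings.

```python
# Policy-restricted dynamic programming over partial TSP tours (toy version).
import numpy as np

rng = np.random.default_rng(1)
n = 7
dist = rng.uniform(1, 10, size=(n, n))
np.fill_diagonal(dist, 0)
heat = rng.uniform(size=(n, n))   # stands in for learned edge scores
TOP_K = 3                          # restriction: expand only top-scoring edges


def restricted_tsp(start=0, beam=50):
    # Each DP state: (visited set, last city, cost); keep the `beam` cheapest.
    states = [({start}, start, 0.0)]
    for _ in range(n - 1):
        expanded = []
        for visited, last, cost in states:
            cand = [j for j in range(n) if j not in visited]
            cand.sort(key=lambda j: -heat[last, j])   # policy-guided restriction
            for j in cand[:TOP_K]:
                expanded.append((visited | {j}, j, cost + dist[last, j]))
        states = sorted(expanded, key=lambda s: s[2])[:beam]
    return min(cost + dist[last, start] for _, last, cost in states)


print(restricted_tsp())
```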
arXiv Detail & Related papers (2021-02-23T15:33:57Z) - Online Reinforcement Learning Control by Direct Heuristic Dynamic
Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
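To make the event-driven idea concrete, the sketch below gates a stand-in weight update behind a simple state-change threshold, so noise-level fluctuations do not trigger learning; the trigger rule and update are illustrative, not the dHDP equations.

```python
# Event-triggered (rather than time-driven) parameter updates.
import numpy as np


def should_update(prev_state, state, threshold=0.1):
    """Simple event trigger: learn only on a sufficiently large state change."""
    return np.linalg.norm(state - prev_state) > threshold


rng = np.random.default_rng(2)
weights, prev = np.zeros(3), np.zeros(3)
updates = 0
for _ in range(1000):
    state = prev + rng.normal(scale=0.03, size=3)  # mostly small, noise-like changes
    if should_update(prev, state):
        weights += 0.01 * state                    # stand-in for a dHDP weight update
        updates += 1
    prev = state

print(updates, weights)
```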
arXiv Detail & Related papers (2020-06-16T05:51:25Z) - Optimization-driven Deep Reinforcement Learning for Robust Beamforming
in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
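As a much-simplified, single-antenna special case of the passive-beamforming problem (not the joint active/passive optimization or the DRL method in the paper), the snippet below aligns each reflected path's phase with the direct path, which maximizes the effective channel gain in this toy setting.

```python
# Closed-form IRS phase alignment for a toy single-antenna link.
import numpy as np

rng = np.random.default_rng(3)
n_elements = 16
h_d = rng.normal() + 1j * rng.normal()                                   # direct AP-to-receiver channel
h_r = rng.normal(size=n_elements) + 1j * rng.normal(size=n_elements)     # IRS-to-receiver channels
g = rng.normal(size=n_elements) + 1j * rng.normal(size=n_elements)       # AP-to-IRS channels

# Rotate every reflected path so it adds in phase with the direct path.
phases = np.angle(h_d) - np.angle(h_r * g)
effective = h_d + np.sum(h_r * g * np.exp(1j * phases))

print(abs(h_d), abs(effective))   # the aligned effective channel gain is larger
```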
arXiv Detail & Related papers (2020-05-25T01:42:55Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.