Autonomous Platoon Control with Integrated Deep Reinforcement Learning
and Dynamic Programming
- URL: http://arxiv.org/abs/2206.07536v1
- Date: Wed, 15 Jun 2022 13:45:47 GMT
- Title: Autonomous Platoon Control with Integrated Deep Reinforcement Learning
and Dynamic Programming
- Authors: Tong Liu, Lei Lei, Kan Zheng, Kuan Zhang
- Abstract summary: It is more challenging to learn a stable and efficient car-following policy when there are multiple following vehicles in a platoon.
We adopt an integrated DRL and Dynamic Programming approach to learn autonomous platoon control policies.
We propose an algorithm, namely Finite-Horizon-DDPG with Sweeping through reduced state space using Stationary approximation (FH-DDPG-SS).
- Score: 12.661547303266252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning (DRL) is regarded as a potential method for
car-following control and has been mostly studied to support a single following
vehicle. However, it is more challenging to learn a stable and efficient
car-following policy when there are multiple following vehicles in a platoon,
especially with unpredictable leading vehicle behavior. In this context, we
adopt an integrated DRL and Dynamic Programming (DP) approach to learn
autonomous platoon control policies, which embeds the Deep Deterministic Policy
Gradient (DDPG) algorithm into a finite-horizon value iteration framework.
Although the DP framework can improve the stability and performance of DDPG, it
suffers from lower sampling and training efficiency. In this paper,
we propose an algorithm, namely Finite-Horizon-DDPG with Sweeping through
reduced state space using Stationary approximation (FH-DDPG-SS), which uses
three key ideas to overcome the above limitations, i.e., transferring network
weights backward in time, stationary policy approximation for earlier time
steps, and sweeping through reduced state space. In order to verify the
effectiveness of FH-DDPG-SS, simulation using real driving data is performed,
where the performance of FH-DDPG-SS is compared with those of the benchmark
algorithms. Finally, platoon safety and string stability for FH-DDPG-SS are
demonstrated.
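As a rough illustration of the training scheme the abstract describes, the sketch below wires per-step DDPG-style training into a backward finite-horizon loop with the three ideas named above. Every name here (StepNetworks, train_ddpg_step, stationary_from) is hypothetical and only stands in for the authors' implementation; the "training" is a placeholder.

```python
# Structural sketch only: per-step actor/critic pairs trained backward in time,
# with backward weight transfer, a stationary policy for early steps, and a
# reduced-state-space sweep flag. Not the paper's code.
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StepNetworks:
    """Toy stand-in for the actor/critic parameters of one time step."""
    actor: dict = field(default_factory=lambda: {"w": 0.0})
    critic: dict = field(default_factory=lambda: {"w": 0.0})


def train_ddpg_step(nets: StepNetworks, next_nets: Optional[StepNetworks],
                    reduced_state_space: bool) -> StepNetworks:
    """Placeholder for one-step DDPG training; in the real scheme the next
    step's critic acts as the terminal value for the current step."""
    nets.actor["w"] += 0.1    # pretend gradient update
    nets.critic["w"] += 0.1
    return nets


def fh_ddpg_ss(horizon: int, stationary_from: int) -> List[StepNetworks]:
    nets = [StepNetworks() for _ in range(horizon)]
    # Finite-horizon value iteration: train the final step first, move backward.
    for k in reversed(range(stationary_from, horizon)):
        if k < horizon - 1:
            # Idea 1: transfer the next step's weights backward as initialization.
            nets[k] = copy.deepcopy(nets[k + 1])
        # Idea 3: sample/sweep only a reduced state space while training this step.
        next_nets = nets[k + 1] if k < horizon - 1 else None
        nets[k] = train_ddpg_step(nets[k], next_nets, reduced_state_space=True)
    # Idea 2: approximate all earlier steps with a single stationary policy.
    for k in range(stationary_from):
        nets[k] = copy.deepcopy(nets[stationary_from])
    return nets


if __name__ == "__main__":
    policies = fh_ddpg_ss(horizon=10, stationary_from=4)
    print(len(policies), policies[0].actor)
```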
Related papers
- Tangled Program Graphs as an alternative to DRL-based control algorithms for UAVs [0.43695508295565777]
Deep reinforcement learning (DRL) is currently the most popular AI-based approach to autonomous vehicle control.
This approach has some significant drawbacks: high computational requirements and low explainability.
We propose to use Tangled Program Graphs (TPGs) as an alternative to DRL in control-related tasks.
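For readers unfamiliar with TPGs, the snippet below sketches only their basic decision step, assuming the standard formulation in which each learner in a team bids on the observed state and the highest bidder's action is taken; real TPGs chain teams into graphs and evolve the programs, none of which is shown here.

```python
# Minimal, illustrative TPG-style bidding step (not the authors' code).
import numpy as np

rng = np.random.default_rng(0)


class Learner:
    def __init__(self, n_features, action):
        self.coeffs = rng.normal(size=n_features)  # evolved program, here just linear
        self.action = action

    def bid(self, state):
        return float(self.coeffs @ state)


team = [Learner(4, a) for a in ("accelerate", "brake", "steer_left", "steer_right")]


def act(state):
    # The learner with the highest bid decides the action.
    return max(team, key=lambda lrn: lrn.bid(state)).action


print(act(np.array([0.2, -0.1, 0.5, 0.0])))
```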
arXiv Detail & Related papers (2024-11-08T14:20:29Z) - Autonomous Navigation of Unmanned Vehicle Through Deep Reinforcement Learning [1.3725832537448668]
The paper details the model of an Ackermann robot and the structure and application of the DDPG algorithm.
The results demonstrate that the DDPG algorithm outperforms traditional Deep Q-Network (DQN) and Double Deep Q-Network (DDQN) algorithms in path planning tasks.
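For context on the vehicle model mentioned above, here is a minimal kinematic bicycle approximation of Ackermann steering; the wheelbase and time step are arbitrary illustrative values, not taken from the paper.

```python
# Kinematic bicycle model: a common simplification of an Ackermann-steered robot.
import math


def ackermann_step(x, y, theta, v, steering, wheelbase=0.5, dt=0.05):
    """Advance the pose (x, y, heading) by one time step."""
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += (v / wheelbase) * math.tan(steering) * dt
    return x, y, theta


# Example: drive forward with a constant steering angle.
pose = (0.0, 0.0, 0.0)
for _ in range(100):
    pose = ackermann_step(*pose, v=1.0, steering=0.2)
print(pose)
```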
arXiv Detail & Related papers (2024-07-18T05:18:59Z) - Vehicles Control: Collision Avoidance using Federated Deep Reinforcement
Learning [3.8078589880662754]
This paper presents a comprehensive study on vehicle control for collision avoidance using Federated Deep Reinforcement Learning techniques.
Our main goal is to minimize travel delays and enhance the average speed of vehicles while prioritizing safety and preserving data privacy.
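The summary gives no implementation details, but federated training of per-vehicle policies typically relies on parameter averaging; the sketch below shows a plain FedAvg step over toy weight dictionaries as one possible reading, not the paper's protocol.

```python
# Federated averaging of per-vehicle policy weights (illustrative shapes/names).
import numpy as np


def federated_average(local_weights):
    """Element-wise average of a list of parameter dictionaries (FedAvg)."""
    keys = local_weights[0].keys()
    return {k: np.mean([w[k] for w in local_weights], axis=0) for k in keys}


# Three vehicles, each with a toy 2x2 policy matrix trained locally.
local_models = [{"policy": np.random.randn(2, 2)} for _ in range(3)]
global_weights = federated_average(local_models)
print(global_weights["policy"])
```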
arXiv Detail & Related papers (2023-08-04T14:26:19Z) - Training Efficient Controllers via Analytic Policy Gradient [44.0762454494769]
Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately.
Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power.
We propose an Analytic Policy Gradient (APG) method to tackle this problem.
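As a hedged illustration of the analytic-policy-gradient idea, differentiating a tracking loss through known dynamics into the policy parameters, the toy example below trains a linear policy on a point mass with PyTorch autograd; it is not the authors' controller, dynamics model, or training setup.

```python
# Toy APG-style training: backpropagate a tracking loss through a differentiable
# point-mass rollout into a linear policy.
import torch

policy = torch.nn.Linear(2, 1)                 # maps [position, velocity] -> force
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
dt, target = 0.1, torch.tensor([1.0])          # track position = 1.0

for _ in range(200):
    pos, vel = torch.zeros(1), torch.zeros(1)
    loss = torch.zeros(1)
    for _ in range(20):                        # differentiable rollout
        force = policy(torch.cat([pos, vel]))
        vel = vel + force * dt                 # point-mass dynamics
        pos = pos + vel * dt
        loss = loss + (pos - target).pow(2).sum()
    opt.zero_grad()
    loss.backward()                            # analytic gradient through the rollout
    opt.step()

print(float(pos), float(loss))
```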
arXiv Detail & Related papers (2022-09-26T22:04:35Z) - Dealing with Sparse Rewards in Continuous Control Robotics via
Heavy-Tailed Policies [64.2210390071609]
We present a novel Heavy-Tailed Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems.
We show consistent performance improvement across all tasks in terms of high average cumulative reward.
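One simple way to picture the heavy-tailed exploration idea is to swap a Gaussian action distribution for a Cauchy one, so that occasional large actions help escape flat, sparse-reward regions; the snippet below only contrasts the two samplers and does not reproduce the HT-PSG algorithm.

```python
# Gaussian vs. heavy-tailed (Cauchy) action sampling around the same mean/scale.
import numpy as np

rng = np.random.default_rng(0)


def gaussian_action(mean, scale):
    return mean + scale * rng.standard_normal()


def heavy_tailed_action(mean, scale):
    # Cauchy samples have unbounded variance, so large exploratory actions
    # appear far more often than under a Gaussian of the same scale.
    return mean + scale * rng.standard_cauchy()


gauss = [abs(gaussian_action(0.0, 0.1)) for _ in range(10000)]
heavy = [abs(heavy_tailed_action(0.0, 0.1)) for _ in range(10000)]
print("max |action|  gaussian:", max(gauss), " cauchy:", max(heavy))
```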
arXiv Detail & Related papers (2022-06-12T04:09:39Z) - Hybrid Car-Following Strategy based on Deep Deterministic Policy
Gradient and Cooperative Adaptive Cruise Control [7.016756906859412]
A hybrid car-following strategy based on deep deterministic policy gradient (DDPG) and cooperative adaptive cruise control (CACC) is proposed.
The proposed strategy guarantees the basic performance of car-following through CACC, while also making full use of DDPG's advantage in exploring complex environments.
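The exact switching logic is not given in the summary, so the sketch below shows one plausible hybrid: a linear CACC-style feedback law provides the baseline acceleration, and the learned DDPG action is used only when it stays close to that baseline. The gains and trust region are illustrative assumptions, not values from the paper.

```python
# One possible CACC + DDPG blending rule (illustrative gains and threshold).
def cacc_accel(spacing_err, rel_speed, kp=0.45, kd=0.25):
    """Linear CACC-style feedback on spacing error and relative speed."""
    return kp * spacing_err + kd * rel_speed


def hybrid_accel(spacing_err, rel_speed, ddpg_accel, trust_region=1.0):
    base = cacc_accel(spacing_err, rel_speed)
    # Fall back to pure CACC whenever the learned action strays too far from it.
    if abs(ddpg_accel - base) > trust_region:
        return base
    return ddpg_accel


print(hybrid_accel(spacing_err=2.0, rel_speed=-0.5, ddpg_accel=0.6))
```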
arXiv Detail & Related papers (2021-02-24T17:37:47Z) - Modular Deep Reinforcement Learning for Continuous Motion Planning with
Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDPs).
The novelty is to design an embedded product MDP (EP-MDP) between a limit-deterministic generalized Büchi automaton (LDGBA) and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for the model-free reinforcement learning (RL) only depend on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (DPDP) to combine the strengths of learned neural heuristics with those of dynamic programming algorithms.
DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
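A toy rendering of the DPDP idea follows: a beam-limited dynamic program over partial TSP tours that only expands the edges a neural heat-map scores highly. Here the heat-map is faked with random values, and the beam size and TOP_K cutoff are illustrative, not the paper's settings.

```python
# Policy-restricted dynamic programming over partial TSP tours (toy version).
import numpy as np

rng = np.random.default_rng(1)
n = 7
dist = rng.uniform(1, 10, size=(n, n))
np.fill_diagonal(dist, 0)
heat = rng.uniform(size=(n, n))   # stands in for learned edge scores
TOP_K = 3                          # restriction: expand only top-scoring edges


def restricted_tsp(start=0, beam=50):
    # Each DP state: (visited set, last city, cost); keep the `beam` cheapest.
    states = [({start}, start, 0.0)]
    for _ in range(n - 1):
        expanded = []
        for visited, last, cost in states:
            cand = [j for j in range(n) if j not in visited]
            cand.sort(key=lambda j: -heat[last, j])   # policy-guided restriction
            for j in cand[:TOP_K]:
                expanded.append((visited | {j}, j, cost + dist[last, j]))
        states = sorted(expanded, key=lambda s: s[2])[:beam]
    return min(cost + dist[last, start] for _, last, cost in states)


print(restricted_tsp())
```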
arXiv Detail & Related papers (2021-02-23T15:33:57Z) - Online Reinforcement Learning Control by Direct Heuristic Dynamic
Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
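To make the event-driven idea concrete, the sketch below gates a stand-in weight update behind a simple state-change threshold, so noise-level fluctuations do not trigger learning; the trigger rule and update are illustrative, not the dHDP equations.

```python
# Event-triggered (rather than time-driven) parameter updates.
import numpy as np


def should_update(prev_state, state, threshold=0.1):
    """Simple event trigger: learn only on a sufficiently large state change."""
    return np.linalg.norm(state - prev_state) > threshold


rng = np.random.default_rng(2)
weights, prev = np.zeros(3), np.zeros(3)
updates = 0
for _ in range(1000):
    state = prev + rng.normal(scale=0.03, size=3)  # mostly small, noise-like changes
    if should_update(prev, state):
        weights += 0.01 * state                    # stand-in for a dHDP weight update
        updates += 1
    prev = state

print(updates, weights)
```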
arXiv Detail & Related papers (2020-06-16T05:51:25Z) - Optimization-driven Deep Reinforcement Learning for Robust Beamforming
in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
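As a much-simplified, single-antenna special case of the passive-beamforming problem (not the joint active/passive optimization or the DRL method in the paper), the snippet below aligns each reflected path's phase with the direct path, which maximizes the effective channel gain in this toy setting.

```python
# Closed-form IRS phase alignment for a toy single-antenna link.
import numpy as np

rng = np.random.default_rng(3)
n_elements = 16
h_d = rng.normal() + 1j * rng.normal()                                   # direct AP-to-receiver channel
h_r = rng.normal(size=n_elements) + 1j * rng.normal(size=n_elements)     # IRS-to-receiver channels
g = rng.normal(size=n_elements) + 1j * rng.normal(size=n_elements)       # AP-to-IRS channels

# Rotate every reflected path so it adds in phase with the direct path.
phases = np.angle(h_d) - np.angle(h_r * g)
effective = h_d + np.sum(h_r * g * np.exp(1j * phases))

print(abs(h_d), abs(effective))   # the aligned effective channel gain is larger
```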
arXiv Detail & Related papers (2020-05-25T01:42:55Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.