Hybrid Car-Following Strategy based on Deep Deterministic Policy
Gradient and Cooperative Adaptive Cruise Control
- URL: http://arxiv.org/abs/2103.03796v1
- Date: Wed, 24 Feb 2021 17:37:47 GMT
- Title: Hybrid Car-Following Strategy based on Deep Deterministic Policy
Gradient and Cooperative Adaptive Cruise Control
- Authors: Ruidong Yan, Rui Jiang, Bin Jia, Diange Yang, and Jin Huang
- Abstract summary: A hybrid car-following strategy based on deep deterministic policy gradient (DDPG) and cooperative adaptive cruise control (CACC) is proposed.
The proposed strategy not only guarantees the basic performance of car-following through CACC, but also makes full use of DDPG's ability to explore complex environments.
- Score: 7.016756906859412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A deep deterministic policy gradient (DDPG) based car-following
strategy can break through the constraints of differential-equation models
thanks to its ability to explore complex environments. However, the
car-following performance of DDPG is usually degraded by unreasonable reward
function design, insufficient training, and low sampling efficiency. To address
these problems, a hybrid car-following strategy based on DDPG and cooperative
adaptive cruise control (CACC) is proposed. First, the car-following process is
modeled as a Markov decision process so that CACC and DDPG can be evaluated
simultaneously at each frame. Given the current state, two actions are obtained
from CACC and DDPG, respectively; the action offering the larger reward is then
chosen as the output of the hybrid strategy. Meanwhile, a rule is designed to
ensure that the change rate of acceleration stays below a desired value.
Therefore, the proposed strategy not only guarantees the basic performance of
car-following through CACC, but also makes full use of DDPG's ability to
explore complex environments. Finally, simulation results show that the
car-following performance of the proposed strategy is significantly improved
compared with that of DDPG and CACC over the whole state space.
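To make the per-frame selection and the acceleration-rate rule concrete, here is a minimal Python sketch. The `cacc_action`, `ddpg_action`, and `reward` callables, the control interval, and the jerk bound are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

DT = 0.1         # control interval in seconds (assumed)
MAX_JERK = 2.0   # desired bound on the change rate of acceleration, m/s^3 (assumed)

def hybrid_action(state, prev_accel, cacc_action, ddpg_action, reward):
    """One frame of the hybrid strategy: evaluate both controllers,
    keep the action with the larger reward, then bound the jerk."""
    a_cacc = cacc_action(state)   # acceleration proposed by CACC
    a_ddpg = ddpg_action(state)   # acceleration proposed by the DDPG actor

    # Choose the action whose one-step reward is larger.
    a = a_cacc if reward(state, a_cacc) >= reward(state, a_ddpg) else a_ddpg

    # Rule: keep the change rate of acceleration below the desired value.
    max_step = MAX_JERK * DT
    return float(np.clip(a, prev_accel - max_step, prev_accel + max_step))
```

Under this reading, CACC acts as a performance floor while DDPG remains free to exploit states it has explored well; the final clipping step enforces the acceleration-rate rule.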
Related papers
- Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning [62.81324245896717]
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-iterate convergence guarantees under (weak) gradient domination assumptions.
We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
arXiv Detail & Related papers (2024-07-15T14:54:57Z)
- Adaptive Kalman-based hybrid car following strategy using TD3 and CACC [5.052960220478617]
In autonomous driving, the hybrid strategy of deep reinforcement learning and cooperative adaptive cruise control (CACC) can significantly improve the performance of car following.
It is challenging for the traditional hybrid strategy based on fixed coefficients to adapt to mixed traffic flow scenarios.
A hybrid car-following strategy based on an adaptive Kalman filter is proposed, combining the CACC and Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithms.
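As a rough illustration only, an adaptive filter could blend the two controllers by inverse-variance (Kalman-gain-like) weighting; the interfaces and the variance-update rule below are assumptions of this sketch, not the paper's algorithm.

```python
class AdaptiveActionFusion:
    """Toy one-dimensional Kalman-style fusion of two acceleration proposals
    (hypothetical sketch; variances are adapted online from residuals)."""

    def __init__(self):
        self.var_cacc = 1.0  # running error variance of the CACC proposal
        self.var_td3 = 1.0   # running error variance of the TD3 proposal

    def fuse(self, a_cacc, a_td3):
        # Kalman-gain-like weight: trust the proposal with lower variance more.
        k = self.var_td3 / (self.var_cacc + self.var_td3)
        return k * a_cacc + (1.0 - k) * a_td3

    def update(self, a_cacc, a_td3, a_target, alpha=0.05):
        # Adapt the variances from squared residuals against a target action.
        self.var_cacc = (1 - alpha) * self.var_cacc + alpha * (a_cacc - a_target) ** 2
        self.var_td3 = (1 - alpha) * self.var_td3 + alpha * (a_td3 - a_target) ** 2
```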
arXiv Detail & Related papers (2023-12-26T10:51:46Z)
- Vehicles Control: Collision Avoidance using Federated Deep Reinforcement Learning [3.8078589880662754]
This paper presents a comprehensive study on vehicle control for collision avoidance using Federated Deep Reinforcement Learning techniques.
Our main goal is to minimize travel delays and enhance the average speed of vehicles while prioritizing safety and preserving data privacy.
arXiv Detail & Related papers (2023-08-04T14:26:19Z)
- Bi-Level Optimization Augmented with Conditional Variational Autoencoder for Autonomous Driving in Dense Traffic [0.9281671380673306]
This paper presents a parameterized bi-level optimization that jointly computes the optimal behavioural decisions and the resulting trajectory.
Our approach runs in real time using a custom GPU-accelerated batch optimizer and a Variational Autoencoder-learned warm-start strategy.
Our approach outperforms state-of-the-art model predictive control and RL approaches in terms of collision rate while being competitive in driving efficiency.
arXiv Detail & Related papers (2022-12-05T12:56:42Z)
- Integrated Decision and Control for High-Level Automated Vehicles by Mixed Policy Gradient and Its Experiment Verification [10.393343763237452]
This paper presents a self-evolving decision-making system based on the Integrated Decision and Control (IDC) framework.
An RL algorithm called constrained mixed policy gradient (CMPG) is proposed to consistently upgrade the driving policy of the IDC.
Experimental results show that, boosted by data, the system can achieve better driving ability than model-based methods.
arXiv Detail & Related papers (2022-10-19T14:58:41Z)
- Actor-Critic based Improper Reinforcement Learning [61.430513757337486]
We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process.
We propose two algorithms: (1) a Policy Gradient-based approach; and (2) an algorithm that can switch between a simple Actor-Critic scheme and a Natural Actor-Critic scheme.
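As a hedged sketch of the policy-gradient variant, one can maintain a softmax distribution over the $M$ base controllers and update it with REINFORCE; `env_rollout` and the per-episode controller choice are assumptions of this illustration, not the paper's exact scheme.

```python
import numpy as np

def softmax(z):
    z = z - z.max()               # numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_step(logits, controllers, env_rollout, lr=0.01, rng=None):
    """One REINFORCE update on the mixture over M base controllers.
    `controllers` is a list of M policies; `env_rollout(policy)` returns
    an episode return (hypothetical interface)."""
    rng = rng or np.random.default_rng()
    p = softmax(logits)
    k = rng.choice(len(controllers), p=p)   # sample a base controller
    ret = env_rollout(controllers[k])       # roll it out for one episode
    grad_log = -p
    grad_log[k] += 1.0                      # grad of log p_k w.r.t. logits
    return logits + lr * ret * grad_log     # gradient ascent on expected return
```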
arXiv Detail & Related papers (2022-07-19T05:55:02Z)
- Autonomous Platoon Control with Integrated Deep Reinforcement Learning and Dynamic Programming [12.661547303266252]
It is more challenging to learn a stable and efficient car-following policy when there are multiple following vehicles in a platoon.
We adopt an integrated DRL and Dynamic Programming approach to learn autonomous platoon control policies.
We propose an algorithm, namely Finite-Horizon-DDPG with Sweeping through reduced state space.
arXiv Detail & Related papers (2022-06-15T13:45:47Z)
- When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee [51.527543027813344]
We propose systematic and efficient gradient-based methods for both one-way and two-way partial AUC (pAUC) maximization.
For both one-way and two-way pAUC, we propose two algorithms and prove their convergence for optimizing their two formulations, respectively.
arXiv Detail & Related papers (2022-03-01T01:59:53Z)
- Zeroth-order Deterministic Policy Gradient [116.87117204825105]
We introduce Zeroth-order Deterministic Policy Gradient (ZDPG).
ZDPG approximates policy-reward gradients via two-point evaluations of the $Q$-function.
New finite-sample complexity bounds for ZDPG improve upon existing results by up to two orders of magnitude.
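The two-point idea can be illustrated with a standard zeroth-order estimator of the action gradient of a black-box $Q$-function; the exact scaling and how it enters the deterministic policy gradient are assumptions here, not necessarily the paper's construction.

```python
import numpy as np

def two_point_action_grad(q, s, a, delta=1e-2, rng=None):
    """Two-point zeroth-order estimate of grad_a Q(s, a) for a black-box q
    (hypothetical callable q(s, a) returning a scalar)."""
    rng = rng or np.random.default_rng()
    d = a.shape[0]
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)  # random direction on the unit sphere
    # Finite difference along u; the factor d matches the standard
    # smoothed-gradient analysis for sphere-sampled two-point estimators.
    return d * (q(s, a + delta * u) - q(s, a - delta * u)) / (2.0 * delta) * u
```

Chaining such an estimate with the actor's Jacobian yields a zeroth-order surrogate of the deterministic policy gradient.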
arXiv Detail & Related papers (2020-06-12T16:52:29Z)
- Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
- Mixed Strategies for Robust Optimization of Unknown Objectives [93.8672371143881]
We consider robust optimization problems, where the goal is to optimize an unknown objective function against the worst-case realization of an uncertain parameter.
We design a novel sample-efficient algorithm GP-MRO, which sequentially learns about the unknown objective from noisy point evaluations.
GP-MRO seeks to discover a robust and randomized mixed strategy that maximizes the worst-case expected objective value.
arXiv Detail & Related papers (2020-02-28T09:28:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.