Physics Enhanced Residual Policy Learning (PERPL) for safety cruising in mixed traffic platooning under actuator and communication delay
- URL: http://arxiv.org/abs/2409.15595v1
- Date: Mon, 23 Sep 2024 23:02:34 GMT
- Title: Physics Enhanced Residual Policy Learning (PERPL) for safety cruising in mixed traffic platooning under actuator and communication delay
- Authors: Keke Long, Haotian Shi, Yang Zhou, Xiaopeng Li,
- Abstract summary: Linear control models have gained extensive application in vehicle control due to their simplicity, ease of use, and support for stability analysis.
Reinforcement learning (RL) models, on the other hand, offer adaptability but suffer from a lack of interpretability and generalization capabilities.
This paper aims to develop a family of RL-based controllers enhanced by physics-informed policies.
- Score: 8.172286651098027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linear control models have gained extensive application in vehicle control due to their simplicity, ease of use, and support for stability analysis. However, these models lack adaptability to the changing environment and multi-objective settings. Reinforcement learning (RL) models, on the other hand, offer adaptability but suffer from a lack of interpretability and generalization capabilities. This paper aims to develop a family of RL-based controllers enhanced by physics-informed policies, leveraging the advantages of both physics-based models (data-efficient and interpretable) and RL methods (flexible to multiple objectives and fast computing). We propose the Physics-Enhanced Residual Policy Learning (PERPL) framework, where the physics component provides model interpretability and stability. The learning-based Residual Policy adjusts the physics-based policy to adapt to the changing environment, thereby refining the decisions of the physics model. We apply our proposed model to decentralized control to mixed traffic platoon of Connected and Automated Vehicles (CAVs) and Human-driven Vehicles (HVs) using a constant time gap (CTG) strategy for cruising and incorporating actuator and communication delays. Experimental results demonstrate that our method achieves smaller headway errors and better oscillation dampening than linear models and RL alone in scenarios with artificially extreme conditions and real preceding vehicle trajectories. At the macroscopic level, overall traffic oscillations are also reduced as the penetration rate of CAVs employing the PERPL scheme increases.
Related papers
Err
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.