Proximal Policy Optimization Learning based Control of Congested Freeway
Traffic
- URL: http://arxiv.org/abs/2204.05627v1
- Date: Tue, 12 Apr 2022 08:36:21 GMT
- Title: Proximal Policy Optimization Learning based Control of Congested Freeway
Traffic
- Authors: Shurong Mo, Jie Qi, Anqi Pan
- Abstract summary: This study proposes a delay-compensated feedback controller based on proximal policy optimization (PPO) reinforcement learning.
For a delay-free system, the PPO control achieves a faster convergence rate with less control effort than the Lyapunov control.
- Score: 3.816579519746557
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study proposes a delay-compensated feedback controller based on proximal
policy optimization (PPO) reinforcement learning to stabilize traffic flow in
the congested regime by manipulating the time-gap of adaptive cruise
control-equipped (ACC-equipped) vehicles. The traffic dynamics on a freeway
segment are governed by an Aw-Rascle-Zhang (ARZ) model, consisting of $2\times
2$ nonlinear first-order partial differential equations (PDEs). Inspired by the
backstepping delay compensator [18], but avoiding its complex segmented
control scheme, the PPO control is composed of three feedbacks, namely the
current traffic flow velocity, the current traffic flow density, and the
control input from the previous step. The control gains for the three feedbacks
are learned from the interaction between the PPO agent and a numerical simulator
of the traffic system, without knowledge of the system dynamics. Numerical
simulation experiments are designed to compare the Lyapunov control, the
backstepping control, and the PPO control. The results show that for a
delay-free system, the PPO control achieves a faster convergence rate with less
control effort than the Lyapunov control. For a traffic system with input delay,
the performance of the PPO controller is comparable to that of the backstepping
controller, even when the assumed delay value does not match the actual one.
Moreover, the PPO controller is robust to parameter perturbations, while the
backstepping controller cannot stabilize a system in which one of the
parameters is disturbed by Gaussian noise.
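The three-feedback structure described in the abstract can be sketched as a simple linear feedback law. The gain values and function name below are hypothetical placeholders for illustration; in the paper, the gains are learned by PPO against a numerical simulator of the ARZ traffic model rather than fixed by hand.

```python
# Minimal sketch, assuming the control input (the ACC time-gap command) is a
# linear combination of the current boundary traffic velocity, the current
# boundary traffic density, and the previous control input, as described in
# the abstract. The gains k_v, k_rho, k_u are hypothetical placeholders, not
# the values learned in the paper.

def ppo_feedback_control(v, rho, u_prev, k_v=0.5, k_rho=-0.2, k_u=0.8):
    """Compute the next control input from the three feedback signals.

    v      -- current traffic flow velocity at the controlled boundary
    rho    -- current traffic flow density at the controlled boundary
    u_prev -- control input applied at the previous step
    """
    return k_v * v + k_rho * rho + k_u * u_prev

# One control step with illustrative (made-up) traffic measurements.
u = ppo_feedback_control(v=25.0, rho=0.12, u_prev=1.5)
```

Because the controller only needs these three measured signals and the last input, it avoids the segmented scheme of the backstepping delay compensator while still accounting for the input delay through the previous-input feedback term.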
Related papers
- Resource Optimization for Tail-Based Control in Wireless Networked Control Systems [31.144888314890597]
Achieving control stability is one of the key design challenges of scalable Wireless Networked Control Systems.
This paper explores the use of an alternative control concept defined as tail-based control, which extends the classical Linear Quadratic Regulator (LQR) cost function for multiple dynamic control systems over a shared wireless network.
arXiv Detail & Related papers (2024-06-20T13:27:44Z) - Improving a Proportional Integral Controller with Reinforcement Learning on a Throttle Valve Benchmark [2.8322124733515666]
This paper presents a learning-based control strategy for non-linear throttle valves with an asymmetric controller.
We exploit the recent advances in Reinforcement Learning with Guides to improve the closed-loop behavior by learning from the additional interactions with the valve.
In all the experimental test cases, the resulting agent has a better sample efficiency than traditional RL agents and outperforms the PI controller.
arXiv Detail & Related papers (2024-02-21T09:40:26Z) - Neural Operators for Boundary Stabilization of Stop-and-go Traffic [1.90298817989995]
This paper introduces a novel approach to PDE boundary control design using neural operators.
We present two distinct neural operator learning schemes aimed at stabilizing the traffic PDE system.
It is proved that the NO-based closed-loop system is practically stable under certain approximation accuracy conditions in the NO learning.
arXiv Detail & Related papers (2023-12-16T08:18:39Z) - DATT: Deep Adaptive Trajectory Tracking for Quadrotor Control [62.24301794794304]
Deep Adaptive Trajectory Tracking (DATT) is a learning-based approach that can precisely track arbitrary, potentially infeasible trajectories in the presence of large disturbances in the real world.
DATT significantly outperforms competitive adaptive nonlinear and model predictive controllers for both feasible smooth and infeasible trajectories in unsteady wind fields.
It can efficiently run online with an inference time of less than 3.2 ms, under 1/4 that of the adaptive nonlinear model predictive control baseline.
arXiv Detail & Related papers (2023-10-13T12:22:31Z) - A GOA-Based Fault-Tolerant Trajectory Tracking Control for an Underwater
Vehicle of Multi-Thruster System without Actuator Saturation [9.371458775465825]
This paper proposes an intelligent fault-tolerant control (FTC) strategy to tackle the trajectory tracking problem of an underwater vehicle (UV) under thruster damage (power loss) cases.
In the proposed control strategy, the trajectory tracking component is formed by a refined backstepping algorithm that controls the velocity variation, while a sliding mode control derives the torque/force outputs.
arXiv Detail & Related papers (2023-01-04T21:30:16Z) - Development of a CAV-based Intersection Control System and Corridor
Level Impact Assessment [0.696125353550498]
This paper presents a signal-free intersection control system for CAVs by combination of a pixel reservation algorithm and a Deep Reinforcement Learning (DRL) decision-making logic.
The proposed model reduces delay by 50%, 29%, and 23% in moderate, high, and extreme volume regimes, respectively, compared to another CAV-based control system.
arXiv Detail & Related papers (2022-08-21T21:56:20Z) - Comparative analysis of machine learning methods for active flow control [60.53767050487434]
Genetic Programming (GP) and Reinforcement Learning (RL) are gaining popularity in flow control.
This work presents a comparative analysis of the two, bench-marking some of their most representative algorithms against global optimization techniques.
arXiv Detail & Related papers (2022-02-23T18:11:19Z) - Regret-optimal Estimation and Control [52.28457815067461]
We show that the regret-optimal estimator and regret-optimal controller can be derived in state-space form.
We propose regret-optimal analogs of Model Predictive Control (MPC) and the Extended Kalman Filter (EKF) for systems with nonlinear dynamics.
arXiv Detail & Related papers (2021-06-22T23:14:21Z) - Federated Learning on the Road: Autonomous Controller Design for
Connected and Autonomous Vehicles [109.71532364079711]
A new federated learning (FL) framework is proposed for designing the autonomous controller of connected and autonomous vehicles (CAVs).
A novel dynamic federated proximal (DFP) algorithm is proposed that accounts for the mobility of CAVs, the wireless fading channels, and the unbalanced and non-independent and identically distributed data across CAVs.
A rigorous convergence analysis is performed for the proposed algorithm to identify how quickly the CAVs converge to the optimal controller.
arXiv Detail & Related papers (2021-02-05T19:57:47Z) - Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous
Vehicles and Multi-Agent RL [63.52264764099532]
We study the ability of autonomous vehicles to improve the throughput of a bottleneck using a fully decentralized control scheme in a mixed autonomy setting.
We apply multi-agent reinforcement learning algorithms to this problem and demonstrate that significant improvements in bottleneck throughput, from 20% at a 5% penetration rate to 33% at a 40% penetration rate, can be achieved.
arXiv Detail & Related papers (2020-10-30T22:06:05Z) - Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with all stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.