Bayesian Critique-Tune-Based Reinforcement Learning with Adaptive Pressure for Multi-Intersection Traffic Signal Control
- URL: http://arxiv.org/abs/2412.16225v2
- Date: Wed, 25 Dec 2024 08:24:00 GMT
- Authors: Wenchang Duan, Zhenguo Gao, Jiwan He, Jinguo Xian
- Abstract summary: This paper proposes BCT-APLight, a novel Bayesian Critique-Tune-Based Reinforcement Learning method with Adaptive Pressure for multi-intersection signal control.
BCT-APLight outperforms other state-of-the-art (SOTA) methods on seven real-world datasets.
- Score: 0.5399800035598185
- Abstract: An Adaptive Traffic Signal Control (ATSC) system is a critical component of intelligent transportation, with the capability to significantly alleviate urban traffic congestion. Although reinforcement learning (RL)-based methods have demonstrated promising performance in achieving ATSC, existing methods are still prone to making unreasonable policies. Therefore, this paper proposes a novel Bayesian Critique-Tune-Based Reinforcement Learning with Adaptive Pressure for multi-intersection signal control (BCT-APLight). In BCT-APLight, the Critique-Tune (CT) framework, a two-layer Bayesian structure, is designed to correct excessive trust in RL policies. Specifically, the Bayesian inference-based Critique Layer evaluates the credibility of policies; the Bayesian decision-based Tune Layer fine-tunes policies by minimizing the posterior risk when the evaluations are negative. Meanwhile, an attention-based Adaptive Pressure (AP) mechanism is designed to effectively weight the vehicle queues in each lane, thereby enhancing the rationality of traffic movement representation within the network. Equipped with the CT framework and AP mechanism, BCT-APLight effectively enhances the reasonableness of RL policies. Extensive experiments conducted with a simulator across a range of intersection layouts demonstrate that BCT-APLight is superior to other state-of-the-art (SOTA) methods on seven real-world datasets. Specifically, BCT-APLight decreases average queue length by 9.60% and average waiting time by 15.28%.
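The critique-then-tune idea described in the abstract can be illustrated with a generic Bayes-decision sketch. This is not the paper's implementation: the discrete traffic states, the loss matrix, the credibility score, and the threshold below are all hypothetical, chosen only to show what "keep the RL action if it is judged credible, otherwise pick the action with minimum posterior risk" might look like.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of hypothetical traffic states."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    z = sum(joint)
    return [j / z for j in joint]


def posterior_risks(post, loss):
    """Expected loss of each action under the posterior.

    loss[a][s] is an assumed cost of taking action a when the true state is s.
    """
    return [sum(p * l for p, l in zip(post, row)) for row in loss]


def critique_tune(rl_action, post, loss, credibility, threshold=0.5):
    """Keep the RL action when it is judged credible; otherwise
    replace it with the action that minimizes posterior risk.

    credibility and threshold are placeholders for whatever a
    critique step would output; they are assumptions of this sketch.
    """
    if credibility >= threshold:
        return rl_action
    risks = posterior_risks(post, loss)
    return min(range(len(risks)), key=risks.__getitem__)


# Hypothetical example: two states (light, heavy congestion), two signal phases.
post = posterior(prior=[0.5, 0.5], likelihood=[0.2, 0.8])
loss = [[0.0, 10.0],   # phase 0: cheap if light, costly if heavy
        [4.0, 1.0]]    # phase 1: moderate if light, cheap if heavy
tuned = critique_tune(rl_action=0, post=post, loss=loss, credibility=0.3)
kept = critique_tune(rl_action=0, post=post, loss=loss, credibility=0.9)
```

In this toy setup a low credibility score causes phase 0 to be replaced by the phase with the lower posterior expected loss, while a high score leaves the RL choice untouched.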
Related papers
- Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport [45.793758222754036]
Diffusion policies have shown promise in learning complex behaviors from demonstrations.
This paper explores improving diffusion-based imitation learning models through online interactions with the environment.
We propose OTPR, a novel method that integrates diffusion policies with RL using optimal transport theory.
arXiv Detail & Related papers (2025-02-18T08:22:20Z)
- Learning to Sail Dynamic Networks: The MARLIN Reinforcement Learning Framework for Congestion Control in Tactical Environments [53.08686495706487]
This paper proposes an RL framework that leverages an accurate and parallelizable emulation environment to reenact the conditions of a tactical network.
We evaluate our RL learning framework by training a MARLIN agent in conditions replicating a bottleneck link transition between a Satellite Communication (SATCOM) link and a UHF Wide Band (UHF) radio link.
arXiv Detail & Related papers (2023-06-27T16:15:15Z)
- DenseLight: Efficient Control for Large-scale Traffic Signals with Dense Feedback [109.84667902348498]
Traffic Signal Control (TSC) aims to reduce the average travel time of vehicles in a road network.
Most prior TSC methods leverage deep reinforcement learning to search for a control policy.
We propose DenseLight, a novel RL-based TSC method that employs an unbiased reward function to provide dense feedback on policy effectiveness.
arXiv Detail & Related papers (2023-06-13T05:58:57Z)
- Lyapunov Function Consistent Adaptive Network Signal Control with Back Pressure and Reinforcement Learning [9.797994846439527]
This study introduces a unified framework using Lyapunov control theory, defining specific Lyapunov functions respectively.
Building on insights from Lyapunov theory, this study designs a reward function for the Reinforcement Learning (RL)-based network signal control.
The proposed algorithm is compared with several traditional and RL-based methods under pure passenger car flow and heterogeneous traffic flow including freight.
arXiv Detail & Related papers (2022-10-06T00:22:02Z)
- Efficient Pressure: Improving efficiency for signalized intersections [24.917612761503996]
Reinforcement learning (RL) has attracted growing attention as a way to solve the traffic signal control (TSC) problem.
Existing RL-based methods are rarely deployed, since they are neither cost-effective in terms of computing resources nor more robust than traditional approaches.
We demonstrate how to construct an adaptive controller for TSC with less training and reduced complexity using an RL-based approach.
arXiv Detail & Related papers (2021-12-04T13:49:58Z)
- Layer Pruning on Demand with Intermediate CTC [50.509073206630994]
We present a training and pruning method for ASR based on connectionist temporal classification (CTC).
We show that a Transformer-CTC model can be pruned to various depths on demand, improving the real-time factor from 0.005 to 0.002 on GPU.
arXiv Detail & Related papers (2021-06-17T02:40:18Z)
- AdaPool: A Diurnal-Adaptive Fleet Management Framework using Model-Free Deep Reinforcement Learning and Change Point Detection [34.77250498401055]
This paper introduces an adaptive model-free deep reinforcement learning approach that can recognize and adapt to the diurnal patterns in the ride-sharing environment with car-pooling.
In addition to the adaptation logic in dispatching, this paper also proposes a dynamic, demand-aware vehicle-passenger matching and route planning framework.
arXiv Detail & Related papers (2021-04-01T02:14:01Z)
- Federated Learning on the Road: Autonomous Controller Design for Connected and Autonomous Vehicles [109.71532364079711]
A new federated learning (FL) framework is proposed for designing the autonomous controller of connected and autonomous vehicles (CAVs).
A novel dynamic federated proximal (DFP) algorithm is proposed that accounts for the mobility of CAVs, the wireless fading channels, and the unbalanced and non-independent and identically distributed (non-IID) data across CAVs.
A rigorous convergence analysis is performed for the proposed algorithm to identify how fast the CAVs converge to using the optimal controller.
arXiv Detail & Related papers (2021-02-05T19:57:47Z)
- Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
- Non-recurrent Traffic Congestion Detection with a Coupled Scalable Bayesian Robust Tensor Factorization Model [5.141309607968161]
Non-recurrent traffic congestion (NRTC) usually brings unexpected delays to commuters.
It is critical to detect and recognize NRTC accurately and in real time.
arXiv Detail & Related papers (2020-05-10T03:58:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.