Large-Scale Traffic Signal Control by a Nash Deep Q-network Approach
- URL: http://arxiv.org/abs/2301.00637v1
- Date: Mon, 2 Jan 2023 12:58:51 GMT
- Title: Large-Scale Traffic Signal Control by a Nash Deep Q-network Approach
- Authors: Yuli Zhang, Shangbo Wang, Ruiyuan Jiang
- Abstract summary: We introduce an off-policy Nash deep Q-Network (OPNDQN) algorithm, which mitigates the weaknesses of both fully centralized and MARL approaches.
One of the main advantages of OPNDQN is that it mitigates the non-stationarity of the multi-agent Markov process.
We show that OPNDQN outperforms several existing MARL approaches in terms of average queue length, episode training reward and average waiting time.
- Score: 7.23135508361981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning (RL) is currently one of the most commonly used
techniques for traffic signal control (TSC); it can adaptively adjust
traffic signal phase and duration according to real-time traffic data. However,
a fully centralized RL approach is beset with difficulties in a multi-intersection
network because the state-action space grows exponentially with the number of
intersections. Multi-agent reinforcement learning (MARL) can overcome this
high-dimensionality problem by distributing the global control task among local RL agents,
but it brings new challenges of its own, such as failure to converge caused by
the non-stationary Markov Decision Process (MDP). In this paper, we introduce
an off-policy Nash deep Q-Network (OPNDQN) algorithm, which mitigates the
weaknesses of both fully centralized and MARL approaches. OPNDQN handles
large state-action-space traffic models, where traditional algorithms fail,
by using a fictitious-play approach at each iteration to find a Nash
equilibrium among neighboring intersections, from which no intersection has
an incentive to unilaterally deviate. One of the main advantages of OPNDQN is
that it mitigates the non-stationarity of the multi-agent Markov process,
because it accounts for the mutual influence among neighboring intersections
by sharing their actions. Moreover, when training a large traffic network,
OPNDQN converges faster than existing MARL approaches because it does not
incorporate the full state information of every agent. We conduct extensive
experiments using the Simulation of Urban MObility (SUMO) simulator and show
the superiority of OPNDQN over several existing MARL approaches in terms of
average queue length, episode training reward and average waiting time.
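
To make the equilibrium-seeking step concrete, here is a minimal sketch of a fictitious-play best-response loop among neighboring intersections, in the spirit of the abstract above. The per-intersection Q-networks `q_nets`, their call signature, the neighbor topology, and the all-zeros initial joint action are illustrative assumptions, not the authors' implementation:

```python
# A minimal sketch (not the paper's code) of the fictitious-play step in
# OPNDQN: each intersection repeatedly best-responds to the actions currently
# announced by its neighbors, until no intersection can improve by unilaterally
# deviating (an approximate Nash equilibrium of the one-stage game induced by
# the Q-functions).
import torch

def nash_actions(q_nets, states, neighbors, max_iters=50):
    """q_nets[i](state_i, neighbor_actions) -> 1-D tensor of Q-values over
    intersection i's signal phases; neighbors[i] lists agents adjacent to i."""
    n = len(q_nets)
    actions = [0] * n                      # arbitrary initial joint action
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            # Mutual influence: agent i conditions on neighbors' shared actions.
            a_nbr = torch.tensor([actions[j] for j in neighbors[i]])
            best = int(torch.argmax(q_nets[i](states[i], a_nbr)))
            if best != actions[i]:
                actions[i], changed = best, True
        if not changed:                    # no profitable unilateral deviation
            break                          # -> approximate Nash equilibrium
    return actions
```

Each call to `q_nets[i]` evaluates only that intersection's few phase choices against its neighbors' announced actions, which is what keeps the search tractable. In an off-policy scheme like the one described, the equilibrium actions found this way could serve both for acting in SUMO and as bootstrap targets for the Q-network updates; that outer training loop is omitted from this sketch.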
Related papers
- Improving Traffic Flow Predictions with SGCN-LSTM: A Hybrid Model for Spatial and Temporal Dependencies [55.2480439325792]
This paper introduces the Signal-Enhanced Graph Convolutional Network Long Short Term Memory (SGCN-LSTM) model for predicting traffic speeds across road networks.
Experiments on the PEMS-BAY road network traffic dataset demonstrate the SGCN-LSTM model's effectiveness.
arXiv Detail & Related papers (2024-11-01T00:37:00Z)
- Scalable spectral representations for network multiagent control [53.631272539560435]
A popular model for multi-agent control, Network Markov Decision Processes (MDPs) pose a significant challenge to efficient learning.
We first derive scalable spectral local representations for network MDPs, which induce a network linear subspace for the local $Q$-function of each agent.
We design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm.
arXiv Detail & Related papers (2024-10-22T17:45:45Z)
- Adaptive Hierarchical SpatioTemporal Network for Traffic Forecasting [70.66710698485745]
We propose an Adaptive Hierarchical SpatioTemporal Network (AHSTN) to improve traffic forecasting.
AHSTN exploits the spatial hierarchy and models multi-scale spatial correlations.
Experiments on two real-world datasets show that AHSTN achieves better performance over several strong baselines.
arXiv Detail & Related papers (2023-06-15T14:50:27Z)
- A Novel Multi-Agent Deep RL Approach for Traffic Signal Control [13.927155702352131]
We propose a Friend-Deep Q-network (Friend-DQN) approach for multiple traffic signal control in urban networks.
In particular, the cooperation between multiple agents can reduce the state-action space and thus speed up convergence (see the dimensionality sketch after this list).
arXiv Detail & Related papers (2023-06-05T08:20:37Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- Decentralized Federated Reinforcement Learning for User-Centric Dynamic TFDD Control [37.54493447920386]
We propose a learning-based dynamic time-frequency division duplexing (D-TFDD) scheme to meet asymmetric and heterogeneous traffic demands.
We formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP).
To jointly optimize global resources in a decentralized manner, we propose a federated reinforcement learning (RL) algorithm named federated Wolpertinger deep deterministic policy gradient (FWDDPG).
arXiv Detail & Related papers (2022-11-04T07:39:21Z)
- A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization [14.455497228170646]
Inefficient traffic signal control methods may cause numerous problems, such as traffic congestion and waste of energy.
This paper first proposes a multi-agent deep deterministic policy gradient (MADDPG) method by extending actor-critic policy gradient algorithms.
arXiv Detail & Related papers (2021-07-13T14:11:04Z)
- Low-Latency Federated Learning over Wireless Channels with Differential Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server.
In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
arXiv Detail & Related papers (2021-06-20T13:51:18Z)
- MetaVIM: Meta Variationally Intrinsic Motivated Reinforcement Learning for Decentralized Traffic Signal Control [54.162449208797334]
Traffic signal control aims to coordinate traffic signals across intersections to improve the traffic efficiency of a district or a city.
Deep reinforcement learning (RL) has been applied to traffic signal control recently and demonstrated promising performance where each traffic signal is regarded as an agent.
We propose a novel Meta Variationally Intrinsic Motivated (MetaVIM) RL method to learn the decentralized policy for each intersection that considers neighbor information in a latent way.
arXiv Detail & Related papers (2021-01-04T03:06:08Z)
- Area-wide traffic signal control based on a deep graph Q-Network (DGQN) trained in an asynchronous manner [3.655021726150368]
Reinforcement learning (RL) algorithms have been widely applied in traffic signal studies.
There are, however, several problems in jointly controlling traffic lights for a large transportation network.
arXiv Detail & Related papers (2020-08-05T06:13:58Z)
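
The recurring dimensionality argument above, in the OPNDQN abstract and in the Friend-DQN entry, comes down to simple combinatorics: a centralized controller must rank every combination of all intersections' phases, while an agent that conditions on its neighbors' shared actions ranks only its own few phases at each best-response step. A toy calculation (the phase count and grid topology are illustrative assumptions, not taken from any paper above):

```python
# Toy combinatorics behind the dimensionality argument (numbers are
# illustrative assumptions).
N_PHASES = 4          # candidate signal phases per intersection

for n in (4, 16, 64):
    centralized = N_PHASES ** n   # joint actions a single controller must rank
    per_agent = N_PHASES          # choices one agent ranks once neighbors'
                                  # actions are shared and held fixed
    print(f"{n:3d} intersections: centralized = {centralized:.2e}, "
          f"per agent = {per_agent} (x {n} agents = {per_agent * n})")
```

Fictitious play then sweeps this small per-agent space repeatedly instead of searching the exponentially large joint space once.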