MacLight: Multi-scene Aggregation Convolutional Learning for Traffic Signal Control
- URL: http://arxiv.org/abs/2412.15703v3
- Date: Tue, 24 Dec 2024 04:42:00 GMT
- Title: MacLight: Multi-scene Aggregation Convolutional Learning for Traffic Signal Control
- Authors: Sunbowen Lee, Hongqin Lyu, Yicheng Gong, Yingying Sun, Chao Deng,
- Abstract summary: Reinforcement learning methods have produced promising traffic signal control policies that can be trained on large road networks. Current SOTA methods model road networks as topological graph structures, incorporate graph attention into deep Q-learning, and merge local and global embeddings to improve policy. We propose Multi-Scene Aggregation Convolutional Learning for traffic signal control (MacLight), which offers faster training speeds and more stable performance.
- Score: 8.342432309172757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning methods have produced promising traffic signal control policies that can be trained on large road networks. Current SOTA methods model road networks as topological graph structures, incorporate graph attention into deep Q-learning, and merge local and global embeddings to improve policy. However, graph-based methods are difficult to parallelize, resulting in huge time overhead. Moreover, none of the current peer studies have experimented with dynamic traffic systems, which is far from real-world conditions. In this context, we propose Multi-Scene Aggregation Convolutional Learning for traffic signal control (MacLight), which offers faster training speeds and more stable performance. Our approach consists of two main components. The first is the global representation, where we utilize variational autoencoders to compactly compress and extract the global representation. The second component employs the proximal policy optimization algorithm as the backbone, allowing value evaluation to consider both local features and global embedding representations. This backbone model significantly reduces time overhead and ensures stability in policy updates. We validated our method across multiple traffic scenarios under both static and dynamic traffic systems. Experimental results demonstrate that, compared to general and domain-specific SOTA methods, our approach achieves superior stability, better convergence levels, and the highest time efficiency. The code is available at https://github.com/Aegis1863/MacLight.
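To make the two components above concrete, here is a minimal, illustrative PyTorch-style sketch (not the authors' code from the repository linked above; module names such as `GlobalVAE` and `MacLightActorCritic`, layer sizes, and the grid-shaped global state are all assumptions): a convolutional VAE compresses the network-wide traffic state into a compact embedding, and a PPO-style actor-critic concatenates each intersection's local observation with that embedding when estimating the value.

```python
# Illustrative sketch of the two components described in the abstract (not the
# authors' implementation): a convolutional VAE that compresses the global
# traffic state, and a PPO-style actor-critic whose value head conditions on
# both the local observation and the global embedding. Names/sizes are assumed.
import torch
import torch.nn as nn

class GlobalVAE(nn.Module):
    """Compresses a grid-shaped global traffic state into a latent vector."""
    def __init__(self, channels: int = 1, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.LazyLinear(latent_dim)
        self.fc_logvar = nn.LazyLinear(latent_dim)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return z, mu, logvar

class MacLightActorCritic(nn.Module):
    """PPO backbone: the actor sees local features only, while the critic also
    conditions on the global VAE embedding."""
    def __init__(self, local_dim: int, latent_dim: int, n_phases: int):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(local_dim, 128), nn.ReLU(), nn.Linear(128, n_phases))
        self.critic = nn.Sequential(
            nn.Linear(local_dim + latent_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, local_obs, global_z):
        logits = self.actor(local_obs)                                 # phase logits
        value = self.critic(torch.cat([local_obs, global_z], dim=-1))  # value with global context
        return logits, value

if __name__ == "__main__":
    vae = GlobalVAE()
    model = MacLightActorCritic(local_dim=12, latent_dim=32, n_phases=4)
    global_state = torch.randn(1, 1, 16, 16)   # e.g. a network-wide occupancy map
    local_obs = torch.randn(1, 12)              # one intersection's observation
    z, _, _ = vae(global_state)
    logits, value = model(local_obs, z)
    print(logits.shape, value.shape)            # torch.Size([1, 4]) torch.Size([1, 1])
```

In this reading, only the critic consumes the global embedding, which keeps the actor lightweight while still letting value estimation account for network-wide conditions; the authors' actual architecture may differ.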
Related papers
- Enhancing Traffic Signal Control through Model-based Reinforcement Learning and Policy Reuse [0.9995933996287355]
Multi-agent reinforcement learning (MARL) has shown significant potential in traffic signal control (TSC).
Current MARL-based methods often suffer from insufficient generalization due to the fixed traffic patterns and road network conditions used during training.
This limitation results in poor adaptability to new traffic scenarios, leading to high retraining costs and complex deployment.
We propose two algorithms: PLight and PRLight. PLight employs a model-based reinforcement learning approach, pretraining control policies and environment models using predefined source-domain traffic scenarios. PRLight further enhances adaptability by adaptively selecting pre-trained PLight agents based on the similarity between the source and target scenarios.
arXiv Detail & Related papers (2025-03-11T01:21:13Z) - From Imitation to Exploration: End-to-end Autonomous Driving based on World Model [24.578178308010912]
RAMBLE is an end-to-end world model-based RL method for driving decision-making.
It can handle complex and dynamic traffic scenarios.
It achieves state-of-the-art performance in route completion rate on the CARLA Leaderboard 1.0 and completes all 38 scenarios on the CARLA Leaderboard 2.0.
arXiv Detail & Related papers (2024-10-03T06:45:59Z) - Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z) - Cooperative Multi-Objective Reinforcement Learning for Traffic Signal Control and Carbon Emission Reduction [3.3454373538792552]
We propose a cooperative multi-objective architecture called Multi-Objective Multi-Agent Deep Deterministic Policy Gradient.
MOMA-DDPG estimates multiple reward terms for traffic signal control optimization using age-decaying weights.
Our results demonstrate the effectiveness of MOMA-DDPG, outperforming state-of-the-art methods across all performance metrics.
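The age-decaying weighting mentioned above can be read in several ways; the sketch below is purely illustrative (the objective names, base weights, and exponential decay schedule are assumptions, not MOMA-DDPG's actual formulation) and only shows how several cost terms might be folded into one scalar reward with weights that fade as training progresses.

```python
# Illustrative only: combining several cost terms (e.g. waiting time and carbon
# emission) into one reward, with per-objective weights that decay over the
# training step ("age"). Not the formulation used by MOMA-DDPG.
import math

OBJECTIVES = {
    "waiting_time":    {"weight": 1.0, "decay": 0.0},    # primary objective, no decay
    "carbon_emission": {"weight": 0.5, "decay": 1e-4},   # secondary objective, fades out
}

def combined_reward(costs: dict[str, float], step: int) -> float:
    reward = 0.0
    for name, cost in costs.items():
        spec = OBJECTIVES[name]
        w = spec["weight"] * math.exp(-spec["decay"] * step)
        reward -= w * cost  # both terms are costs, so they enter negatively
    return reward

print(combined_reward({"waiting_time": 12.0, "carbon_emission": 3.5}, step=1000))
```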
arXiv Detail & Related papers (2023-06-16T07:37:05Z) - Improving the generalizability and robustness of large-scale traffic signal control [3.8028221877086814]
We study the robustness of deep reinforcement-learning (RL) approaches to control traffic signals.
We show that recent methods remain brittle in the face of missing data.
We propose using a combination of distributional and vanilla reinforcement learning through a policy ensemble.
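As a rough illustration of such a policy ensemble (not the paper's implementation; network sizes and the averaging rule are assumptions), the sketch below pairs a vanilla Q-network with a C51-style distributional head and acts greedily on their averaged Q-estimates.

```python
# Purely illustrative: ensembling a distributional (C51-style) and a vanilla
# Q-network by averaging their per-action Q-estimates. The paper's actual
# ensembling scheme is not reproduced here.
import torch
import torch.nn as nn

N_ATOMS, V_MIN, V_MAX = 51, -10.0, 10.0
support = torch.linspace(V_MIN, V_MAX, N_ATOMS)   # return atoms for the distributional head

class VanillaQ(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))
    def forward(self, obs):
        return self.net(obs)                       # [batch, n_actions]

class DistributionalQ(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.n_actions = n_actions
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions * N_ATOMS))
    def forward(self, obs):
        probs = self.net(obs).view(-1, self.n_actions, N_ATOMS).softmax(dim=-1)
        return (probs * support).sum(dim=-1)        # expected Q per action

def ensemble_action(obs, members):
    """Average Q-estimates across ensemble members and act greedily."""
    q = torch.stack([m(obs) for m in members]).mean(dim=0)
    return q.argmax(dim=-1)

obs = torch.randn(1, 8)
members = [VanillaQ(8, 4), DistributionalQ(8, 4)]
print(ensemble_action(obs, members))
```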
arXiv Detail & Related papers (2023-06-02T21:30:44Z) - Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control [75.28441662678394]
Reinforcement learning (RL) and trajectory optimization (TO) present strong complementary advantages.
We propose several improvements on top of these approaches to learn global control policies quicker.
arXiv Detail & Related papers (2022-09-19T13:32:09Z) - Learning Optimal Antenna Tilt Control Policies: A Contextual Linear Bandit Approach [65.27783264330711]
Controlling antenna tilts in cellular networks is imperative to reach an efficient trade-off between network coverage and capacity.
We devise algorithms learning optimal tilt control policies from existing data.
We show that they can produce an optimal tilt update policy using far fewer data samples than naive or existing rule-based learning algorithms.
arXiv Detail & Related papers (2022-01-06T18:24:30Z) - End-to-End Intersection Handling using Multi-Agent Deep Reinforcement Learning [63.56464608571663]
Navigating through intersections is one of the main challenging tasks for an autonomous vehicle.
In this work, we focus on the implementation of a system able to navigate through intersections where only traffic signs are provided.
We propose a multi-agent system that uses a continuous, model-free deep reinforcement learning algorithm to train a neural network predicting both the acceleration and the steering angle at each time step.
arXiv Detail & Related papers (2021-04-28T07:54:40Z) - TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z) - MetaVIM: Meta Variationally Intrinsic Motivated Reinforcement Learning for Decentralized Traffic Signal Control [54.162449208797334]
Traffic signal control aims to coordinate traffic signals across intersections to improve the traffic efficiency of a district or a city.
Deep reinforcement learning (RL) has been applied to traffic signal control recently and demonstrated promising performance where each traffic signal is regarded as an agent.
We propose a novel Meta Variationally Intrinsic Motivated (MetaVIM) RL method to learn the decentralized policy for each intersection that considers neighbor information in a latent way.
arXiv Detail & Related papers (2021-01-04T03:06:08Z) - Efficiency and Equity are Both Essential: A Generalized Traffic Signal Controller with Deep Reinforcement Learning [25.21831641893209]
We present an approach to learning policies for signal controllers using deep reinforcement learning aiming for optimized traffic flow.
Our method uses a novel formulation of the reward function that simultaneously considers efficiency and equity.
The experimental evaluations on both simulated and real-world data demonstrate that our proposed algorithm achieves state-of-the-art performance.
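One plausible shape for such a reward, shown purely as an illustration (the equity term and its weight are assumptions, not the paper's formulation), is to penalize both the average vehicle delay and the spread of delays across vehicles:

```python
# Illustrative sketch only: a reward that combines efficiency (average delay)
# with equity (spread of delays across vehicles). Not the paper's formulation;
# the equity weight is an assumption.
import statistics

def reward(vehicle_delays: list[float], equity_weight: float = 0.5) -> float:
    efficiency_cost = statistics.mean(vehicle_delays)    # lower average delay is better
    equity_cost = statistics.pstdev(vehicle_delays)      # penalize unequal treatment
    return -(efficiency_cost + equity_weight * equity_cost)

print(reward([5.0, 7.0, 40.0]))   # one long-waiting vehicle hurts the reward
```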
arXiv Detail & Related papers (2020-03-09T11:34:52Z)