OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control
- URL: http://arxiv.org/abs/2411.06601v2
- Date: Mon, 25 Nov 2024 15:17:30 GMT
- Title: OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control
- Authors: Rohit Bokade, Xiaoning Jin
- Abstract summary: We introduce OffLight, a novel offline MARL framework designed to handle heterogeneous behavior policies in TSC datasets.
OffLight incorporates Importance Sampling (IS) to correct for distributional shifts and Return-Based Prioritized Sampling (RBPS) to focus on high-quality experiences.
Experiments show OffLight outperforms existing offline RL methods, achieving up to a 7.8% reduction in average travel time and 11.2% decrease in queue length.
- Score: 1.2540429019617183
- Abstract: Efficient traffic signal control (TSC) is essential for urban mobility, but traditional systems struggle to handle the complexity of real-world traffic. Multi-agent Reinforcement Learning (MARL) offers adaptive solutions, but online MARL requires extensive interactions with the environment, making it costly and impractical. Offline MARL mitigates these challenges by using historical traffic data for training but faces significant difficulties with heterogeneous behavior policies in real-world datasets, where mixed-quality data complicates learning. We introduce OffLight, a novel offline MARL framework designed to handle heterogeneous behavior policies in TSC datasets. To improve learning efficiency, OffLight incorporates Importance Sampling (IS) to correct for distributional shifts and Return-Based Prioritized Sampling (RBPS) to focus on high-quality experiences. OffLight utilizes a Gaussian Mixture Variational Graph Autoencoder (GMM-VGAE) to capture the diverse distribution of behavior policies from local observations. Extensive experiments across real-world urban traffic scenarios show that OffLight outperforms existing offline RL methods, achieving up to a 7.8% reduction in average travel time and an 11.2% decrease in queue length. Ablation studies confirm the effectiveness of OffLight's components in handling heterogeneous data and improving policy performance. These results highlight OffLight's scalability and potential to improve urban traffic management without the risks of online learning.
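The two sampling mechanisms named in the abstract can be sketched in a few lines. The snippet below is a minimal, illustrative interpretation only: it assumes RBPS means drawing logged episodes with probability increasing in their return (here via a temperature-scaled softmax), and that IS means the standard per-step ratio between target- and behavior-policy action probabilities. The function and parameter names (`rbps_probs`, `importance_weight`, `temperature`) are not from the paper.

```python
# Illustrative sketch of Return-Based Prioritized Sampling (RBPS) and
# per-step Importance Sampling (IS) correction, as commonly defined.
import numpy as np

def rbps_probs(episode_returns, temperature=1.0):
    """Softmax over episode returns -> sampling probabilities.

    Higher-return (higher-quality) episodes are sampled more often.
    """
    r = np.asarray(episode_returns, dtype=float)
    z = (r - r.max()) / temperature          # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

def importance_weight(pi_probs, beta_probs, eps=1e-8):
    """Per-step IS ratio pi(a|s) / beta(a|s) to correct distribution shift
    between the learned policy pi and the logging behavior policy beta."""
    return np.asarray(pi_probs, dtype=float) / (np.asarray(beta_probs, dtype=float) + eps)

# Example: three logged episodes of mixed quality.
returns = [10.0, 2.0, 7.5]
p = rbps_probs(returns, temperature=2.0)
rng = np.random.default_rng(0)
batch = rng.choice(len(returns), size=5, p=p)  # indices of sampled episodes
```

In OffLight itself, the behavior-policy probabilities in the IS ratio are estimated by the GMM-VGAE rather than assumed known; this sketch only shows where such estimates would plug in.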
Related papers
- Strada-LLM: Graph LLM for traffic prediction [62.2015839597764]
A considerable challenge in traffic prediction lies in handling the diverse data distributions caused by vastly different traffic conditions.
We propose a graph-aware LLM for traffic prediction that considers proximal traffic information.
We adopt a lightweight approach for efficient domain adaptation when facing new data distributions in few-shot fashion.
arXiv Detail & Related papers (2024-10-28T09:19:29Z) - D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z) - LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments [3.7788636451616697]
This work introduces an innovative approach that integrates Large Language Models into traffic signal control systems.
A hybrid framework that augments LLMs with a suite of perception and decision-making tools is proposed.
The findings from our simulations attest to the system's ability to adapt to a wide range of traffic environments.
arXiv Detail & Related papers (2024-03-13T08:41:55Z) - Traffic Signal Control Using Lightweight Transformers: An Offline-to-Online RL Approach [6.907105812732423]
We propose DTLight, a lightweight Decision Transformer-based TSC method that can learn policy from easily accessible offline datasets.
DTLight pre-trained purely on offline datasets can outperform state-of-the-art online RL-based methods in most scenarios.
Experiment results also show that online fine-tuning further improves the performance of DTLight by up to 42.6% over the best online RL baseline methods.
arXiv Detail & Related papers (2023-12-12T23:21:57Z) - A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning [18.2541182874636]
We propose a fully data-driven and simulator-free framework for realistic Traffic Signal Control (D2TSC).
We combine well-established traffic flow theory with machine learning to infer the reward signals from coarse-grained traffic data.
Our approach achieves superior performance over conventional and offline RL baselines, and also enjoys much better real-world applicability.
arXiv Detail & Related papers (2023-11-27T15:29:21Z) - Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with Online Learning [60.17407932691429]
Open Radio Access Network systems, with their virtualized base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability.
We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging" environments.
We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z) - DataLight: Offline Data-Driven Traffic Signal Control [9.393196900855648]
Reinforcement learning (RL) has emerged as a promising solution for addressing traffic signal control (TSC) challenges.
This study introduces an innovative offline data-driven approach, called DataLight.
DataLight employs an effective state representation and reward function that capture vehicular speed information.
arXiv Detail & Related papers (2023-03-20T02:02:50Z) - Offline Reinforcement Learning for Road Traffic Control [12.251816544079306]
We build a model-based learning framework, A-DAC, which infers a Markov Decision Process (MDP) from the dataset, with pessimistic costs built in to deal with data uncertainties.
A-DAC is evaluated on a complex signalized roundabout using multiple datasets varying in size and in batch collection policy.
arXiv Detail & Related papers (2022-01-07T09:55:21Z) - An Experimental Urban Case Study with Various Data Sources and a Model
for Traffic Estimation [65.28133251370055]
We organize an experimental campaign with video measurement in an area within the urban network of Zurich, Switzerland.
We focus on capturing the traffic state in terms of traffic flow and travel times by ensuring measurements from established thermal cameras.
We propose a simple yet efficient Multiple Linear Regression (MLR) model to estimate travel times with fusion of various data sources.
arXiv Detail & Related papers (2021-08-02T08:13:57Z) - Offline Meta-Reinforcement Learning with Online Self-Supervision [66.42016534065276]
We propose a hybrid offline meta-RL algorithm, which uses offline data with rewards to meta-train an adaptive policy.
Our method uses the offline data to learn the distribution of reward functions, which is then sampled to self-supervise reward labels for the additional online data.
We find that using additional data and self-generated rewards significantly improves an agent's ability to generalize.
arXiv Detail & Related papers (2021-07-08T17:01:32Z) - MetaVIM: Meta Variationally Intrinsic Motivated Reinforcement Learning for Decentralized Traffic Signal Control [54.162449208797334]
Traffic signal control aims to coordinate traffic signals across intersections to improve the traffic efficiency of a district or a city.
Deep reinforcement learning (RL) has been applied to traffic signal control recently and demonstrated promising performance where each traffic signal is regarded as an agent.
We propose a novel Meta Variationally Intrinsic Motivated (MetaVIM) RL method to learn the decentralized policy for each intersection that considers neighbor information in a latent way.
arXiv Detail & Related papers (2021-01-04T03:06:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.