Traffic Signal Control Using Lightweight Transformers: An
Offline-to-Online RL Approach
- URL: http://arxiv.org/abs/2312.07795v1
- Date: Tue, 12 Dec 2023 23:21:57 GMT
- Title: Traffic Signal Control Using Lightweight Transformers: An
Offline-to-Online RL Approach
- Authors: Xingshuai Huang, Di Wu, and Benoit Boulet
- Abstract summary: We propose DTLight, a lightweight Decision Transformer-based TSC method that can learn policy from easily accessible offline datasets.
DTLight pre-trained purely on offline datasets can outperform state-of-the-art online RL-based methods in most scenarios.
Experiment results also show that online fine-tuning further improves the performance of DTLight by up to 42.6% over the best online RL baseline methods.
- Score: 6.907105812732423
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient traffic signal control is critical for reducing traffic congestion
and improving overall transportation efficiency. The dynamic nature of traffic
flow has prompted researchers to explore Reinforcement Learning (RL) for
traffic signal control (TSC). Compared with traditional methods, RL-based
solutions have shown preferable performance. However, the application of
RL-based traffic signal controllers in the real world is limited by the low
sample efficiency and high computational requirements of these solutions. In
this work, we propose DTLight, a simple yet powerful lightweight Decision
Transformer-based TSC method that can learn policy from easily accessible
offline datasets. DTLight novelly leverages knowledge distillation to learn a
lightweight controller from a well-trained larger teacher model to reduce
implementation computation. Additionally, it integrates adapter modules to
mitigate the expenses associated with fine-tuning, which makes DTLight
practical for online adaptation with minimal computation and only a few
fine-tuning steps during real deployment. Moreover, DTLight is further enhanced
to be more applicable to real-world TSC problems. Extensive experiments on
synthetic and real-world scenarios show that DTLight pre-trained purely on
offline datasets can outperform state-of-the-art online RL-based methods in
most scenarios. Experiment results also show that online fine-tuning further
improves the performance of DTLight by up to 42.6% over the best online RL
baseline methods. In this work, we also introduce Datasets specifically
designed for TSC with offline RL (referred to as DTRL). Our datasets and code
are publicly available.
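As a concrete illustration of the two mechanisms the abstract describes, the sketch below pairs a standard knowledge-distillation loss with a bottleneck adapter module. It is a minimal sketch under assumed interfaces: module names, shapes, and hyperparameters are illustrative, not taken from the DTLight implementation.

```python
# Minimal sketch of the two ideas in the abstract: (1) distill a large
# teacher Decision Transformer into a small student controller, and
# (2) fine-tune online through small adapter modules while the backbone
# stays frozen. All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, target_actions,
                      temperature=2.0, alpha=0.5):
    """Blend soft teacher targets with hard logged signal-phase actions."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(student_logits, target_actions)
    return alpha * soft + (1 - alpha) * hard

class Adapter(torch.nn.Module):
    """Bottleneck adapter: only these weights are updated during online
    fine-tuning, which keeps adaptation cheap."""
    def __init__(self, dim, bottleneck=32):
        super().__init__()
        self.down = torch.nn.Linear(dim, bottleneck)
        self.up = torch.nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual bottleneck
```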
Related papers
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861] (2024-08-15)
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
- Offline Trajectory Generalization for Offline Reinforcement Learning [43.89740983387144] (2024-04-16)
Offline reinforcement learning (RL) aims to learn policies from static datasets of previously collected trajectories.
We propose offline trajectory generalization through world transformers for offline reinforcement learning (OTTO).
OTTO serves as a plug-in module and can be integrated with existing offline RL methods, enhancing them with the generalization capability of transformers and high-reward data augmentation.
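A rough sketch of the augmentation idea summarized above, under heavy assumptions: `world_model` stands in for the paper's world transformer, its `predict` interface is hypothetical, and keeping only rollouts whose return clears a threshold is one plausible reading of "high-reward data augmentation".

```python
# Hypothetical sketch: roll out a learned world model from logged states and
# keep only high-reward synthetic trajectories as extra offline data.
def augment_dataset(dataset, world_model, policy, horizon=5, reward_floor=0.0):
    synthetic = []
    for traj in dataset:                    # each traj: [{"state": ...}, ...]
        state = traj[0]["state"]            # start a rollout from a logged state
        rollout, total_reward = [], 0.0
        for _ in range(horizon):
            action = policy(state)
            state, reward = world_model.predict(state, action)  # assumed API
            rollout.append({"state": state, "action": action, "reward": reward})
            total_reward += reward
        if total_reward > reward_floor:     # keep only high-reward rollouts
            synthetic.append(rollout)
    return dataset + synthetic
```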
- Solving Continual Offline Reinforcement Learning with Decision Transformer [78.59473797783673] (2024-01-16)
Continual offline reinforcement learning (CORL) combines continual and offline reinforcement learning.
Existing methods, which employ Actor-Critic structures and experience replay (ER), suffer from distribution shift, low efficiency, and weak knowledge sharing.
We introduce multi-head DT (MH-DT) and low-rank adaptation DT (LoRA-DT) to mitigate DT's forgetting problem.
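For the LoRA-DT variant, the low-rank adaptation idea can be sketched as follows; the rank, scaling, and wrapping strategy are generic LoRA conventions, assumed rather than taken from the paper.

```python
# Generic LoRA sketch: freeze a pretrained linear layer and learn a low-rank
# update B @ A on top, so continual fine-tuning touches few parameters.
import torch

class LoRALinear(torch.nn.Module):
    def __init__(self, base: torch.nn.Linear, rank=8, alpha=16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)               # freeze pretrained weights
        self.A = torch.nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank                 # B starts at zero, so the
                                                  # wrapped layer is unchanged

    def forward(self, x):
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)
```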
- A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning [18.2541182874636] (2023-11-27)
We propose D2TSC, a fully data-driven and simulator-free framework for realistic Traffic Signal Control.
We combine well-established traffic flow theory with machine learning to infer the reward signals from coarse-grained traffic data.
Our approach achieves superior performance over conventional and offline RL baselines, and also enjoys much better real-world applicability.
- Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning [52.49786369812919] (2023-06-27)
We propose a memory technique, (Prioritized) Trajectory Replay (TR/PTR), which extends the sampling perspective to trajectories.
TR enhances learning efficiency by backward sampling of trajectories that optimize the use of subsequent state information.
We demonstrate the benefits of integrating TR and PTR with existing offline RL algorithms on D4RL.
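The trajectory-level sampling described above might look like the sketch below: trajectories are drawn with probability proportional to their return (one plausible priority; not necessarily the paper's), and returns are accumulated backward so each step sees its subsequent reward information.

```python
# Illustrative trajectory replay: prioritized sampling of whole trajectories
# plus a backward pass that computes return-to-go for every step.
import random

class TrajectoryReplay:
    def __init__(self, trajectories):
        self.trajectories = trajectories          # each: [(s, a, r), ...]
        # One plausible priority: the trajectory's total return.
        self.priorities = [max(sum(r for _, _, r in t), 1e-6)
                           for t in trajectories]

    def sample(self):
        traj = random.choices(self.trajectories, weights=self.priorities)[0]
        returns, g = [], 0.0
        for _, _, r in reversed(traj):            # backward over the trajectory
            g += r
            returns.append(g)
        return traj, list(reversed(returns))      # steps with returns-to-go
```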
- Prompt-Tuning Decision Transformer with Preference Ranking [83.76329715043205] (2023-05-16)
We propose the Prompt-Tuning DT algorithm, which uses trajectory segments as prompts to guide RL agents in acquiring environmental information.
Our approach randomly samples from a Gaussian distribution to fine-tune the elements of the prompt trajectory and uses a preference ranking function to find the optimization direction.
Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.
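The Gaussian-perturbation loop in the summary might be sketched as below; `score_fn` stands in for the preference ranking function, and the step sizes and candidate count are assumptions.

```python
# Hypothetical prompt-tuning loop: sample Gaussian perturbations of the
# prompt trajectory, rank them with a preference score, keep the best.
import numpy as np

def tune_prompt(prompt, score_fn, sigma=0.1, candidates=16, steps=50):
    prompt = np.asarray(prompt, dtype=float)
    for _ in range(steps):
        noise = np.random.normal(0.0, sigma, size=(candidates,) + prompt.shape)
        pool = prompt + noise                     # perturbed prompt candidates
        scores = [score_fn(p) for p in pool]      # preference ranking (assumed)
        prompt = pool[int(np.argmax(scores))]     # move to the best candidate
    return prompt
```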
- DataLight: Offline Data-Driven Traffic Signal Control [9.393196900855648] (2023-03-20)
Reinforcement learning (RL) has emerged as a promising solution for addressing traffic signal control (TSC) challenges.
This study introduces an innovative offline data-driven approach, called DataLight.
DataLight captures vehicular speed information to construct effective state representations and a corresponding reward function.
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569] (2023-02-06)
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a 2.5x improvement over existing approaches.
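One of the simple design choices this line of work examines is how to mix offline and online data in each batch; a minimal sketch, assuming an even 50/50 split (a common choice, not necessarily the paper's recommendation):

```python
# Minimal sketch: draw each training batch half from the offline dataset and
# half from the online replay buffer, then feed it to any off-policy update.
import random

def mixed_batch(offline_data, online_buffer, batch_size=256):
    half = batch_size // 2
    batch = random.sample(offline_data, min(half, len(offline_data)))
    batch += random.sample(online_buffer, min(half, len(online_buffer)))
    return batch
```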
- Efficient Pressure: Improving efficiency for signalized intersections [24.917612761503996] (2021-12-04)
Reinforcement learning (RL) has attracted growing attention as a way to solve the traffic signal control (TSC) problem.
Existing RL-based methods are rarely deployed because they are neither cost-effective in terms of computing resources nor more robust than traditional approaches.
We demonstrate how to construct an RL-based adaptive controller for TSC with less training and reduced complexity.
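For context, the pressure signal this family of TSC methods builds on can be sketched as follows: the pressure of a phase is the queue imbalance between its incoming and outgoing lanes, and a max-pressure-style controller serves the phase with the highest pressure. The data structures are illustrative.

```python
# Sketch of max-pressure-style control: serve the phase whose movements have
# the largest incoming-minus-outgoing queue imbalance.
def phase_pressure(movements, queue):
    """movements: [(in_lane, out_lane), ...]; queue: lane -> vehicle count."""
    return sum(queue[i] - queue[o] for i, o in movements)

def choose_phase(phases, queue):
    return max(phases, key=lambda name: phase_pressure(phases[name], queue))

# Toy intersection with two phases:
queues = {"N_in": 12, "S_out": 3, "E_in": 5, "W_out": 4}
phases = {"NS": [("N_in", "S_out")], "EW": [("E_in", "W_out")]}
print(choose_phase(phases, queues))  # -> "NS" (pressure 9 vs. 1)
```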
- ModelLight: Model-Based Meta-Reinforcement Learning for Traffic Signal Control [5.219291917441908] (2021-11-15)
This paper proposes a novel model-based meta-reinforcement learning framework (ModelLight) for traffic signal control.
Within ModelLight, an ensemble of road-intersection models and an optimization-based meta-learning method are used to improve the data efficiency of RL-based traffic light control.
Experiments on real-world datasets demonstrate that ModelLight can outperform state-of-the-art traffic light control algorithms.
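The ensemble component might be used roughly as sketched below, with a random member queried at each model-based rollout step; the `predict` interface is hypothetical and the meta-learning outer loop is omitted.

```python
# Hypothetical rollout step with an ensemble of learned intersection models:
# querying a random member per step is a common way to keep rollouts diverse.
import random

def ensemble_rollout_step(models, state, action):
    model = random.choice(models)            # pick one ensemble member
    return model.predict(state, action)      # assumed (next_state, reward) API
```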
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547] (2020-06-16)
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
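AWAC's actor update is advantage-weighted regression; a minimal sketch follows, with the critic update omitted and the `log_prob`, `q_fn`, and `value_fn` interfaces assumed.

```python
# Sketch of the AWAC-style actor loss: imitate dataset/replay actions with
# weights exp(advantage / lambda), so high-advantage actions dominate.
import torch

def awac_actor_loss(policy, q_fn, value_fn, states, actions, lam=1.0):
    advantage = q_fn(states, actions) - value_fn(states)      # A(s, a)
    weights = torch.exp(advantage.detach() / lam).clamp(max=100.0)
    log_prob = policy.log_prob(states, actions)               # assumed API
    return -(weights * log_prob).mean()
```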