Traffic Signal Control Using Lightweight Transformers: An
Offline-to-Online RL Approach
- URL: http://arxiv.org/abs/2312.07795v1
- Date: Tue, 12 Dec 2023 23:21:57 GMT
- Title: Traffic Signal Control Using Lightweight Transformers: An
Offline-to-Online RL Approach
- Authors: Xingshuai Huang, Di Wu, and Benoit Boulet
- Abstract summary: We propose DTLight, a lightweight Decision Transformer-based TSC method that can learn policy from easily accessible offline datasets.
DTLight pre-trained purely on offline datasets can outperform state-of-the-art online RL-based methods in most scenarios.
Experiment results also show that online fine-tuning further improves the performance of DTLight by up to 42.6% over the best online RL baseline methods.
- Score: 6.907105812732423
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient traffic signal control is critical for reducing traffic congestion
and improving overall transportation efficiency. The dynamic nature of traffic
flow has prompted researchers to explore Reinforcement Learning (RL) for
traffic signal control (TSC). Compared with traditional methods, RL-based
solutions have shown preferable performance. However, the application of
RL-based traffic signal controllers in the real world is limited by the low
sample efficiency and high computational requirements of these solutions. In
this work, we propose DTLight, a simple yet powerful lightweight Decision
Transformer-based TSC method that can learn policy from easily accessible
offline datasets. DTLight novelly leverages knowledge distillation to learn a
lightweight controller from a well-trained larger teacher model to reduce
implementation computation. Additionally, it integrates adapter modules to
mitigate the expenses associated with fine-tuning, which makes DTLight
practical for online adaptation with minimal computation and only a few
fine-tuning steps during real deployment. Moreover, DTLight is further enhanced
to be more applicable to real-world TSC problems. Extensive experiments on
synthetic and real-world scenarios show that DTLight pre-trained purely on
offline datasets can outperform state-of-the-art online RL-based methods in
most scenarios. Experiment results also show that online fine-tuning further
improves the performance of DTLight by up to 42.6% over the best online RL
baseline methods. In this work, we also introduce Datasets specifically
designed for TSC with offline RL (referred to as DTRL). Our datasets and code
are publicly available.
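As a concrete illustration of the two mechanisms the abstract describes, the sketch below pairs a standard knowledge-distillation loss with a bottleneck adapter module. It is a minimal sketch under assumed interfaces: module names, shapes, and hyperparameters are illustrative, not taken from the DTLight implementation.

```python
# Minimal sketch of the two ideas in the abstract: (1) distill a large
# teacher Decision Transformer into a small student controller, and
# (2) fine-tune online through small adapter modules while the backbone
# stays frozen. All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, target_actions,
                      temperature=2.0, alpha=0.5):
    """Blend soft teacher targets with hard logged signal-phase actions."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(student_logits, target_actions)
    return alpha * soft + (1 - alpha) * hard

class Adapter(torch.nn.Module):
    """Bottleneck adapter: only these weights are updated during online
    fine-tuning, which keeps adaptation cheap."""
    def __init__(self, dim, bottleneck=32):
        super().__init__()
        self.down = torch.nn.Linear(dim, bottleneck)
        self.up = torch.nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual bottleneck
```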
Related papers
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861] (2024-08-15)
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
- Offline Trajectory Generalization for Offline Reinforcement Learning [43.89740983387144] (2024-04-16)
Offline reinforcement learning (RL) aims to learn policies from static datasets of previously collected trajectories.
We propose offline trajectory generalization through world transformers for offline reinforcement learning (OTTO).
OTTO serves as a plug-in module and can be integrated with existing offline RL methods, enhancing them with the generalization capability of transformers and high-reward data augmentation.
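A rough sketch of the augmentation idea summarized above, under heavy assumptions: `world_model` stands in for the paper's world transformer, its `predict` interface is hypothetical, and keeping only rollouts whose return clears a threshold is one plausible reading of "high-reward data augmentation".

```python
# Hypothetical sketch: roll out a learned world model from logged states and
# keep only high-reward synthetic trajectories as extra offline data.
def augment_dataset(dataset, world_model, policy, horizon=5, reward_floor=0.0):
    synthetic = []
    for traj in dataset:                    # each traj: [{"state": ...}, ...]
        state = traj[0]["state"]            # start a rollout from a logged state
        rollout, total_reward = [], 0.0
        for _ in range(horizon):
            action = policy(state)
            state, reward = world_model.predict(state, action)  # assumed API
            rollout.append({"state": state, "action": action, "reward": reward})
            total_reward += reward
        if total_reward > reward_floor:     # keep only high-reward rollouts
            synthetic.append(rollout)
    return dataset + synthetic
```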
- Solving Continual Offline Reinforcement Learning with Decision Transformer [78.59473797783673] (2024-01-16)
Continual offline reinforcement learning (CORL) combines continual and offline reinforcement learning.
Existing methods, which employ Actor-Critic structures and experience replay (ER), suffer from distribution shift, low efficiency, and weak knowledge sharing.
We introduce multi-head DT (MH-DT) and low-rank adaptation DT (LoRA-DT) to mitigate DT's forgetting problem.
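For the LoRA-DT variant, the low-rank adaptation idea can be sketched as follows; the rank, scaling, and wrapping strategy are generic LoRA conventions, assumed rather than taken from the paper.

```python
# Generic LoRA sketch: freeze a pretrained linear layer and learn a low-rank
# update B @ A on top, so continual fine-tuning touches few parameters.
import torch

class LoRALinear(torch.nn.Module):
    def __init__(self, base: torch.nn.Linear, rank=8, alpha=16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)               # freeze pretrained weights
        self.A = torch.nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank                 # B starts at zero, so the
                                                  # wrapped layer is unchanged

    def forward(self, x):
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)
```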
- A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning [18.2541182874636] (2023-11-27)
We propose D2TSC, a fully data-driven and simulator-free framework for realistic Traffic Signal Control.
We combine well-established traffic flow theory with machine learning to infer the reward signals from coarse-grained traffic data.
Our approach achieves superior performance over conventional and offline RL baselines, and also enjoys much better real-world applicability.
- Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning [52.49786369812919] (2023-06-27)
We propose a memory technique, (Prioritized) Trajectory Replay (TR/PTR), which extends the sampling perspective to trajectories.
TR enhances learning efficiency by backward sampling of trajectories that optimize the use of subsequent state information.
We demonstrate the benefits of integrating TR and PTR with existing offline RL algorithms on D4RL.
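The trajectory-level sampling described above might look like the sketch below: trajectories are drawn with probability proportional to their return (one plausible priority; not necessarily the paper's), and returns are accumulated backward so each step sees its subsequent reward information.

```python
# Illustrative trajectory replay: prioritized sampling of whole trajectories
# plus a backward pass that computes return-to-go for every step.
import random

class TrajectoryReplay:
    def __init__(self, trajectories):
        self.trajectories = trajectories          # each: [(s, a, r), ...]
        # One plausible priority: the trajectory's total return.
        self.priorities = [max(sum(r for _, _, r in t), 1e-6)
                           for t in trajectories]

    def sample(self):
        traj = random.choices(self.trajectories, weights=self.priorities)[0]
        returns, g = [], 0.0
        for _, _, r in reversed(traj):            # backward over the trajectory
            g += r
            returns.append(g)
        return traj, list(reversed(returns))      # steps with returns-to-go
```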
- Prompt-Tuning Decision Transformer with Preference Ranking [83.76329715043205] (2023-05-16)
We propose the Prompt-Tuning DT algorithm, which uses trajectory segments as prompts to guide RL agents in acquiring environmental information.
Our approach randomly samples from a Gaussian distribution to fine-tune the elements of the prompt trajectory and uses a preference ranking function to find the optimization direction.
Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.
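The Gaussian-perturbation loop in the summary might be sketched as below; `score_fn` stands in for the preference ranking function, and the step sizes and candidate count are assumptions.

```python
# Hypothetical prompt-tuning loop: sample Gaussian perturbations of the
# prompt trajectory, rank them with a preference score, keep the best.
import numpy as np

def tune_prompt(prompt, score_fn, sigma=0.1, candidates=16, steps=50):
    prompt = np.asarray(prompt, dtype=float)
    for _ in range(steps):
        noise = np.random.normal(0.0, sigma, size=(candidates,) + prompt.shape)
        pool = prompt + noise                     # perturbed prompt candidates
        scores = [score_fn(p) for p in pool]      # preference ranking (assumed)
        prompt = pool[int(np.argmax(scores))]     # move to the best candidate
    return prompt
```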
- DataLight: Offline Data-Driven Traffic Signal Control [9.393196900855648] (2023-03-20)
Reinforcement learning (RL) has emerged as a promising solution for addressing traffic signal control (TSC) challenges.
This study introduces an innovative offline data-driven approach, called DataLight.
DataLight captures vehicular speed information to construct effective state representations and a corresponding reward function.
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569] (2023-02-06)
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a 2.5x improvement over existing approaches.
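One of the simple design choices this line of work examines is how to mix offline and online data in each batch; a minimal sketch, assuming an even 50/50 split (a common choice, not necessarily the paper's recommendation):

```python
# Minimal sketch: draw each training batch half from the offline dataset and
# half from the online replay buffer, then feed it to any off-policy update.
import random

def mixed_batch(offline_data, online_buffer, batch_size=256):
    half = batch_size // 2
    batch = random.sample(offline_data, min(half, len(offline_data)))
    batch += random.sample(online_buffer, min(half, len(online_buffer)))
    return batch
```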
- Efficient Pressure: Improving efficiency for signalized intersections [24.917612761503996] (2021-12-04)
Reinforcement learning (RL) has attracted growing attention as a way to solve the traffic signal control (TSC) problem.
Existing RL-based methods are rarely deployed because they are neither cost-effective in terms of computing resources nor more robust than traditional approaches.
We demonstrate how to construct an RL-based adaptive controller for TSC with less training and reduced complexity.
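For context, the pressure signal this family of TSC methods builds on can be sketched as follows: the pressure of a phase is the queue imbalance between its incoming and outgoing lanes, and a max-pressure-style controller serves the phase with the highest pressure. The data structures are illustrative.

```python
# Sketch of max-pressure-style control: serve the phase whose movements have
# the largest incoming-minus-outgoing queue imbalance.
def phase_pressure(movements, queue):
    """movements: [(in_lane, out_lane), ...]; queue: lane -> vehicle count."""
    return sum(queue[i] - queue[o] for i, o in movements)

def choose_phase(phases, queue):
    return max(phases, key=lambda name: phase_pressure(phases[name], queue))

# Toy intersection with two phases:
queues = {"N_in": 12, "S_out": 3, "E_in": 5, "W_out": 4}
phases = {"NS": [("N_in", "S_out")], "EW": [("E_in", "W_out")]}
print(choose_phase(phases, queues))  # -> "NS" (pressure 9 vs. 1)
```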
- ModelLight: Model-Based Meta-Reinforcement Learning for Traffic Signal Control [5.219291917441908] (2021-11-15)
This paper proposes a novel model-based meta-reinforcement learning framework (ModelLight) for traffic signal control.
Within ModelLight, an ensemble of road-intersection models and an optimization-based meta-learning method are used to improve the data efficiency of RL-based traffic light control.
Experiments on real-world datasets demonstrate that ModelLight can outperform state-of-the-art traffic light control algorithms.
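The ensemble component might be used roughly as sketched below, with a random member queried at each model-based rollout step; the `predict` interface is hypothetical and the meta-learning outer loop is omitted.

```python
# Hypothetical rollout step with an ensemble of learned intersection models:
# querying a random member per step is a common way to keep rollouts diverse.
import random

def ensemble_rollout_step(models, state, action):
    model = random.choice(models)            # pick one ensemble member
    return model.predict(state, action)      # assumed (next_state, reward) API
```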
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547] (2020-06-16)
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
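AWAC's actor update is advantage-weighted regression; a minimal sketch follows, with the critic update omitted and the `log_prob`, `q_fn`, and `value_fn` interfaces assumed.

```python
# Sketch of the AWAC-style actor loss: imitate dataset/replay actions with
# weights exp(advantage / lambda), so high-advantage actions dominate.
import torch

def awac_actor_loss(policy, q_fn, value_fn, states, actions, lam=1.0):
    advantage = q_fn(states, actions) - value_fn(states)      # A(s, a)
    weights = torch.exp(advantage.detach() / lam).clamp(max=100.0)
    log_prob = policy.log_prob(states, actions)               # assumed API
    return -(weights * log_prob).mean()
```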