Real-time system optimal traffic routing under uncertainties -- Can physics models boost reinforcement learning?
- URL: http://arxiv.org/abs/2407.07364v1
- Date: Wed, 10 Jul 2024 04:53:26 GMT
- Title: Real-time system optimal traffic routing under uncertainties -- Can physics models boost reinforcement learning?
- Authors: Zemian Ke, Qiling Zou, Jiachao Liu, Sean Qian
- Abstract summary: Our paper presents TransRL, a novel algorithm that integrates reinforcement learning with physics models for enhanced performance, reliability, and interpretability.
By leveraging the information from physics models, TransRL consistently outperforms state-of-the-art reinforcement learning algorithms.
- Score: 2.298129181817085
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: System optimal traffic routing can mitigate congestion by assigning routes for a portion of vehicles so that the total travel time of all vehicles in the transportation system can be reduced. However, achieving real-time optimal routing poses challenges due to uncertain demands and unknown system dynamics, particularly in expansive transportation networks. While physics model-based methods are sensitive to uncertainties and model mismatches, model-free reinforcement learning struggles with learning inefficiencies and interpretability issues. Our paper presents TransRL, a novel algorithm that integrates reinforcement learning with physics models for enhanced performance, reliability, and interpretability. TransRL begins by establishing a deterministic policy grounded in physics models, from which a differentiable and stochastic teacher policy is derived to guide learning. During training, TransRL aims to maximize cumulative rewards while minimizing the Kullback-Leibler (KL) divergence between the current policy and the teacher policy. This approach enables TransRL to simultaneously leverage interactions with the environment and insights from physics models. We conduct experiments on three transportation networks with up to hundreds of links. The results demonstrate TransRL's superiority over traffic-model-based methods, as it adapts to and learns from the actual network data. By leveraging the information from physics models, TransRL consistently outperforms state-of-the-art reinforcement learning algorithms such as proximal policy optimization (PPO) and soft actor-critic (SAC). Moreover, TransRL's actions exhibit higher reliability and interpretability compared to baseline reinforcement learning approaches like PPO and SAC.
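The training objective lends itself to a short sketch: a policy-gradient term for cumulative reward plus a KL penalty toward the physics-based teacher. Below is a minimal illustration assuming Gaussian policies and precomputed advantages; the function name, surrogate form, and kl_weight are assumptions, not the paper's implementation.

```python
import torch
import torch.distributions as D

def transrl_style_loss(policy_dist: D.Normal,
                       teacher_dist: D.Normal,
                       actions: torch.Tensor,
                       advantages: torch.Tensor,
                       kl_weight: float = 0.1) -> torch.Tensor:
    # Policy-gradient surrogate: push up the log-probability of
    # high-advantage actions (the reward-maximization term).
    pg_loss = -(policy_dist.log_prob(actions).sum(-1) * advantages).mean()
    # Stay close to the differentiable, stochastic teacher policy
    # derived from the physics (traffic assignment) model.
    kl_loss = D.kl_divergence(policy_dist, teacher_dist).sum(-1).mean()
    return pg_loss + kl_weight * kl_loss
```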
Related papers
- Physics Enhanced Residual Policy Learning (PERPL) for safety cruising in mixed traffic platooning under actuator and communication delay [8.172286651098027]
Linear control models have gained extensive application in vehicle control due to their simplicity, ease of use, and support for stability analysis.
Reinforcement learning (RL) models, on the other hand, offer adaptability but suffer from a lack of interpretability and generalization capabilities.
This paper aims to develop a family of RL-based controllers enhanced by physics-informed policies.
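To illustrate the physics-enhanced residual pattern, a rough sketch: a linear feedback law provides the analyzable baseline, and an RL policy contributes a bounded correction. The gains, clipping range, and residual_net below are hypothetical placeholders, not the paper's controller.

```python
import numpy as np

def physics_enhanced_control(spacing_err: float, speed_err: float,
                             obs: np.ndarray, residual_net,
                             k_s: float = 0.5, k_v: float = 1.0) -> float:
    # Linear feedback baseline: simple and amenable to stability analysis.
    u_linear = k_s * spacing_err + k_v * speed_err
    # Learned residual (hypothetical policy network) adapts to effects the
    # linear model misses, e.g. actuator and communication delay.
    u_residual = float(residual_net(obs))
    # Bound the correction so the physics baseline dominates.
    return u_linear + float(np.clip(u_residual, -1.0, 1.0))
```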
arXiv Detail & Related papers (2024-09-23T23:02:34Z) - Traffic expertise meets residual RL: Knowledge-informed model-based residual reinforcement learning for CAV trajectory control [1.5361702135159845]
This paper introduces a knowledge-informed model-based residual reinforcement learning framework.
It integrates traffic expert knowledge into a virtual environment model, employing the Intelligent Driver Model (IDM) for basic dynamics and neural networks for residual dynamics.
We propose a novel strategy that combines traditional control methods with residual RL, facilitating efficient learning and policy optimization without the need to learn from scratch.
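As a concrete reading of this design, here is a minimal sketch of a virtual environment step that uses the standard IDM acceleration for the nominal dynamics and a placeholder residual_net for the learned correction; the state layout and constants are illustrative, not the paper's code.

```python
import numpy as np

def idm_accel(v, v_lead, gap, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0):
    # Standard Intelligent Driver Model: free-flow term minus interaction term.
    s_star = s0 + v * T + v * (v - v_lead) / (2.0 * np.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** 4 - (s_star / max(gap, 0.1)) ** 2)

def virtual_env_step(state, action, residual_net, dt=0.1):
    # state = (ego speed, leader speed, gap); action = CAV acceleration.
    v, v_lead, gap = state
    v_next = v + action * dt  # controlled vehicle follows the commanded input
    # IDM predicts the human-driven leader's response to ITS leader; a
    # constant-speed vehicle far ahead is assumed here as a stand-in.
    v_lead_next = v_lead + idm_accel(v_lead, v_lead, 100.0) * dt
    gap_next = gap + (v_lead_next - v_next) * dt
    nominal = np.array([v_next, v_lead_next, gap_next])
    # residual_net (placeholder) corrects what the IDM nominal model misses.
    return nominal + residual_net(state, action)
```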
arXiv Detail & Related papers (2024-08-30T16:16:57Z) - Continual Learning for Adaptable Car-Following in Dynamic Traffic Environments [16.587883982785]
The continual evolution of autonomous driving technology requires car-following models that can adapt to diverse and dynamic traffic environments.
Traditional learning-based models often suffer from performance degradation when encountering unseen traffic patterns due to a lack of continual learning capabilities.
This paper proposes a novel car-following model based on continual learning that addresses this limitation.
arXiv Detail & Related papers (2024-07-17T06:32:52Z) - Reinforcement Learning with Human Feedback for Realistic Traffic Simulation [53.85002640149283]
A key element of effective simulation is the incorporation of realistic traffic models that align with human knowledge.
This study identifies two main challenges: capturing the nuances of human preferences on realism and unifying diverse traffic simulation models.
arXiv Detail & Related papers (2023-09-01T19:29:53Z) - Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with Prompt Learning [4.195122359359966]
Large Language Models (LLMs) are trained on massive corpora of knowledge and have proved to possess astonishing inference abilities.
In this work, we leverage LLMs to understand and profile the system dynamics via a prompt-based grounded action transformation.
arXiv Detail & Related papers (2023-08-28T03:49:13Z) - Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z) - Physics-Inspired Temporal Learning of Quadrotor Dynamics for Accurate Model Predictive Trajectory Tracking [76.27433308688592]
Accurately modeling a quadrotor's system dynamics is critical for guaranteeing agile, safe, and stable navigation.
We present a novel Physics-Inspired Temporal Convolutional Network (PI-TCN) approach to learning a quadrotor's system dynamics purely from robot experience.
Our approach combines the expressive power of sparse temporal convolutions and dense feed-forward connections to make accurate system predictions.
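The sketch below shows the general pattern of dilated (sparse) temporal convolutions feeding a dense prediction head; layer sizes, receptive field, and the choice to predict a state delta are assumptions for illustration, not the PI-TCN architecture itself.

```python
import torch
import torch.nn as nn

class TemporalDynamicsModel(nn.Module):
    """Dilated 1-D convolutions summarize a short state/input history;
    dense layers map the summary to a predicted state change."""
    def __init__(self, in_dim: int, state_dim: int, hidden: int = 64):
        super().__init__()
        # Dilations 1 and 2 give a receptive field of 7 time steps.
        self.tcn = nn.Sequential(
            nn.Conv1d(in_dim, hidden, kernel_size=3, dilation=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, dilation=2), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, state_dim),
        )

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, in_dim, time), with time >= 7.
        feats = self.tcn(history)[:, :, -1]   # features at the newest step
        return self.head(feats)               # predicted one-step state delta
```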
arXiv Detail & Related papers (2022-06-07T13:51:35Z) - Objective-aware Traffic Simulation via Inverse Reinforcement Learning [31.26257563160961]
We formulate traffic simulation as an inverse reinforcement learning problem.
We propose a parameter-sharing adversarial inverse reinforcement learning model for dynamics-robust simulation learning.
Our proposed model is able to imitate a vehicle's trajectories in the real world while simultaneously recovering the reward function.
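For context, adversarial IRL in the AIRL style recovers a reward from the discriminator D = exp(f) / (exp(f) + pi): the learned reward log D - log(1 - D) reduces to f - log pi. The sketch below assumes per-sample tensors of f values and policy log-probabilities; the summary does not state that this paper uses exactly this parameterization.

```python
import torch

def airl_style_reward(f_value: torch.Tensor, log_pi: torch.Tensor) -> torch.Tensor:
    # Discriminator in logit form: D = exp(f) / (exp(f) + pi(a|s)).
    log_d = f_value - torch.logaddexp(f_value, log_pi)
    log_one_minus_d = log_pi - torch.logaddexp(f_value, log_pi)
    # Recovered reward log D - log(1 - D); algebraically equal to f - log_pi.
    return log_d - log_one_minus_d
```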
arXiv Detail & Related papers (2021-05-20T07:26:34Z) - Vehicular Cooperative Perception Through Action Branching and Federated Reinforcement Learning [101.64598586454571]
A novel framework is proposed to allow reinforcement learning-based vehicular association, resource block (RB) allocation, and content selection of cooperative perception messages (CPMs).
A federated RL approach is introduced in order to speed up the training process across vehicles.
Results show that federated RL improves the training process, where better policies can be achieved within the same amount of time compared to the non-federated approach.
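A minimal FedAvg-style aggregation illustrates the federated ingredient: each vehicle trains locally and a server averages the parameters. Equal client weighting and a plain state-dict average are assumptions, not necessarily the paper's aggregation rule.

```python
import torch

def federated_average(local_state_dicts: list) -> dict:
    # Average each parameter tensor across clients (vehicles), equally weighted.
    return {
        key: torch.mean(
            torch.stack([sd[key].float() for sd in local_state_dicts]), dim=0)
        for key in local_state_dicts[0]
    }
```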
arXiv Detail & Related papers (2020-12-07T02:09:15Z) - Model-Based Meta-Reinforcement Learning for Flight with Suspended Payloads [69.21503033239985]
Transporting suspended payloads is challenging for autonomous aerial vehicles.
We propose a meta-learning approach that "learns how to learn" models of altered dynamics within seconds of post-connection flight data.
arXiv Detail & Related papers (2020-04-23T17:43:56Z) - Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
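The connection is visible in the information-theoretic MPC (MPPI-style) update, where sampled control perturbations are weighted by exp(-cost / lambda), the same soft weighting that entropy-regularized RL applies to Q-values; the shapes and temperature below are illustrative.

```python
import numpy as np

def mppi_update(costs: np.ndarray, noise: np.ndarray, lam: float = 1.0) -> np.ndarray:
    # costs: (K,) rollout costs; noise: (K, horizon, act_dim) perturbations.
    # Softmax weighting over rollouts; subtracting the min is for stability.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    # Exponentially weighted average of perturbations = control correction.
    return np.tensordot(weights, noise, axes=1)
```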
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.