Related papers: Offline Reinforcement Learning for Road Traffic Control

Offline Reinforcement Learning for Road Traffic Control

URL: http://arxiv.org/abs/2201.02381v1
Date: Fri, 7 Jan 2022 09:55:21 GMT
Title: Offline Reinforcement Learning for Road Traffic Control
Authors: Mayuresh Kunjir and Sanjay Chawla
Abstract summary: We build a model-based learning framework, A-DAC, which infers a Markov Decision Process (MDP) from dataset with pessimistic costs built in to deal with data uncertainties. A-DAC is evaluated on a complex signalized roundabout using multiple datasets varying in size and in batch collection policy.
Score: 12.251816544079306
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Traffic signal control is an important problem in urban mobility with a significant potential of economic and environmental impact. While there is a growing interest in Reinforcement Learning (RL) for traffic control, the work so far has focussed on learning through interactions which, in practice, is costly. Instead, real experience data on traffic is available and could be exploited at minimal costs. Recent progress in offline or batch RL has enabled just that. Model-based offline RL methods, in particular, have been shown to generalize to the experience data much better than others. We build a model-based learning framework, A-DAC, which infers a Markov Decision Process (MDP) from dataset with pessimistic costs built in to deal with data uncertainties. The costs are modeled through an adaptive shaping of rewards in the MDP which provides better regularization of data compared to the prior related work. A-DAC is evaluated on a complex signalized roundabout using multiple datasets varying in size and in batch collection policy. The evaluation results show that it is possible to build high performance control policies in a data efficient manner using simplistic batch collection policies.

Related papers

OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control [1.2540429019617183]
We introduce OffLight, a novel offline MARL framework designed to handle heterogeneous behavior policies in TSC datasets. OffLight incorporates Importance Sampling (IS) to correct for distributional shifts and Return-Based Prioritized Sampling (RBPS) to focus on high-quality experiences. Experiments show OffLight outperforms existing offline RL methods, achieving up to a 7.8% reduction in average travel time and 11.2% decrease in queue length.
arXiv Detail & Related papers (2024-11-10T21:26:17Z)
OffRIPP: Offline RL-based Informative Path Planning [12.705099730591671]
IPP is a crucial task in robotics, where agents must design paths to gather valuable information about a target environment. We propose an offline RL-based IPP framework that optimized information gain without requiring real-time interaction during training. We validate the framework through extensive simulations and real-world experiments.
arXiv Detail & Related papers (2024-09-25T11:30:59Z)
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments. Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning [18.2541182874636]
We propose a fully Data-Driven and simulator-free framework for realistic Traffic Signal Control (D2TSC) We combine well-established traffic flow theory with machine learning to infer the reward signals from coarse-grained traffic data. Our approach achieves superior performance over conventional and offline RL baselines, and also enjoys much better real-world applicability.
arXiv Detail & Related papers (2023-11-27T15:29:21Z)
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning [7.462336024223669]
We propose a model-based data augmentation strategy, Trajectory Stitching (TS), to improve the quality of sub-optimal historical trajectories. TS introduces unseen actions joining previously disconnected states. We show that using this data augmentation strategy jointly with behavioural cloning (BC) leads to improvements over the behaviour-cloned policy.
arXiv Detail & Related papers (2022-11-21T16:00:39Z)
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning [119.85598717477016]
We argue that a natural use case of offline RL is in settings where we can pool large amounts of data collected in various scenarios for solving different tasks. We develop a simple technique for data-sharing in multi-task offline RL that routes data based on the improvement over the task-specific data.
arXiv Detail & Related papers (2021-09-16T17:34:06Z)
Critic Regularized Regression [70.8487887738354]
We propose a novel offline RL algorithm to learn policies from data using a form of critic-regularized regression (CRR) We find that CRR performs surprisingly well and scales to tasks with high-dimensional state and action spaces.
arXiv Detail & Related papers (2020-06-26T17:50:26Z)
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting. We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience. Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL. By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.