Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2302.14833v2
- Date: Fri, 25 Aug 2023 14:28:20 GMT
- Title: Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning
- Authors: Carolin Schmidt, Daniele Gammelli, Francisco Camara Pereira, Filipe Rodrigues
- Abstract summary: We propose to formalize the control of Autonomous Mobility-on-Demand systems through the lens of offline reinforcement learning.
We show that offline RL is a promising paradigm for the application of RL-based solutions within economically-critical systems.
- Score: 3.9121134770873733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous Mobility-on-Demand (AMoD) systems are an evolving mode of
transportation in which a centrally coordinated fleet of self-driving vehicles
dynamically serves travel requests. The control of these systems is typically
formulated as a large network optimization problem, and reinforcement learning
(RL) has recently emerged as a promising approach to solve the open challenges
in this space. Recent centralized RL approaches focus on learning from online
data, ignoring the per-sample cost of interactions within real-world
transportation systems. To address these limitations, we propose to formalize
the control of AMoD systems through the lens of offline reinforcement learning
and learn effective control strategies using solely offline data, which is
readily available to current mobility operators. We further investigate design
decisions and provide empirical evidence based on data from real-world mobility
systems showing how offline learning makes it possible to recover AMoD control policies
that (i) exhibit performance on par with online methods, (ii) allow for
sample-efficient online fine-tuning and (iii) eliminate the need for complex
simulation environments. Crucially, this paper demonstrates that offline RL is
a promising paradigm for the application of RL-based solutions within
economically-critical systems, such as mobility systems.
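
As a concrete illustration of the setting, the sketch below shows one behavior-regularized offline actor update in the style of TD3+BC applied to a batch of logged fleet decisions. The algorithm choice, all dimensions, and the random stand-in batch are assumptions for illustration, not the paper's method; critic training on the logged data is omitted.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 32, 8  # hypothetical: aggregated zone features / rebalancing flows

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)

def actor_update(states, logged_actions, bc_weight=2.5):
    """One offline actor step: maximize Q while staying close to logged actions."""
    pi = actor(states)
    q = critic(torch.cat([states, pi], dim=-1))
    lam = bc_weight / q.abs().mean().detach()  # scale the Q term, as in TD3+BC
    loss = -lam * q.mean() + nn.functional.mse_loss(pi, logged_actions)
    actor_opt.zero_grad()
    loss.backward()
    actor_opt.step()
    return loss.item()

# Random stand-in for a batch of an operator's logged transitions:
actor_update(torch.randn(128, STATE_DIM), torch.tanh(torch.randn(128, ACTION_DIM)))
```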
Related papers
- MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning [52.101643259906915]
We study the problem of offline pre-training and online fine-tuning for reinforcement learning from high-dimensional observations.
Existing model-based offline RL methods are not suitable for offline-to-online fine-tuning in high-dimensional domains.
We propose an on-policy model-based method that can efficiently reuse prior data through model-based value expansion and policy regularization.
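
For intuition, here is a minimal sketch of model-based value expansion: unroll a learned model for H steps, accumulate predicted rewards, and bootstrap with the critic at the horizon. The model, reward, policy, and critic below are toy stand-ins, not MOTO's implementation.

```python
import torch

def mve_target(model, reward_fn, policy, critic, state, horizon=5, gamma=0.99):
    """H-step imagined return: model rewards plus a discounted critic bootstrap."""
    total, discount, s = torch.zeros(state.shape[0]), 1.0, state
    for _ in range(horizon):
        a = policy(s)
        total = total + discount * reward_fn(s, a)  # predicted reward for (s, a)
        s = model(s, a)                             # predicted next state
        discount *= gamma
    return (total + discount * critic(s, policy(s))).detach()

# Toy stand-ins for the learned model, reward, policy, and critic:
model = lambda s, a: s + 0.1 * a
reward = lambda s, a: -(s ** 2).sum(-1)
policy = lambda s: -0.5 * s
critic = lambda s, a: -(s ** 2).sum(-1)
target = mve_target(model, reward, policy, critic, torch.randn(16, 3))
```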
arXiv Detail & Related papers (2024-01-06T21:04:31Z)
- A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning [18.2541182874636]
We propose a fully Data-Driven and simulator-free framework for realistic Traffic Signal Control (D2TSC).
We combine well-established traffic flow theory with machine learning to infer the reward signals from coarse-grained traffic data.
Our approach achieves superior performance over conventional and offline RL baselines, and also enjoys much better real-world applicability.
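
As a toy illustration of inferring rewards from coarse-grained data, the sketch below derives a queue-based reward from flow conservation (cumulative inflow minus cumulative outflow at an approach). This particular reward model is an assumption for illustration, not necessarily D2TSC's.

```python
import numpy as np

def queue_reward(inflow_counts, outflow_counts):
    """Negative estimated queue per interval, via conservation of vehicles."""
    queue = np.cumsum(inflow_counts) - np.cumsum(outflow_counts)
    return -np.maximum(queue, 0.0)

# Hypothetical 5-minute vehicle counts at one intersection approach:
r = queue_reward(np.array([30, 42, 55, 38]), np.array([28, 35, 50, 45]))
```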
arXiv Detail & Related papers (2023-11-27T15:29:21Z)
- ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles [52.34951901588738]
We propose a novel framework called ENsemble-based Offline-To-Online (ENOTO) RL.
By increasing the number of Q-networks, we seamlessly bridge offline pre-training and online fine-tuning without degrading performance.
Experimental results demonstrate that ENOTO can substantially improve the training stability, learning efficiency, and final performance of existing offline RL methods.
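
A minimal sketch of the core ensemble idea, pessimism over N Q-networks, follows; the network sizes and the lower-confidence-bound alternative noted in the comment are illustrative assumptions rather than ENOTO's exact configuration.

```python
import torch
import torch.nn as nn

N_CRITICS, STATE_DIM, ACTION_DIM = 10, 17, 6
critics = nn.ModuleList(
    nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
    for _ in range(N_CRITICS))

def pessimistic_q(state, action):
    """Ensemble minimum; a softer alternative is mean(qs) - beta * std(qs)."""
    sa = torch.cat([state, action], dim=-1)
    qs = torch.stack([q(sa) for q in critics])  # (N_CRITICS, batch, 1)
    return qs.min(dim=0).values

q_target = pessimistic_q(torch.randn(32, STATE_DIM), torch.randn(32, ACTION_DIM))
```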
arXiv Detail & Related papers (2023-06-12T05:10:10Z)
- Efficient Learning of Voltage Control Strategies via Model-based Deep Reinforcement Learning [9.936452412191326]
This article proposes a model-based deep reinforcement learning (DRL) method to design emergency control strategies for short-term voltage stability problems in power systems.
Recent advances show promising results in model-free DRL-based methods for power systems, but model-free methods suffer from poor sample efficiency and long training times.
We propose a novel model-based DRL framework in which a deep neural network (DNN)-based dynamic surrogate model is combined with the policy-learning framework.
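
A minimal sketch of fitting such a surrogate from logged transitions follows; the dimensions and interface are stand-ins, not the paper's power-system model.

```python
import torch
import torch.nn as nn

STATE_DIM, CTRL_DIM = 10, 3  # hypothetical bus-voltage features / control inputs
surrogate = nn.Sequential(nn.Linear(STATE_DIM + CTRL_DIM, 128), nn.ReLU(),
                          nn.Linear(128, STATE_DIM))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

def fit_step(s, a, s_next):
    """One supervised step: predict the next state from (state, control)."""
    loss = nn.functional.mse_loss(surrogate(torch.cat([s, a], dim=-1)), s_next)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Once trained, the surrogate replaces the simulator for policy rollouts.
fit_step(torch.randn(64, STATE_DIM), torch.randn(64, CTRL_DIM), torch.randn(64, STATE_DIM))
```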
arXiv Detail & Related papers (2022-12-06T02:50:53Z)
- Unified Automatic Control of Vehicular Systems with Reinforcement Learning [64.63619662693068]
This article contributes a streamlined methodology for vehicular microsimulation.
It discovers high performance control strategies with minimal manual design.
The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering.
arXiv Detail & Related papers (2022-07-30T16:23:45Z)
- Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations [58.758928936316785]
Offline reinforcement learning from visual observations with continuous action spaces remains under-explored.
We show that modifications to two popular vision-based online reinforcement learning algorithms suffice to outperform existing offline RL methods.
arXiv Detail & Related papers (2022-06-09T22:08:47Z)
- UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning [1.1339580074756188]
Offline reinforcement learning (RL) provides a framework for learning decision-making from offline data.
Self-driving vehicles (SDVs) learn a policy that can potentially even outperform the behavior present in the sub-optimal dataset.
This motivates the use of model-based offline RL approaches, which leverage planning.
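
For intuition, here is a minimal sketch of planning with a learned model via random shooting, one common way model-based offline RL leverages planning; UMBRELLA's actual planner and uncertainty handling are not reproduced here.

```python
import torch

def plan(model, reward_fn, state, horizon=8, n_candidates=256, act_dim=2):
    """Sample action sequences, roll them out in the model, return the best first action."""
    seqs = torch.randn(n_candidates, horizon, act_dim).clamp(-1, 1)
    returns, s = torch.zeros(n_candidates), state.expand(n_candidates, -1)
    for t in range(horizon):
        returns = returns + reward_fn(s, seqs[:, t])
        s = model(s, seqs[:, t])
    return seqs[returns.argmax(), 0]

# Toy stand-ins for a learned dynamics model and reward:
model = lambda s, a: s + 0.1 * torch.tanh(a).sum(-1, keepdim=True)
reward = lambda s, a: -(s.squeeze(-1) ** 2)
best_first_action = plan(model, reward, torch.zeros(1, 1))
```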
arXiv Detail & Related papers (2021-11-22T10:37:52Z)
- Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information [5.604859261995801]
We propose a unified offline-to-online RL framework that resolves the performance drop that occurs when transitioning from offline to online training.
We introduce goal-aware state information to the RL agent, which can greatly reduce task complexity and accelerate policy learning.
Our framework achieves strong training efficiency and performance compared with state-of-the-art methods on multiple robotic manipulation tasks.
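
A minimal sketch of what goal-aware state construction can look like follows; the field shapes are illustrative assumptions.

```python
import torch

def goal_aware_obs(state, achieved_goal, desired_goal):
    """Append the goal and the remaining offset so the policy conditions on both."""
    return torch.cat([state, desired_goal, desired_goal - achieved_goal], dim=-1)

obs = goal_aware_obs(torch.randn(7), torch.randn(3), torch.randn(3))  # toy shapes
```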
arXiv Detail & Related papers (2021-10-21T05:34:25Z)
- A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization [14.455497228170646]
Inefficient traffic signal control methods may cause numerous problems, such as traffic congestion and waste of energy.
This paper proposes a multi-agent deep deterministic policy gradient (MADDPG) method that extends actor-critic policy gradient algorithms.
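
A minimal sketch of MADDPG's key structural idea follows: decentralized actors that each see only their own observation, plus a centralized critic over the joint observations and actions during training. Agent counts and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM = 4, 12, 2  # e.g., one agent per intersection
actors = [nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                        nn.Linear(64, ACT_DIM), nn.Tanh())
          for _ in range(N_AGENTS)]
central_critic = nn.Sequential(
    nn.Linear(N_AGENTS * (OBS_DIM + ACT_DIM), 128), nn.ReLU(), nn.Linear(128, 1))

def joint_q(all_obs, all_acts):
    """all_obs: (batch, N_AGENTS, OBS_DIM); all_acts: (batch, N_AGENTS, ACT_DIM)."""
    flat = torch.cat([all_obs.flatten(1), all_acts.flatten(1)], dim=-1)
    return central_critic(flat)

a0 = actors[0](torch.randn(OBS_DIM))  # decentralized execution
q = joint_q(torch.randn(32, N_AGENTS, OBS_DIM), torch.randn(32, N_AGENTS, ACT_DIM))
```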
arXiv Detail & Related papers (2021-07-13T14:11:04Z)
- Offline Reinforcement Learning from Images with Latent Space Models [60.69745540036375]
Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions.
We build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces.
Our approach is both tractable in practice and corresponds to maximizing a lower bound of the ELBO in the unknown POMDP.
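
For intuition, a minimal single-step ELBO sketch (reconstruction term plus KL term) follows; the paper's latent dynamics model additionally conditions on actions and past latents, so this simplification is an assumption.

```python
import torch
import torch.nn as nn

OBS_DIM, LATENT_DIM = 64, 8
enc = nn.Linear(OBS_DIM, 2 * LATENT_DIM)  # outputs mean and log-variance
dec = nn.Linear(LATENT_DIM, OBS_DIM)

def neg_elbo(obs):
    """Negative ELBO: reconstruction error plus KL(q(z|o) || N(0, I))."""
    mu, logvar = enc(obs).chunk(2, dim=-1)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
    recon = nn.functional.mse_loss(dec(z), obs, reduction="none").sum(-1)
    kl = 0.5 * (logvar.exp() + mu ** 2 - 1 - logvar).sum(-1)
    return (recon + kl).mean()

loss = neg_elbo(torch.randn(32, OBS_DIM))
```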
arXiv Detail & Related papers (2020-12-21T18:28:17Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
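
The core AWAC policy update is compact enough to sketch: weight the log-likelihood of dataset actions by the exponentiated advantage. The Gaussian parameterization, temperature, and clamp below are illustrative assumptions.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2
mean_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, ACTION_DIM))
log_std = nn.Parameter(torch.zeros(ACTION_DIM))

def awac_policy_loss(states, actions, advantages, temperature=1.0):
    """Advantage-weighted regression: -E[ exp(A / lambda) * log pi(a|s) ]."""
    dist = torch.distributions.Normal(mean_net(states), log_std.exp())
    log_prob = dist.log_prob(actions).sum(-1)
    weights = torch.exp(advantages / temperature).clamp(max=20.0)  # guard against blow-up
    return -(weights.detach() * log_prob).mean()

# Stand-in batch of offline data with precomputed advantages:
loss = awac_policy_loss(torch.randn(64, STATE_DIM), torch.randn(64, ACTION_DIM),
                        torch.randn(64))
loss.backward()
```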
arXiv Detail & Related papers (2020-06-16T17:54:41Z)