DROP: Deep relocating option policy for optimal ride-hailing vehicle
repositioning
- URL: http://arxiv.org/abs/2109.04149v1
- Date: Thu, 9 Sep 2021 10:20:53 GMT
- Title: DROP: Deep relocating option policy for optimal ride-hailing vehicle
repositioning
- Authors: Xinwu Qian, Shuocheng Guo, Vaneet Aggarwal
- Abstract summary: In a ride-hailing system, an optimal relocation of vacant vehicles can significantly reduce fleet idling time and balance the supply-demand distribution.
This study proposes the deep relocating option policy (DROP) that supervises vehicle agents to escape from oversupply areas.
We present a hierarchical learning framework that trains a high-level relocation policy and a set of low-level DROPs.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In a ride-hailing system, an optimal relocation of vacant vehicles can
significantly reduce fleet idling time and balance the supply-demand
distribution, enhancing system efficiency and promoting driver satisfaction and
retention. Model-free deep reinforcement learning (DRL) has been shown to
dynamically learn the relocating policy by actively interacting with the
intrinsic dynamics in large-scale ride-hailing systems. However, the issues of
sparse reward signals and unbalanced demand and supply distribution place
critical barriers to developing effective DRL models. Conventional exploration
strategies (e.g., $\epsilon$-greedy) barely work in such an environment because
agents dither in low-demand regions distant from high-revenue regions. This
study proposes the deep relocating option policy
(DROP) that supervises vehicle agents to escape from oversupply areas and
effectively relocate to potentially underserved areas. We propose to learn the
Laplacian embedding of a time-expanded relocation graph as an approximate
representation of the system relocation policy. The embedding generates
task-agnostic signals, which in combination with task-dependent signals,
constitute the pseudo-reward function for generating DROPs. We present a
hierarchical learning framework that trains a high-level relocation policy and
a set of low-level DROPs. The effectiveness of our approach is demonstrated
using a custom-built high-fidelity simulator with real-world trip record data.
We report that DROP significantly improves over baseline models, yielding
15.7% more hourly revenue, and effectively resolves the dithering issue in
low-demand areas.
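The Laplacian-embedding pseudo-reward described in the abstract can be sketched as follows. This is a minimal illustration assuming a symmetric adjacency matrix over relocation zones; the function names, the sum over embedding dimensions, and the weight `w_task` are hypothetical choices for exposition, not the authors' implementation.

```python
import numpy as np

def laplacian_embedding(adjacency: np.ndarray, k: int) -> np.ndarray:
    """Return the k smallest nontrivial eigenvectors of the normalized
    graph Laplacian, used as a task-agnostic embedding of the nodes."""
    degree = adjacency.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(degree, 1e-12)))
    laplacian = np.eye(len(adjacency)) - d_inv_sqrt @ adjacency @ d_inv_sqrt
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]  # drop the trivial constant eigenvector

def pseudo_reward(embedding: np.ndarray, src: int, dst: int,
                  task_signal: float, w_task: float = 1.0) -> float:
    """Combine the task-agnostic embedding change for a relocation step
    (src -> dst) with a task-dependent signal into a pseudo-reward.
    The additive form and w_task are illustrative assumptions."""
    agnostic = float((embedding[dst] - embedding[src]).sum())
    return agnostic + w_task * task_signal
```

In this sketch each low-level DROP would be trained against one such pseudo-reward, while the high-level policy selects among the resulting options.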
Related papers
- End-to-end Driving in High-Interaction Traffic Scenarios with Reinforcement Learning [24.578178308010912]
We propose an end-to-end model-based RL algorithm named Ramble to address these issues.
By learning a dynamics model of the environment, Ramble can foresee upcoming traffic events and make more informed, strategic decisions.
Ramble achieves state-of-the-art performance regarding route completion rate and driving score on the CARLA Leaderboard 2.0, showcasing its effectiveness in managing complex and dynamic traffic situations.
arXiv Detail & Related papers (2024-10-03T06:45:59Z) - Reconfigurable Intelligent Surface Assisted VEC Based on Multi-Agent Reinforcement Learning [33.620752444256716]
Vehicular edge computing (VEC) enables vehicles to perform computation-intensive tasks by executing them locally or offloading them to nearby edge devices.
A reconfigurable intelligent surface (RIS) is introduced to support vehicle communication and provide an alternative communication path.
We propose a new deep reinforcement learning framework that employs a modified multi-agent deep deterministic policy gradient algorithm.
arXiv Detail & Related papers (2024-06-17T08:35:32Z) - When Demonstrations Meet Generative World Models: A Maximum Likelihood
Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z) - Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation [78.17108227614928]
We propose a benchmark environment for Safe Reinforcement Learning focusing on aquatic navigation.
We consider value-based and policy-gradient deep reinforcement learning (DRL) algorithms.
We also propose a verification strategy that checks the behavior of the trained models over a set of desired properties.
arXiv Detail & Related papers (2021-12-16T16:53:56Z) - Learning Space Partitions for Path Planning [54.475949279050596]
PlaLaM outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima.
These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 245% and in molecular design by up to 0.4 on properties measured on a 0-1 scale.
arXiv Detail & Related papers (2021-06-19T18:06:11Z) - A Modular and Transferable Reinforcement Learning Framework for the
Fleet Rebalancing Problem [2.299872239734834]
We propose a modular framework for fleet rebalancing based on model-free reinforcement learning (RL)
We formulate RL state and action spaces as distributions over a grid of the operating area, making the framework scalable.
Numerical experiments, using real-world trip and network data, demonstrate that this approach has several distinct advantages over baseline methods.
arXiv Detail & Related papers (2021-05-27T16:32:28Z) - AdaPool: A Diurnal-Adaptive Fleet Management Framework using Model-Free
Deep Reinforcement Learning and Change Point Detection [34.77250498401055]
This paper introduces an adaptive model-free deep reinforcement approach that can recognize and adapt to the diurnal patterns in the ride-sharing environment with car-pooling.
In addition to the adaptation logic in dispatching, this paper also proposes a dynamic, demand-aware vehicle-passenger matching and route planning framework.
arXiv Detail & Related papers (2021-04-01T02:14:01Z) - Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement
Learning [52.2663102239029]
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing platforms.
Our approach learns a state-value function using a batch training algorithm with deep value networks.
We benchmark our algorithm with baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency.
arXiv Detail & Related papers (2021-03-08T05:34:05Z) - Trajectory Planning for Autonomous Vehicles Using Hierarchical
Reinforcement Learning [21.500697097095408]
Planning safe trajectories under uncertain and dynamic conditions makes the autonomous driving problem significantly complex.
Current sampling-based methods such as Rapidly Exploring Random Trees (RRTs) are not ideal for this problem because of the high computational cost.
We propose a Hierarchical Reinforcement Learning structure combined with a Proportional-Integral-Derivative (PID) controller for trajectory planning.
arXiv Detail & Related papers (2020-11-09T20:49:54Z) - Optimization-driven Deep Reinforcement Learning for Robust Beamforming
in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
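Among the related papers above, the trajectory-planning entry pairs hierarchical reinforcement learning with a PID controller for low-level tracking. A minimal discrete-time PID sketch is shown below; the gains, time step, and class interface are illustrative assumptions, not taken from that paper.

```python
class PID:
    """Minimal discrete-time PID controller, the kind of low-level
    tracking controller a hierarchical RL planner can delegate to."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first step

    def step(self, setpoint: float, measurement: float) -> float:
        """Return the control output for one sampling interval."""
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (0.0 if self.prev_error is None
                      else (error - self.prev_error) / self.dt)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In such a hierarchy the RL policy outputs a setpoint (e.g., a target speed or lateral offset) and the PID loop tracks it at a higher control rate.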
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.