Value Function is All You Need: A Unified Learning Framework for Ride
Hailing Platforms
- URL: http://arxiv.org/abs/2105.08791v2
- Date: Thu, 20 May 2021 01:04:34 GMT
- Title: Value Function is All You Need: A Unified Learning Framework for Ride
Hailing Platforms
- Authors: Xiaocheng Tang, Fan Zhang, Zhiwei Qin, Yansheng Wang, Dingyuan Shi,
Bingchen Song, Yongxin Tong, Hongtu Zhu, Jieping Ye
- Abstract summary: Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens of thousands of vehicles in a city to millions of ride demands throughout the day.
We propose a unified value-based dynamic learning framework (V1D3) for tackling both tasks.
- Score: 57.21078336887961
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens of
thousands of vehicles in a city to millions of ride demands throughout the day,
providing great promises for improving transportation efficiency through the
tasks of order dispatching and vehicle repositioning. Existing studies,
however, usually consider the two tasks in simplified settings that hardly
address the complex interactions between the two, the real-time fluctuations
between supply and demand, and the necessary coordinations due to the
large-scale nature of the problem. In this paper we propose a unified
value-based dynamic learning framework (V1D3) for tackling both tasks. At the
center of the framework is a globally shared value function that is updated
continuously using online experiences generated from real-time platform
transactions. To improve the sample-efficiency and the robustness, we further
propose a novel periodic ensemble method combining the fast online learning
with a large-scale offline training scheme that leverages the abundant
historical driver trajectory data. This allows the proposed framework to adapt
quickly to the highly dynamic environment, to generalize robustly to recurrent
patterns and to drive implicit coordinations among the population of managed
vehicles. Extensive experiments based on real-world datasets show considerable
improvements over other recently proposed methods on both tasks. In particular,
V1D3 outperforms the first prize winners of both dispatching and repositioning
tracks in the KDD Cup 2020 RL competition, achieving state-of-the-art results
on improving both total driver income and user experience related metrics.
Related papers
- CoMamba: Real-time Cooperative Perception Unlocked with State Space Models [39.87600356189242]
CoMamba is a novel cooperative 3D detection framework designed to leverage state-space models for real-time onboard vehicle perception.
CoMamba achieves superior performance compared to existing methods while maintaining real-time processing capabilities.
arXiv Detail & Related papers (2024-09-16T20:02:19Z)
- Importance Sampling-Guided Meta-Training for Intelligent Agents in Highly Interactive Environments [43.144056801987595]
This study introduces a novel training framework that integrates guided meta RL with importance sampling (IS) to optimize training distributions.
By estimating a naturalistic distribution from real-world datasets, the framework ensures a balanced focus across common and extreme driving scenarios.
arXiv Detail & Related papers (2024-07-22T17:57:12Z)
- Dual Policy Reinforcement Learning for Real-time Rebalancing in Bike-sharing Systems [13.083156894368532]
Bike-sharing systems play a crucial role in easing traffic congestion and promoting healthier lifestyles.
This study introduces a novel approach to address the real-time rebalancing problem with a fleet of vehicles.
It employs a dual policy reinforcement learning algorithm that decouples inventory and routing decisions.
arXiv Detail & Related papers (2024-06-02T21:05:23Z)
- A Reinforcement Learning Approach for Dynamic Rebalancing in
Bike-Sharing System [11.237099288412558]
Bike-Sharing Systems provide eco-friendly urban mobility, contributing to the alleviation of traffic congestion and healthier lifestyles.
Devising effective rebalancing strategies using vehicles to redistribute bikes among stations is therefore of uttermost importance for operators.
This paper introduces a spatio-temporal reinforcement learning algorithm for the dynamic rebalancing problem with multiple vehicles.
arXiv Detail & Related papers (2024-02-05T23:46:42Z)
- Combinatorial Optimization enriched Machine Learning to solve the
Dynamic Vehicle Routing Problem with Time Windows [5.4807970361321585]
We propose a novel machine learning pipeline that incorporates an optimization layer.
We apply this pipeline to a dynamic vehicle routing problem with waves, which was recently promoted in the EURO Meets NeurIPS Competition at NeurIPS 2022.
Our methodology ranked first in this competition, outperforming all other approaches in solving the proposed dynamic vehicle routing problem.
arXiv Detail & Related papers (2023-04-03T08:23:09Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the
Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
- A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching.
We conduct large scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z)
- Flatland Competition 2020: MAPF and MARL for Efficient Train
Coordination on a Grid World [49.80905654161763]
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP).
The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur.
The ever-growing complexity of modern railway networks makes dynamic real-time scheduling of traffic virtually impossible.
arXiv Detail & Related papers (2021-03-30T17:13:29Z)
- Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement
Learning [52.2663102239029]
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing platforms.
Our approach learns a ride-based state-value function using a batch training algorithm with deep value networks.
We benchmark our algorithm with baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency.
arXiv Detail & Related papers (2021-03-08T05:34:05Z)
- Multi-intersection Traffic Optimisation: A Benchmark Dataset and a
Strong Baseline [85.9210953301628]
Control of traffic signals is fundamental and critical to alleviate traffic congestion in urban areas.
Because of the high complexity of modelling the problem, experimental settings of current works are often inconsistent.
We propose a novel and strong baseline model based on deep reinforcement learning with the encoder-decoder structure.
arXiv Detail & Related papers (2021-01-24T03:55:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.