Related papers: Value Function is All You Need: A Unified Learning Framework for Ride Hailing Platforms

URL: http://arxiv.org/abs/2105.08791v2
Date: Thu, 20 May 2021 01:04:34 GMT
Title: Value Function is All You Need: A Unified Learning Framework for Ride Hailing Platforms
Authors: Xiaocheng Tang, Fan Zhang, Zhiwei Qin, Yansheng Wang, Dingyuan Shi, Bingchen Song, Yongxin Tong, Hongtu Zhu, Jieping Ye
Abstract summary: Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens of thousands of vehicles in a city to millions of ride demands throughout the day. We propose a unified value-based dynamic learning framework (V1D3) for tackling both tasks.
Score: 57.21078336887961
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens of thousands of vehicles in a city to millions of ride demands throughout the day, providing great promises for improving transportation efficiency through the tasks of order dispatching and vehicle repositioning. Existing studies, however, usually consider the two tasks in simplified settings that hardly address the complex interactions between the two, the real-time fluctuations between supply and demand, and the necessary coordinations due to the large-scale nature of the problem. In this paper we propose a unified value-based dynamic learning framework (V1D3) for tackling both tasks. At the center of the framework is a globally shared value function that is updated continuously using online experiences generated from real-time platform transactions. To improve the sample-efficiency and the robustness, we further propose a novel periodic ensemble method combining the fast online learning with a large-scale offline training scheme that leverages the abundant historical driver trajectory data. This allows the proposed framework to adapt quickly to the highly dynamic environment, to generalize robustly to recurrent patterns and to drive implicit coordinations among the population of managed vehicles. Extensive experiments based on real-world datasets show considerable improvements over other recently proposed methods on both tasks. Particularly, V1D3 outperforms the first prize winners of both dispatching and repositioning tracks in the KDD Cup 2020 RL competition, achieving state-of-the-art results on improving both total driver income and user experience related metrics.
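Illustrative sketch (not from the paper): the abstract describes a shared value function updated online and periodically combined with an offline-trained one. The snippet below shows one simple way such a loop could look, assuming a tabular value function, a plain TD(0) update, and a weighted average as the ensemble step. The state encoding, the names `td_update` and `ensemble`, and all constants are hypothetical and are not taken from V1D3.

```python
from collections import defaultdict

GAMMA = 0.9   # discount factor (illustrative value, an assumption)
ALPHA = 0.1   # online learning rate (illustrative value, an assumption)


def td_update(values, state, reward, next_state):
    """Apply one TD(0) step to the shared value table and return the new value."""
    target = reward + GAMMA * values[next_state]
    values[state] += ALPHA * (target - values[state])
    return values[state]


def ensemble(online, offline, weight=0.5):
    """Blend online and offline estimates state by state.

    A weighted average is a stand-in for the periodic ensemble step;
    the paper's actual method may differ.
    """
    states = set(online) | set(offline)
    return {
        s: weight * online.get(s, 0.0) + (1 - weight) * offline.get(s, 0.0)
        for s in states
    }


# Hypothetical (zone, time-slot) states.
online_values = defaultdict(float)
offline_values = {("A", 0): 10.0, ("B", 1): 4.0}

# One completed trip from zone A at slot 0 to zone B at slot 1, with fare 5.0.
td_update(online_values, ("A", 0), 5.0, ("B", 1))

# Periodic step: merge the fast online estimate with the offline one.
blended = ensemble(online_values, offline_values)
```

In this toy setting both dispatching and repositioning decisions could read from the same `blended` table, which is the sense in which a single value function serves both tasks.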

Related papers

World Model-Based Learning for Long-Term Age of Information Minimization in Vehicular Networks [53.98633183204453]
In this paper, a novel world model-based learning framework is proposed to minimize packet-completeness-aware age of information (CAoI) in a vehicular network. A world model framework is proposed to jointly learn a dynamic model of the mmWave V2X environment and use it to imagine trajectories for learning how to perform link scheduling. In particular, the long-term policy is learned in differentiable imagined trajectories instead of environment interactions.
arXiv Detail & Related papers (2025-05-03T06:23:18Z)
Accelerating Vehicle Routing via AI-Initialized Genetic Algorithms [55.78505925402658]
Vehicle Routing Problems (VRP) are an extension of the Traveling Salesperson Problem and are a fundamental NP-hard challenge in Evolutionary optimization. We introduce a novel optimization framework that uses a reinforcement learning agent - trained on prior instances - to quickly generate initial solutions, which are then further optimized by genetic algorithms. For example, EARLI handles vehicle routing with 500 locations within 1s, 10x faster than current solvers for the same solution quality, enabling applications like real-time and interactive routing.
arXiv Detail & Related papers (2025-04-08T15:21:01Z)
DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving [62.62464518137153]
DriveTransformer is a simplified E2E-AD framework for the ease of scaling up. It is composed of three unified operations: task self-attention, sensor cross-attention, temporal cross-attention. It achieves state-of-the-art performance in both simulated closed-loop benchmark Bench2Drive and real world open-loop benchmark nuScenes with high FPS.
arXiv Detail & Related papers (2025-03-07T11:41:18Z)
HEART: Achieving Timely Multi-Model Training for Vehicle-Edge-Cloud-Integrated Hierarchical Federated Learning [30.75025062952915]
The rapid growth of AI-enabled Internet of Vehicles (IoV) calls for efficient machine learning (ML) solutions. Vehicles often need to execute multiple ML tasks simultaneously, where this multi-model training environment introduces crucial challenges. We propose a framework for multi-model training in dynamic VEC-HFL with the goal of minimizing global training latency.
arXiv Detail & Related papers (2025-01-17T03:15:03Z)
From Imitation to Exploration: End-to-end Autonomous Driving based on World Model [24.578178308010912]
RAMBLE is an end-to-end world model-based RL method for driving decision-making. It can handle complex and dynamic traffic scenarios. It achieves state-of-the-art performance in route completion rate on the CARLA Leaderboard 1.0 and completes all 38 scenarios on the CARLA Leaderboard 2.0.
arXiv Detail & Related papers (2024-10-03T06:45:59Z)
CoMamba: Real-time Cooperative Perception Unlocked with State Space Models [39.87600356189242]
CoMamba is a novel cooperative 3D detection framework designed to leverage state-space models for real-time onboard vehicle perception. CoMamba achieves superior performance compared to existing methods while maintaining real-time processing capabilities.
arXiv Detail & Related papers (2024-09-16T20:02:19Z)
Importance Sampling-Guided Meta-Training for Intelligent Agents in Highly Interactive Environments [43.144056801987595]
This study introduces a novel training framework that integrates guided meta RL with importance sampling (IS) to optimize training distributions. By estimating a naturalistic distribution from real-world datasets, the framework ensures a balanced focus across common and extreme driving scenarios.
arXiv Detail & Related papers (2024-07-22T17:57:12Z)
Dual Policy Reinforcement Learning for Real-time Rebalancing in Bike-sharing Systems [13.083156894368532]
Bike-sharing systems play a crucial role in easing traffic congestion and promoting healthier lifestyles. This study introduces a novel approach to address the real-time rebalancing problem with a fleet of vehicles. It employs a dual policy reinforcement learning algorithm that decouples inventory and routing decisions.
arXiv Detail & Related papers (2024-06-02T21:05:23Z)
A Reinforcement Learning Approach for Dynamic Rebalancing in Bike-Sharing System [11.237099288412558]
Bike-Sharing Systems provide eco-friendly urban mobility, contributing to the alleviation of traffic congestion and healthier lifestyles. Devising effective rebalancing strategies using vehicles to redistribute bikes among stations is therefore of utmost importance for operators. This paper introduces a temporal reinforcement learning algorithm for the dynamic rebalancing problem with multiple vehicles.
arXiv Detail & Related papers (2024-02-05T23:46:42Z)
Combinatorial Optimization enriched Machine Learning to solve the Dynamic Vehicle Routing Problem with Time Windows [5.4807970361321585]
We propose a novel machine learning pipeline that incorporates an optimization layer. We apply this pipeline to a dynamic vehicle routing problem with waves, which was recently promoted in the EURO Meets NeurIPS Competition at NeurIPS 2022. Our methodology ranked first in this competition, outperforming all other approaches in solving the proposed dynamic vehicle routing problem.
arXiv Detail & Related papers (2023-04-03T08:23:09Z)
Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data. We first train a scale-aware disparity network using both monocular real images and stereo virtual data. The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching. We conduct large scale online A/B tests on DiDi's ride-dispatching platform. Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z)
Flatland Competition 2020: MAPF and MARL for Efficient Train Coordination on a Grid World [49.80905654161763]
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP). The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur. The ever-growing complexity of modern railway networks makes dynamic real-time scheduling of traffic virtually impossible.
arXiv Detail & Related papers (2021-03-30T17:13:29Z)
Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement Learning [52.2663102239029]
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing platforms. Our approach learns a ride-based state-value function using a batch training algorithm with deep value networks. We benchmark our algorithm with baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency.
arXiv Detail & Related papers (2021-03-08T05:34:05Z)
Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline [85.9210953301628]
Control of traffic signals is fundamental and critical to alleviate traffic congestion in urban areas. Because of the high complexity of modelling the problem, experimental settings of current works are often inconsistent. We propose a novel and strong baseline model based on deep reinforcement learning with the encoder-decoder structure.
arXiv Detail & Related papers (2021-01-24T03:55:39Z)
