Related papers: Non-myopic Matching and Rebalancing in Large-Scale On-Demand Ride-Pooling Systems Using Simulation-Informed Reinforcement Learning

URL: http://arxiv.org/abs/2510.25796v1
Date: Tue, 28 Oct 2025 23:21:27 GMT
Title: Non-myopic Matching and Rebalancing in Large-Scale On-Demand Ride-Pooling Systems Using Simulation-Informed Reinforcement Learning
Authors: Farnoosh Namdarpour, Joseph Y. J. Chow
Abstract summary: Ride-pooling, also known as ride-sharing, shared ride-hailing, or microtransit, is a service wherein passengers share rides. A key limitation, however, is its myopic decision-making, which overlooks long-term effects of dispatch decisions. We propose a simulation-informed reinforcement learning (RL) approach to address this.
Score: 1.7403133838762448
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Ride-pooling, also known as ride-sharing, shared ride-hailing, or microtransit, is a service wherein passengers share rides. This service can reduce costs for both passengers and operators and reduce congestion and environmental impacts. A key limitation, however, is its myopic decision-making, which overlooks long-term effects of dispatch decisions. To address this, we propose a simulation-informed reinforcement learning (RL) approach. While RL has been widely studied in the context of ride-hailing systems, its application in ride-pooling systems has been less explored. In this study, we extend the learning and planning framework of Xu et al. (2018) from ride-hailing to ride-pooling by embedding a ride-pooling simulation within the learning mechanism to enable non-myopic decision-making. In addition, we propose a complementary policy for rebalancing idle vehicles. By employing n-step temporal difference learning on simulated experiences, we derive spatiotemporal state values and subsequently evaluate the effectiveness of the non-myopic policy using NYC taxi request data. Results demonstrate that the non-myopic policy for matching can increase the service rate by up to 8.4% versus a myopic policy while reducing both in-vehicle and wait times for passengers. Furthermore, the proposed non-myopic policy can decrease fleet size by over 25% compared to a myopic policy, while maintaining the same level of performance, thereby offering significant cost savings for operators. Incorporating rebalancing operations into the proposed framework cuts wait time by up to 27.3%, in-vehicle time by 12.5%, and raises service rate by 15.1% compared to using the framework for matching decisions alone, at the cost of increased vehicle minutes traveled per passenger.
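
Illustrative note: the abstract mentions n-step temporal difference learning over spatiotemporal states. The sketch below shows a generic tabular n-step TD update, not the authors' implementation. The state encoding `(zone, time_step)`, the reward values, and the names `n_step_td_update`, `alpha`, and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict


def n_step_td_update(values, trajectory, n=3, alpha=0.1, gamma=0.9):
    """Apply one pass of tabular n-step TD updates to `values`.

    `trajectory` is a list of (state, reward) pairs, where `reward` is
    received on leaving `state`. States here are hypothetical
    (zone, time_step) tuples; the value after the last step is taken as 0.
    """
    T = len(trajectory)
    for t in range(T):
        horizon = min(t + n, T)
        # Discounted sum of up to n observed rewards starting at step t.
        g = 0.0
        for k in range(t, horizon):
            g += gamma ** (k - t) * trajectory[k][1]
        # Bootstrap from the current estimate if the episode has not ended.
        if horizon < T:
            g += gamma ** (horizon - t) * values[trajectory[horizon][0]]
        state = trajectory[t][0]
        values[state] += alpha * (g - values[state])
    return values


# Hypothetical simulated experience of one vehicle: (zone, time_step), reward.
values = defaultdict(float)
trajectory = [(("A", 0), 1.0), (("B", 1), 0.0), (("C", 2), 2.0)]
n_step_td_update(values, trajectory, n=2, alpha=0.5, gamma=1.0)
```

With a larger `n`, each update relies more on simulated rewards and less on the bootstrapped estimate, which is the usual trade-off in n-step methods.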

Related papers

Assessing On-Demand Mobility Services and Policy Impacts: A Case Study from Chengdu, China [3.8367373028524874]
This study integrates a graph theory-based trip-vehicle matching mechanism with real cruising taxi operations data to simulate ride-hailing services in Chengdu, China. We examine the impacts of fleet size management, geofencing, and demand management on the performance of ride-hailing services.
arXiv Detail & Related papers (2025-11-08T17:08:07Z)
Timing the Match: A Deep Reinforcement Learning Approach for Ride-Hailing and Ride-Pooling Services [17.143444035884386]
We propose an adaptive ride-matching strategy using deep reinforcement learning (RL) to determine when to perform matches based on real-time system conditions. Our method continuously evaluates system states and executes matching at moments that minimize total passenger wait time.
arXiv Detail & Related papers (2025-03-17T14:07:58Z)
Fairness-Enhancing Vehicle Rebalancing in the Ride-hailing System [7.531863938542706]
The rapid growth of the ride-hailing industry has revolutionized urban transportation worldwide. Despite its benefits, equity concerns arise as underserved communities face limited accessibility to affordable ride-hailing services. This paper focuses on enhancing both algorithmic and rider fairness through a novel vehicle rebalancing method.
arXiv Detail & Related papers (2023-12-29T23:02:34Z)
Fair collaborative vehicle routing: A deep multi-agent reinforcement learning approach [49.00137468773683]
Collaborative vehicle routing occurs when carriers collaborate through sharing their transportation requests and performing transportation requests on behalf of each other. Traditional game theoretic solution concepts are expensive to calculate as the characteristic function scales exponentially with the number of agents. We propose to model this problem as a coalitional bargaining game solved using deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2023-10-26T15:42:29Z)
Coalitional Bargaining via Reinforcement Learning: An Application to Collaborative Vehicle Routing [49.00137468773683]
Collaborative Vehicle Routing is where delivery companies cooperate by sharing their delivery information and performing delivery requests on behalf of each other. This achieves economies of scale and thus reduces cost, greenhouse gas emissions, and road congestion. But which company should partner with whom, and how much should each company be compensated? Traditional game theoretic solution concepts, such as the Shapley value or nucleolus, are difficult to calculate for the real-world problem of Collaborative Vehicle Routing.
arXiv Detail & Related papers (2023-10-26T15:04:23Z)
Studying the Impact of Semi-Cooperative Drivers on Overall Highway Flow [76.38515853201116]
Semi-cooperative behaviors are intrinsic properties of human drivers and should be considered for autonomous driving. New autonomous planners can consider the social value orientation (SVO) of human drivers to generate socially-compliant trajectories. We present a study of implicit semi-cooperative driving where agents deploy a game-theoretic version of iterative best response.
arXiv Detail & Related papers (2023-04-23T16:01:36Z)
Improving Operational Efficiency In EV Ridepooling Fleets By Predictive Exploitation of Idle Times [0.0]
We present a real-time predictive charging method for ridepooling services with a single operator, called Idle Time Exploitation (ITX). ITX predicts the periods where vehicles are idle and exploits these periods to harvest energy. It relies on Graph Convolutional Networks and a linear assignment algorithm to devise an optimal pairing of vehicles and charging stations.
arXiv Detail & Related papers (2022-08-30T08:41:40Z)
Efficiency, Fairness, and Stability in Non-Commercial Peer-to-Peer Ridesharing [84.47891614815325]
This paper focuses on the core problem in P2P ridesharing: the matching of riders and drivers. We introduce novel notions of fairness and stability in P2P ridesharing. Results suggest that fair and stable solutions can be obtained in reasonable computational times.
arXiv Detail & Related papers (2021-10-04T02:14:49Z)
Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement Learning [52.2663102239029]
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing platforms. Our approach learns a ride-based state-value function using a batch training algorithm with deep value networks. We benchmark our algorithm with baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency.
arXiv Detail & Related papers (2021-03-08T05:34:05Z)
Equilibrium Inverse Reinforcement Learning for Ride-hailing Vehicle Network [1.599072005190786]
We formulate the problem of passenger-vehicle matching in a sparsely connected graph. We propose an algorithm to derive an equilibrium policy in a multi-agent environment.
arXiv Detail & Related papers (2021-02-13T03:18:44Z)
Vehicular Cooperative Perception Through Action Branching and Federated Reinforcement Learning [101.64598586454571]
A novel framework is proposed to allow reinforcement learning-based vehicular association, resource block (RB) allocation, and content selection of cooperative perception messages (CPMs). A federated RL approach is introduced in order to speed up the training process across vehicles. Results show that federated RL improves the training process, where better policies can be achieved within the same amount of time compared to the non-federated approach.
arXiv Detail & Related papers (2020-12-07T02:09:15Z)
Real-time and Large-scale Fleet Allocation of Autonomous Taxis: A Case Study in New York Manhattan Island [14.501650948647324]
Traditional models fail to efficiently allocate the available fleet to deal with the imbalance of supply (autonomous taxis) and demand (trips). We employ a Constrained Multi-agent Markov Decision Process (CMMDP) to model fleet allocation decisions. We also leverage a Column Generation algorithm to guarantee efficiency and optimality at large scale.
arXiv Detail & Related papers (2020-09-06T16:00:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.