Pro-Routing: Proactive Routing of Autonomous Multi-Capacity Robots for Pickup-and-Delivery Tasks
- URL: http://arxiv.org/abs/2503.24325v1
- Date: Mon, 31 Mar 2025 17:14:07 GMT
- Title: Pro-Routing: Proactive Routing of Autonomous Multi-Capacity Robots for Pickup-and-Delivery Tasks
- Authors: Daniel Garces, Stephanie Gil
- Abstract summary: We propose a novel proactive rollout-based routing framework that adapts to real-time demand. We derive provable stability guarantees for our method by proposing a fleet sizing algorithm. Our empirical results show that our framework maintains stability when we use a sufficiently large fleet size.
- Score: 9.445880844584027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a multi-robot setting, where we have a fleet of multi-capacity autonomous robots that must service spatially distributed pickup-and-delivery requests with fixed maximum wait times. Requests can be either scheduled ahead of time or they can enter the system in real-time. In this setting, stability for a routing policy is defined as the cost of the policy being uniformly bounded over time. Most previous work either solves the problem offline to theoretically maintain stability or considers dynamically arriving requests at the expense of theoretical guarantees on stability. In this paper, we aim to bridge this gap by proposing a novel proactive rollout-based routing framework that adapts to real-time demand while still provably maintaining the stability of the learned routing policy. We derive provable stability guarantees for our method by proposing a fleet sizing algorithm that obtains a sufficiently large fleet that ensures stability by construction. To validate our theoretical results, we consider a case study on real ride requests for Harvard's evening Van System. We also evaluate the performance of our framework using the currently deployed smaller fleet size. In this smaller setup, we compare against the currently deployed routing algorithm, greedy heuristics, and Monte-Carlo-Tree-Search-based algorithms. Our empirical results show that our framework maintains stability when we use the sufficiently large fleet size found in our theoretical results. For the smaller currently deployed fleet size, our method services 6% more requests than the closest baseline while reducing median passenger wait times by 33%.
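The rollout idea the abstract builds on can be illustrated with a toy sketch. This is a hedged, hypothetical example, not the paper's implementation: robots and requests live on a 1-D street, the cost is travel distance, and the names `base_policy` and `rollout_policy` are assumptions for clarity. The key rollout property it demonstrates is that one step of lookahead completed by a stable base policy can never do worse than the base policy alone.

```python
# Illustrative sketch only: a toy one-step rollout router on a 1-D street,
# not the paper's implementation. The cost model (travel distance on a
# line) and all names are assumptions for clarity.

def base_policy(robots, requests):
    """Greedy base policy: assign each request to the nearest robot."""
    robots, cost = list(robots), 0
    for pickup, dropoff in requests:
        i = min(range(len(robots)), key=lambda j: abs(robots[j] - pickup))
        cost += abs(robots[i] - pickup) + abs(dropoff - pickup)
        robots[i] = dropoff  # robot ends at the dropoff location
    return cost

def rollout_policy(robots, requests):
    """One-step rollout: try every assignment for the first request, then
    complete with the base policy; by the rollout property this is never
    worse than running the base policy alone."""
    if not requests:
        return 0
    (pickup, dropoff), rest = requests[0], requests[1:]
    best = float("inf")
    for i in range(len(robots)):
        step = abs(robots[i] - pickup) + abs(dropoff - pickup)
        future = list(robots)
        future[i] = dropoff  # simulate taking this assignment first
        best = min(best, step + base_policy(future, rest))
    return best

robots, requests = [0, 10], [(8, 2), (1, 9)]
assert rollout_policy(robots, requests) <= base_policy(robots, requests)
```

If the base policy's cost is uniformly bounded over time (the paper's stability notion), this cost-improvement property is what lets the rollout policy inherit that bound.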
Related papers
- Approximate Multiagent Reinforcement Learning for On-Demand Urban Mobility Problem on a Large Map (extended version) [8.537183852577686]
We study the autonomous multiagent taxi routing problem for a large urban environment. Recent theory has shown that a rollout algorithm with a stable base policy produces a near-optimal stable policy. We propose an approximate multiagent rollout-based two phase algorithm that reduces computational costs, while still achieving a stable near-optimal policy.
arXiv Detail & Related papers (2023-11-02T18:33:32Z) - $\beta$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search [96.99525100285084]
A regularization method, Beta-Decay, is proposed to regularize the DARTS-based NAS search process (i.e., $\beta$-DARTS).
In-depth theoretical analyses on how it works and why it works are provided.
arXiv Detail & Related papers (2023-01-16T12:30:32Z) - Multiagent Reinforcement Learning for Autonomous Routing and Pickup Problem with Adaptation to Variable Demand [1.8505047763172104]
We derive a learning framework to generate routing/pickup policies for a fleet of autonomous vehicles tasked with servicing dynamically appearing requests on a city map.
We focus on policies that give rise to coordination amongst the vehicles, thereby reducing wait times for servicing requests.
We propose a mechanism for switching the originally trained offline approximation when the current demand is outside the original validity region.
arXiv Detail & Related papers (2022-11-28T01:11:11Z) - Scheduling Servers with Stochastic Bilinear Rewards [7.519872646378837]
A system optimization problem arises in multi-class, multi-server queueing system scheduling.
We propose a scheduling algorithm based on weighted proportional fair allocation criteria augmented with marginal costs for reward.
Our algorithm achieves sub-linear regret and a sub-linear mean holding cost (and queue-length bound) with respect to the time horizon, thus guaranteeing queueing-system stability.
arXiv Detail & Related papers (2021-12-13T00:37:20Z) - Estimating the Robustness of Public Transport Systems Using Machine Learning [62.997667081978825]
Planning public transport systems is a highly complex process involving many steps.
Integrating robustness from a passenger's point of view makes the task even more challenging.
In this paper, we explore a new approach to such scenario-based robustness approximation using methods from machine learning.
arXiv Detail & Related papers (2021-06-10T05:52:56Z) - A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching.
We conduct large scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z) - Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z) - Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement Learning [52.2663102239029]
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing platforms.
Our approach learns a ride-based state-value function using a batch training algorithm with deep value networks.
We benchmark our algorithm with baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency.
arXiv Detail & Related papers (2021-03-08T05:34:05Z) - Dynamic Resource Management for Providing QoS in Drone Delivery Systems [2.578242050187029]
We study the dynamic UAV assignment problem for a drone delivery system with the goal of providing measurable Quality of Service (QoS) guarantees.
We take a deep reinforcement learning approach to obtain a dynamic policy for the re-allocation of the UAVs.
We evaluate the performance of our proposed algorithm by considering three broad arrival classes, including Bernoulli, Time-Varying Bernoulli, and Markov-Modulated Bernoulli arrivals.
arXiv Detail & Related papers (2021-03-06T03:11:07Z) - MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning [108.79676336281211]
Continuous deployment of new policies for data collection and online learning is either cost ineffective or impractical.
We propose a new algorithmic learning framework called Model-based Uncertainty regularized and Sample Efficient Batch Optimization.
Our framework discovers novel and high quality samples for each deployment to enable efficient data collection.
arXiv Detail & Related papers (2021-02-23T01:30:55Z) - Equilibrium Inverse Reinforcement Learning for Ride-hailing Vehicle Network [1.599072005190786]
We formulate the problem of passenger-vehicle matching in a sparsely connected graph.
We propose an algorithm to derive an equilibrium policy in a multi-agent environment.
arXiv Detail & Related papers (2021-02-13T03:18:44Z)
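Several of the papers above evaluate policies under stochastic demand; the drone-delivery study, for instance, tests against Bernoulli, Time-Varying Bernoulli, and Markov-Modulated Bernoulli arrivals. A minimal sketch of these three arrival generators follows, with illustrative parameters and function names that are assumptions, not taken from any of the papers:

```python
import random

# Hedged sketch of three common arrival classes used in queueing
# evaluations; parameters and names are illustrative assumptions.

def bernoulli_arrivals(p, horizon, rng):
    """Stationary Bernoulli: an arrival occurs in each slot w.p. p."""
    return [rng.random() < p for _ in range(horizon)]

def time_varying_bernoulli_arrivals(p_of_t, horizon, rng):
    """Time-varying Bernoulli: the arrival probability depends on the slot."""
    return [rng.random() < p_of_t(t) for t in range(horizon)]

def markov_modulated_bernoulli_arrivals(p_by_state, switch_prob, horizon, rng):
    """Markov-modulated Bernoulli: a hidden two-state chain selects the
    per-slot arrival probability; switch_prob[s] is the chance of leaving
    state s at each step."""
    state, out = 0, []
    for _ in range(horizon):
        out.append(rng.random() < p_by_state[state])
        if rng.random() < switch_prob[state]:
            state = 1 - state
    return out

rng = random.Random(0)
print(sum(bernoulli_arrivals(0.3, 1000, rng)))  # arrival count over 1000 slots
```

Feeding such traces into a routing or dispatch simulator is one way to probe whether a policy's cost stays bounded as demand patterns shift.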
This list is automatically generated from the titles and abstracts of the papers in this site.