Deep Reinforcement Learning for Crowdsourced Urban Delivery: System
States Characterization, Heuristics-guided Action Choice, and
Rule-Interposing Integration
- URL: http://arxiv.org/abs/2011.14430v1
- Date: Sun, 29 Nov 2020 19:50:34 GMT
- Authors: Tanvir Ahamed, Bo Zou, Nahid Parvez Farazi and Theja Tulabandhula
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the problem of assigning shipping requests to ad hoc
couriers in the context of crowdsourced urban delivery. The shipping requests
are spatially distributed each with a limited time window between the earliest
time for pickup and latest time for delivery. The ad hoc couriers, termed
crowdsourcees, also have limited time availability and carrying capacity. We
propose a new deep reinforcement learning (DRL)-based approach to tackling this
assignment problem. A deep Q network (DQN) algorithm is trained, incorporating
two salient features, experience replay and a target network, that enhance the
efficiency, convergence, and stability of DRL training. More importantly, this
paper makes three methodological contributions: 1) presenting a comprehensive
and novel characterization of crowdshipping system states that encompasses
spatial-temporal and capacity information of crowdsourcees and requests; 2)
embedding heuristics that leverage the information offered by the state
representation and are based on intuitive reasoning to guide specific actions
to take, to preserve tractability and enhance efficiency of training; and 3)
integrating rule-interposing to prevent repeated visiting of the same routes
and node sequences during routing improvement, thereby further enhancing the
training efficiency by accelerating learning. The effectiveness of the proposed
approach is demonstrated through extensive numerical analysis. The results show
the benefits brought by the heuristics-guided action choice and
rule-interposing in DRL training, and the superiority of the proposed approach
over existing heuristics in solution quality, computation time, and scalability.
Besides the potential to improve the efficiency of crowdshipping operation
planning, the proposed approach also provides a new avenue and generic
framework for other problems in the vehicle routing context.
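The two DQN features the abstract highlights, experience replay and a frozen target network, together with a heuristics-guided action choice, can be sketched in miniature. The snippet below is an illustrative sketch only, not the authors' implementation: a linear Q-function stands in for the deep network, and the `feasible_mask` argument is a hypothetical stand-in for the paper's capacity and time-window heuristics.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-size experience replay buffer of (s, a, r, s2, done) tuples."""
    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)

    def push(self, s, a, r, s2, done):
        self.buf.append((s, a, r, s2, done))

    def sample(self, batch_size):
        batch = random.sample(list(self.buf), batch_size)
        s, a, r, s2, done = map(np.array, zip(*batch))
        return s, a, r, s2, done

    def __len__(self):
        return len(self.buf)

class LinearDQN:
    """Linear Q-function Q(s) = s @ W, standing in for the deep network."""
    def __init__(self, state_dim, n_actions, lr=0.01, gamma=0.95):
        self.W = np.zeros((state_dim, n_actions))
        self.target_W = self.W.copy()  # frozen copy used for bootstrap targets
        self.lr, self.gamma = lr, gamma

    def q(self, s, target=False):
        return s @ (self.target_W if target else self.W)

    def train_step(self, buffer, batch_size=32):
        s, a, r, s2, done = buffer.sample(batch_size)
        # Bootstrapped target computed from the frozen target network.
        y = r + self.gamma * (1.0 - done) * self.q(s2, target=True).max(axis=1)
        q_sa = self.q(s)[np.arange(len(a)), a]
        # Semi-gradient squared-error update on the sampled actions only.
        grad = np.zeros_like(self.W)
        for i in range(len(a)):
            grad[:, a[i]] += (q_sa[i] - y[i]) * s[i]
        self.W -= self.lr * grad / len(a)

    def sync_target(self):
        self.target_W = self.W.copy()

def choose_action(model, s, feasible_mask, eps=0.1):
    """Heuristics-guided epsilon-greedy: actions ruled out by the
    feasibility mask (e.g. capacity or time-window violations) are
    never selected, during exploration or exploitation."""
    feasible = np.flatnonzero(feasible_mask)
    if random.random() < eps:
        return int(random.choice(list(feasible)))
    q_masked = np.where(feasible_mask, model.q(s), -np.inf)
    return int(q_masked.argmax())
```

A rule-interposing step in this style would additionally veto actions that revisit an already-explored route or node sequence, much like a tabu list, before the feasibility mask is applied.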
Related papers
- Real-Time Integrated Dispatching and Idle Fleet Steering with Deep Reinforcement Learning for A Meal Delivery Platform [0.0]
This study sets out to solve the real-time order dispatching and idle courier steering problems for a meal delivery platform.
We propose a reinforcement learning (RL)-based strategic dual-control framework.
We find that delivery efficiency and the fairness of workload distribution among couriers are improved.
arXiv Detail & Related papers (2025-01-10T09:15:40Z)
- Preventing Local Pitfalls in Vector Quantization via Optimal Transport [77.15924044466976]
We introduce OptVQ, a novel vector quantization method that employs the Sinkhorn algorithm to optimize the optimal transport problem.
Our experiments on image reconstruction tasks demonstrate that OptVQ achieves 100% codebook utilization and surpasses current state-of-the-art VQNs in reconstruction quality.
arXiv Detail & Related papers (2024-12-19T18:58:14Z)
- Toward Enhanced Reinforcement Learning-Based Resource Management via Digital Twin: Opportunities, Applications, and Challenges [40.73920295596231]
This article presents a digital twin (DT)-enhanced reinforcement learning (RL) framework aimed at optimizing performance and reliability in network resource management.
To deal with the challenges, a comprehensive DT-based framework is proposed to enhance the convergence speed and performance for unified RL-based resource management.
The proposed framework provides safe action exploration, more accurate estimates of long-term returns, faster training convergence, higher convergence performance, and real-time adaptation to varying network conditions.
arXiv Detail & Related papers (2024-06-12T04:14:24Z)
- Dual Policy Reinforcement Learning for Real-time Rebalancing in Bike-sharing Systems [13.083156894368532]
Bike-sharing systems play a crucial role in easing traffic congestion and promoting healthier lifestyles.
This study introduces a novel approach to address the real-time rebalancing problem with a fleet of vehicles.
It employs a dual policy reinforcement learning algorithm that decouples inventory and routing decisions.
arXiv Detail & Related papers (2024-06-02T21:05:23Z)
- An Efficient Learning-based Solver Comparable to Metaheuristics for the Capacitated Arc Routing Problem [67.92544792239086]
We introduce an NN-based solver to significantly narrow the gap with advanced metaheuristics.
First, we propose direction-aware facilitating attention model (DaAM) to incorporate directionality into the embedding process.
Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy.
arXiv Detail & Related papers (2024-03-11T02:17:42Z)
- Temporal Transfer Learning for Traffic Optimization with Coarse-grained Advisory Autonomy [4.809821883560606]
This paper explores advisory autonomy, in which real-time driving advisories are issued to human drivers.
We introduce Temporal Transfer Learning (TTL) algorithms to select source tasks for zero-shot transfer.
We validate our algorithms on diverse mixed-traffic scenarios.
arXiv Detail & Related papers (2023-11-27T21:18:06Z)
- DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV [65.07776277630228]
We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide-and-conquer framework (DCF).
Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs.
We also exploit another attention based policy network in our lower-level DRL model to construct the route for each UAV, with the objective to maximize the number of executed tasks.
arXiv Detail & Related papers (2022-08-04T04:35:53Z)
- Reinforcement Learning in the Wild: Scalable RL Dispatching Algorithm Deployed in Ridehailing Marketplace [12.298997392937876]
This study proposes a real-time dispatching algorithm based on reinforcement learning.
It is deployed online in multiple cities under DiDi's operation for A/B testing and is launched in one of the major international markets.
The deployed algorithm shows over 1.3% improvement in total driver income from A/B testing.
arXiv Detail & Related papers (2022-02-10T16:07:17Z)
- Path Design and Resource Management for NOMA enhanced Indoor Intelligent Robots [58.980293789967575]
A communication-enabled indoor intelligent robot (IR) service framework is proposed.
A Lego modeling method is proposed, which can deterministically describe the indoor layout and channel state.
The investigated radio map is invoked as a virtual environment to train the reinforcement learning agent.
arXiv Detail & Related papers (2020-11-23T21:45:01Z)
- Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning [66.9937776799536]
The emerging vision-and-language navigation (VLN) problem aims at learning to navigate an agent to the target location in unseen photo-realistic environments.
The challenges of VLN arise mainly from two aspects: first, the agent needs to attend to the meaningful paragraphs of the language instruction corresponding to the dynamically-varying visual environments.
We propose a cross-modal grounding module to equip the agent with a better ability to track the correspondence between the textual and visual modalities.
arXiv Detail & Related papers (2020-11-22T09:13:46Z)
- Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.