Neural Approximate Dynamic Programming for the Ultra-fast Order
Dispatching Problem
- URL: http://arxiv.org/abs/2311.12975v1
- Date: Tue, 21 Nov 2023 20:23:58 GMT
- Title: Neural Approximate Dynamic Programming for the Ultra-fast Order
Dispatching Problem
- Authors: Arash Dehghan and Mucahit Cevik and Merve Bodur
- Abstract summary: We focus on the ultrafast Order Dispatch Problem (ODP), which involves matching and dispatching orders to couriers within a centralized warehouse setting.
We introduce important extensions to ultra-fast ODP such as order policies and explicit courier assignments to provide a more realistic representation of operations and improve delivery efficiency.
We test our proposed approach using four distinct realistic datasets tailored for ODP and compare the performance of NeurADP against myopic and DRL baselines.
- Score: 1.519321208145928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Same-Day Delivery (SDD) services aim to maximize the fulfillment of online
orders while minimizing delivery delays but are beset by operational
uncertainties such as those in order volumes and courier planning. Our work
aims to enhance the operational efficiency of SDD by focusing on the ultra-fast
Order Dispatching Problem (ODP), which involves matching and dispatching orders
to couriers within a centralized warehouse setting, and completing the delivery
within a strict timeline (e.g., within minutes). We introduce important
extensions to ultra-fast ODP such as order batching and explicit courier
assignments to provide a more realistic representation of dispatching
operations and improve delivery efficiency. As a solution method, we primarily
focus on NeurADP, a methodology that combines Approximate Dynamic Programming
(ADP) and Deep Reinforcement Learning (DRL), and our work constitutes the first
application of NeurADP outside of the ride-pool matching problem. NeurADP is
particularly suitable for ultra-fast ODP as it addresses complex one-to-many
matching and routing intricacies through a neural network-based VFA that
captures high-dimensional problem dynamics without requiring manual feature
engineering as in generic ADP methods. We test our proposed approach using four
distinct realistic datasets tailored for ODP and compare the performance of
NeurADP against myopic and DRL baselines by also making use of non-trivial
bounds to assess the quality of the policies. Our numerical results indicate
that the inclusion of order batching and courier queues enhances the efficiency
of delivery operations and that NeurADP significantly outperforms other
methods. Detailed sensitivity analysis with important parameters confirms the
robustness of NeurADP under different scenarios, including variations in
courier numbers, spatial setup, vehicle capacity, and permitted delay time.
Related papers
- Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z) - Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations [0.6116681488656472]
This study addresses the dynamic order picking problem.
Traditional methods, often assuming fixed order sets, fall short in this dynamic environment.
We utilize Deep Reinforcement Learning (DRL) as a solution methodology to handle the inherent uncertainties in customer demands.
arXiv Detail & Related papers (2024-08-03T03:56:46Z) - Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers [58.5711048151424]
We introduce SPARSEK Attention, a novel sparse attention mechanism designed to overcome computational and memory obstacles.
Our approach integrates a scoring network and a differentiable top-k mask operator, SPARSEK, to select a constant number of KV pairs for each query.
Experimental results reveal that SPARSEK Attention outperforms previous sparse attention methods.
arXiv Detail & Related papers (2024-06-24T15:55:59Z) - Harvesting Efficient On-Demand Order Pooling from Skilled Couriers: Enhancing Graph Representation Learning for Refining Real-time Many-to-One Assignments [11.0829498096027]
On-demand food delivery (OFD) services offer delivery fulfillment within dozens of minutes after an order is placed.
In OFD, pooling multiple orders for simultaneous delivery in real-time order assignment is a pivotal efficiency source.
The complexity and real-time nature of order assignment, making extensive calculations impractical, significantly limit the potential for order consolidation.
A SC delivery network (SCDN) is constructed, based on an enhanced attributed heterogeneous network embedding approach tailored for OFD.
arXiv Detail & Related papers (2024-06-20T18:03:27Z) - An Efficient Learning-based Solver Comparable to Metaheuristics for the
Capacitated Arc Routing Problem [67.92544792239086]
We introduce an NN-based solver to significantly narrow the gap with advanced metaheuristics.
First, we propose direction-aware facilitating attention model (DaAM) to incorporate directionality into the embedding process.
Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy.
arXiv Detail & Related papers (2024-03-11T02:17:42Z) - Distribution-Aware Continual Test-Time Adaptation for Semantic Segmentation [33.75630514826721]
We propose a distribution-aware tuning ( DAT) method to make semantic segmentation CTTA efficient and practical in real-world applications.
DAT adaptively selects and updates two small groups of trainable parameters based on data distribution during the continual adaptation process.
We conduct experiments on two widely-used semantic segmentation CTTA benchmarks, achieving promising performance compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2023-09-24T10:48:20Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs)
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Gradient Sparsification for Efficient Wireless Federated Learning with
Differential Privacy [25.763777765222358]
Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other.
As the model size grows, the training latency due to limited transmission bandwidth and private information degrades while using differential privacy (DP) protection.
We propose sparsification empowered FL framework wireless channels, in over to improve training efficiency without sacrificing convergence performance.
arXiv Detail & Related papers (2023-04-09T05:21:15Z) - Neural Stochastic Dual Dynamic Programming [99.80617899593526]
We introduce a trainable neural model that learns to map problem instances to a piece-wise linear value function.
$nu$-SDDP can significantly reduce problem solving cost without sacrificing solution quality.
arXiv Detail & Related papers (2021-12-01T22:55:23Z) - Deep Reinforcement Learning for Resource Constrained Multiclass
Scheduling in Wireless Networks [0.0]
In our setup, the available limited bandwidth resources are allocated in order to serve randomly arriving service demands.
We propose a distributional Deep Deterministic Policy Gradient (DDPG) algorithm combined with Deep Sets to tackle the problem.
Our proposed algorithm is tested on both synthetic and real data, showing consistent gains against state-of-the-art conventional methods.
arXiv Detail & Related papers (2020-11-27T09:49:38Z) - Optimization-driven Deep Reinforcement Learning for Robust Beamforming
in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.