Related papers: Neural Approximate Dynamic Programming for the Ultra-fast Order Dispatching Problem

Neural Approximate Dynamic Programming for the Ultra-fast Order Dispatching Problem

URL: http://arxiv.org/abs/2311.12975v1
Date: Tue, 21 Nov 2023 20:23:58 GMT
Title: Neural Approximate Dynamic Programming for the Ultra-fast Order Dispatching Problem
Authors: Arash Dehghan and Mucahit Cevik and Merve Bodur
Abstract summary: We focus on the ultrafast Order Dispatch Problem (ODP), which involves matching and dispatching orders to couriers within a centralized warehouse setting. We introduce important extensions to ultra-fast ODP such as order policies and explicit courier assignments to provide a more realistic representation of operations and improve delivery efficiency. We test our proposed approach using four distinct realistic datasets tailored for ODP and compare the performance of NeurADP against myopic and DRL baselines.
Score: 1.519321208145928
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Same-Day Delivery (SDD) services aim to maximize the fulfillment of online orders while minimizing delivery delays but are beset by operational uncertainties such as those in order volumes and courier planning. Our work aims to enhance the operational efficiency of SDD by focusing on the ultra-fast Order Dispatching Problem (ODP), which involves matching and dispatching orders to couriers within a centralized warehouse setting, and completing the delivery within a strict timeline (e.g., within minutes). We introduce important extensions to ultra-fast ODP such as order batching and explicit courier assignments to provide a more realistic representation of dispatching operations and improve delivery efficiency. As a solution method, we primarily focus on NeurADP, a methodology that combines Approximate Dynamic Programming (ADP) and Deep Reinforcement Learning (DRL), and our work constitutes the first application of NeurADP outside of the ride-pool matching problem. NeurADP is particularly suitable for ultra-fast ODP as it addresses complex one-to-many matching and routing intricacies through a neural network-based VFA that captures high-dimensional problem dynamics without requiring manual feature engineering as in generic ADP methods. We test our proposed approach using four distinct realistic datasets tailored for ODP and compare the performance of NeurADP against myopic and DRL baselines by also making use of non-trivial bounds to assess the quality of the policies. Our numerical results indicate that the inclusion of order batching and courier queues enhances the efficiency of delivery operations and that NeurADP significantly outperforms other methods. Detailed sensitivity analysis with important parameters confirms the robustness of NeurADP under different scenarios, including variations in courier numbers, spatial setup, vehicle capacity, and permitted delay time.

Related papers

Learning to Solve the Min-Max Mixed-Shelves Picker-Routing Problem via Hierarchical and Parallel Decoding [0.3867363075280544]
The Mixed-Shelves Picker Routing Problem (MSPRP) is a fundamental challenge in logistics, where pickers must navigate a mixed-shelves environment to retrieve SKUs efficiently. We propose a novel hierarchical and parallel decoding approach for solving the min-max variant of the MSPRP via multi-agent reinforcement learning. Experiments show state-of-the-art performance in both solution quality and inference speed, particularly for large-scale and out-of-distribution instances.
arXiv Detail & Related papers (2025-02-14T15:42:30Z)
Dependency-Aware CAV Task Scheduling via Diffusion-Based Reinforcement Learning [12.504232513881828]
We propose a novel dependency-aware task scheduling strategy for dynamic unmanned aerial vehicle-assisted connected autonomous vehicles (CAVs) We formulate a joint scheduling priority and subtask assignment optimization problem with the objective of minimizing the average task completion time. We propose a diffusion-based reinforcement learning algorithm, named Synthetic DDQN based Subtasks Scheduling, which can make adaptive task scheduling decision in real time.
arXiv Detail & Related papers (2024-11-27T11:07:31Z)
SCoTT: Wireless-Aware Path Planning with Vision Language Models and Strategic Chains-of-Thought [78.53885607559958]
A novel approach using vision language models (VLMs) is proposed for enabling path planning in complex wireless-aware environments. To this end, insights from a digital twin with real-world wireless ray tracing data are explored. Results show that SCoTT achieves very close average path gains compared to DP-WA* while at the same time yielding consistently shorter path lengths.
arXiv Detail & Related papers (2024-11-27T10:45:49Z)
Online Parallel Multi-Task Relationship Learning via Alternating Direction Method of Multipliers [37.859185005986056]
Online multi-task learning (OMTL) enhances streaming data processing by leveraging the inherent relations among multiple tasks. This study proposes a novel OMTL framework based on the alternating direction multiplier method (ADMM), a recent breakthrough in optimization suitable for the distributed computing environment.
arXiv Detail & Related papers (2024-11-09T10:20:13Z)
Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation. In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales. Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z)
Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations [0.6116681488656472]
This study addresses the dynamic order picking problem. Traditional methods, often assuming fixed order sets, fall short in this dynamic environment. We utilize Deep Reinforcement Learning (DRL) as a solution methodology to handle the inherent uncertainties in customer demands.
arXiv Detail & Related papers (2024-08-03T03:56:46Z)
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers [58.5711048151424]
We introduce SPARSEK Attention, a novel sparse attention mechanism designed to overcome computational and memory obstacles. Our approach integrates a scoring network and a differentiable top-k mask operator, SPARSEK, to select a constant number of KV pairs for each query. Experimental results reveal that SPARSEK Attention outperforms previous sparse attention methods.
arXiv Detail & Related papers (2024-06-24T15:55:59Z)
Harvesting Efficient On-Demand Order Pooling from Skilled Couriers: Enhancing Graph Representation Learning for Refining Real-time Many-to-One Assignments [11.0829498096027]
On-demand food delivery (OFD) services offer delivery fulfillment within dozens of minutes after an order is placed. In OFD, pooling multiple orders for simultaneous delivery in real-time order assignment is a pivotal efficiency source. The complexity and real-time nature of order assignment, making extensive calculations impractical, significantly limit the potential for order consolidation. A SC delivery network (SCDN) is constructed, based on an enhanced attributed heterogeneous network embedding approach tailored for OFD.
arXiv Detail & Related papers (2024-06-20T18:03:27Z)
An Efficient Learning-based Solver Comparable to Metaheuristics for the Capacitated Arc Routing Problem [67.92544792239086]
We introduce an NN-based solver to significantly narrow the gap with advanced metaheuristics. First, we propose direction-aware facilitating attention model (DaAM) to incorporate directionality into the embedding process. Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy.
arXiv Detail & Related papers (2024-03-11T02:17:42Z)
Distribution-Aware Continual Test-Time Adaptation for Semantic Segmentation [33.75630514826721]
We propose a distribution-aware tuning ( DAT) method to make semantic segmentation CTTA efficient and practical in real-world applications. DAT adaptively selects and updates two small groups of trainable parameters based on data distribution during the continual adaptation process. We conduct experiments on two widely-used semantic segmentation CTTA benchmarks, achieving promising performance compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2023-09-24T10:48:20Z)
A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs) MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
Gradient Sparsification for Efficient Wireless Federated Learning with Differential Privacy [25.763777765222358]
Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other. As the model size grows, the training latency due to limited transmission bandwidth and private information degrades while using differential privacy (DP) protection. We propose sparsification empowered FL framework wireless channels, in over to improve training efficiency without sacrificing convergence performance.
arXiv Detail & Related papers (2023-04-09T05:21:15Z)
Neural Stochastic Dual Dynamic Programming [99.80617899593526]
We introduce a trainable neural model that learns to map problem instances to a piece-wise linear value function. $nu$-SDDP can significantly reduce problem solving cost without sacrificing solution quality.
arXiv Detail & Related papers (2021-12-01T22:55:23Z)
Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver. We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming. We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.