Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
with Time Windows
- URL: http://arxiv.org/abs/2402.09765v1
- Date: Thu, 15 Feb 2024 07:35:29 GMT
- Title: Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
with Time Windows
- Authors: Zangir Iklassov and Ikboljon Sobirov and Ruben Solozabal and Martin
Takac
- Abstract summary: This paper introduces a reinforcement learning approach to optimize the Stochastic Vehicle Routing Problem with Time Windows (SVRP)
We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows.
An attention-based neural network trained through reinforcement learning is employed to minimize routing costs.
- Score: 0.09831489366502298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a reinforcement learning approach to optimize the
Stochastic Vehicle Routing Problem with Time Windows (SVRP), focusing on
reducing travel costs in goods delivery. We develop a novel SVRP formulation
that accounts for uncertain travel costs and demands, alongside specific
customer time windows. An attention-based neural network trained through
reinforcement learning is employed to minimize routing costs. Our approach
addresses a gap in SVRP research, which traditionally relies on heuristic
methods, by leveraging machine learning. The model outperforms the Ant-Colony
Optimization algorithm, achieving a 1.73% reduction in travel costs. It
uniquely integrates external information, demonstrating robustness in diverse
environments, making it a valuable benchmark for future SVRP studies and
industry application.
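The abstract above describes an attention-based policy trained with reinforcement learning to minimize stochastic routing costs. As a rough illustration of that idea (not the authors' exact model), the sketch below trains a toy attention decoder over node embeddings with REINFORCE on random instances; the node-feature layout (coordinates, demand, time-window bounds), the multiplicative cost noise, the soft lateness penalty, and all network sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionRoutingPolicy(nn.Module):
    """Toy attention decoder: embed nodes, repeatedly attend to pick the next customer."""
    def __init__(self, node_dim=5, hidden=128):
        super().__init__()
        self.embed = nn.Linear(node_dim, hidden)   # per-node encoder (illustrative size)
        self.query = nn.Linear(hidden, hidden)     # decoding query built from the current node
        self.scale = hidden ** 0.5

    def forward(self, nodes):
        # nodes: (batch, n, node_dim); node 0 is treated as the depot
        h = self.embed(nodes)                      # (B, n, H)
        B, n, _ = h.shape
        visited = torch.zeros(B, n, dtype=torch.bool)
        visited[:, 0] = True
        current, tour, logp = h[:, 0], [], torch.zeros(B)
        for _ in range(n - 1):
            q = self.query(current)
            scores = torch.einsum("bh,bnh->bn", q, h) / self.scale
            probs = torch.softmax(scores.masked_fill(visited, float("-inf")), dim=-1)
            nxt = torch.multinomial(probs, 1).squeeze(-1)          # sample the next customer
            logp = logp + torch.log(probs.gather(1, nxt[:, None]).squeeze(-1))
            visited.scatter_(1, nxt[:, None], True)
            current = h.gather(1, nxt[:, None, None].expand(-1, 1, h.size(-1))).squeeze(1)
            tour.append(nxt)
        return torch.stack(tour, dim=1), logp      # visiting order and its log-probability

def routing_cost(nodes, tour):
    """Noisy travel cost plus a soft lateness penalty (illustrative stand-in for SVRP costs)."""
    coords, tw_late = nodes[..., :2], nodes[..., 4]   # assumed layout: x, y, demand, tw_open, tw_close
    order = torch.cat([torch.zeros_like(tour[:, :1]), tour], dim=1)        # prepend the depot
    pos = coords.gather(1, order[..., None].expand(-1, -1, 2))
    legs = (pos[:, 1:] - pos[:, :-1]).norm(dim=-1)
    legs = legs * (1.0 + 0.1 * torch.randn_like(legs)).clamp(min=0.5)      # stochastic travel cost
    lateness = (legs.cumsum(dim=1) - tw_late.gather(1, tour)).clamp(min=0.0)
    return legs.sum(dim=1) + lateness.sum(dim=1)

# REINFORCE with an exponential-moving-average baseline on random toy instances.
policy = AttentionRoutingPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
baseline = None
for step in range(200):
    nodes = torch.rand(64, 11, 5)                  # depot + 10 customers per instance
    tour, logp = policy(nodes)
    cost = routing_cost(nodes, tour)
    baseline = cost.mean() if baseline is None else 0.9 * baseline + 0.1 * cost.mean()
    loss = ((cost - baseline.detach()) * logp).mean()   # policy-gradient surrogate loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The paper's formulation additionally conditions on external information and handles uncertain demands; the sketch only captures the attention-plus-policy-gradient core.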
Related papers
- Reinforcement Learning for Solving Stochastic Vehicle Routing Problem [0.09831489366502298]
This study addresses a gap in the utilization of Reinforcement Learning (RL) and Machine Learning (ML) techniques in solving the Stochastic Vehicle Routing Problem (SVRP)
We propose a novel end-to-end framework that comprehensively addresses the key sources of uncertainty in SVRP and utilizes an RL agent with a simple yet effective architecture and a tailored training method.
Our proposed model demonstrates superior performance compared to a widely adopted state-of-the-art metaheuristic, achieving a significant 3.43% reduction in travel costs.
arXiv Detail & Related papers (2023-11-13T19:46:22Z) - Optimizing Inventory Routing: A Decision-Focused Learning Approach using
Neural Networks [0.0]
We formulate and propose a decision-focused learning-based approach to solving real-world inventory routing problems (IRPs).
This approach directly integrates inventory prediction and routing optimization within an end-to-end system, potentially ensuring a robust supply chain strategy.
arXiv Detail & Related papers (2023-11-02T04:05:28Z) - TranDRL: A Transformer-Driven Deep Reinforcement Learning Enabled Prescriptive Maintenance Framework [58.474610046294856]
Industrial systems demand reliable predictive maintenance strategies to enhance operational efficiency and reduce downtime.
This paper introduces an integrated framework that leverages the capabilities of Transformer-based neural networks and deep reinforcement learning (DRL) algorithms to optimize system maintenance actions.
arXiv Detail & Related papers (2023-09-29T02:27:54Z) - Roulette-Wheel Selection-Based PSO Algorithm for Solving the Vehicle
Routing Problem with Time Windows [58.891409372784516]
This paper presents a novel form of the PSO methodology that uses the Roulette Wheel Method (RWPSO)
Experiments on the Solomon VRPTW benchmark datasets demonstrate that RWPSO is competitive with other state-of-the-art algorithms from the literature.
arXiv Detail & Related papers (2023-06-04T09:18:02Z) - TransPath: Learning Heuristics For Grid-Based Pathfinding via
Transformers [64.88759709443819]
We suggest learning instance-dependent proxies intended to notably increase the efficiency of the search.
The first proxy we suggest to learn is the correction factor, i.e. the ratio between the instance-independent cost-to-go estimate and the perfect one.
The second proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path.
arXiv Detail & Related papers (2022-12-22T14:26:11Z) - A deep learning Attention model to solve the Vehicle Routing Problem and
the Pick-up and Delivery Problem with Time Windows [0.0]
SNCF, the French public railway company, is experimenting with new types of transportation services by tackling vehicle routing problems.
We use an Attention-Decoder structure and design a novel insertion heuristic for the feasibility check of the CPDPTW.
Our model yields results that are better than the best known learning-based solutions on the CVRPTW.
arXiv Detail & Related papers (2022-12-20T16:25:55Z) - Actively Learning Costly Reward Functions for Reinforcement Learning [56.34005280792013]
We show that it is possible to train agents in complex real-world environments orders of magnitude faster.
By enabling the application of reinforcement learning methods to new domains, we show that we can find interesting and non-trivial solutions.
arXiv Detail & Related papers (2022-11-23T19:17:20Z) - Learning to Solve Soft-Constrained Vehicle Routing Problems with
Lagrangian Relaxation [0.4014524824655105]
Vehicle Routing Problems (VRPs) in real-world applications often come with various constraints.
We propose a Reinforcement Learning based method to solve soft-constrained VRPs.
We apply the method to three common types of VRPs: the Travelling Salesman Problem with Time Windows (TSPTW), the Capacitated VRP (CVRP), and the Capacitated VRP with Time Windows (CVRPTW)
arXiv Detail & Related papers (2022-07-20T12:51:06Z) - Transferable Deep Reinforcement Learning Framework for Autonomous
Vehicles with Joint Radar-Data Communications [69.24726496448713]
We propose an intelligent optimization framework based on the Markov Decision Process (MDP) to help the AV make optimal decisions.
We then develop an effective learning algorithm leveraging recent advances of deep reinforcement learning techniques to find the optimal policy for the AV.
We show that the proposed transferable deep reinforcement learning framework reduces the obstacle miss-detection probability of the AV by up to 67% compared to other conventional deep reinforcement learning approaches.
arXiv Detail & Related papers (2021-05-28T08:45:37Z) - Chance-Constrained Trajectory Optimization for Safe Exploration and
Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z) - Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent
Reinforcement Learning Approach [9.717648122961483]
Multi-vehicle routing problem with soft time windows (MVRPSTW) is an indispensable constituent in urban logistics systems.
Traditional methods incur the dilemma between computational efficiency and solution quality.
We propose a novel reinforcement learning algorithm called the Multi-Agent Attention Model that can solve the routing problem instantly, benefiting from lengthy offline training.
arXiv Detail & Related papers (2020-02-13T14:26:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.