Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
with Time Windows
- URL: http://arxiv.org/abs/2402.09765v1
- Date: Thu, 15 Feb 2024 07:35:29 GMT
- Title: Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
with Time Windows
- Authors: Zangir Iklassov and Ikboljon Sobirov and Ruben Solozabal and Martin
Takac
- Abstract summary: This paper introduces a reinforcement learning approach to optimize the Stochastic Vehicle Routing Problem with Time Windows (SVRP)
We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows.
An attention-based neural network trained through reinforcement learning is employed to minimize routing costs.
- Score: 0.09831489366502298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a reinforcement learning approach to optimize the
Stochastic Vehicle Routing Problem with Time Windows (SVRP), focusing on
reducing travel costs in goods delivery. We develop a novel SVRP formulation
that accounts for uncertain travel costs and demands, alongside specific
customer time windows. An attention-based neural network trained through
reinforcement learning is employed to minimize routing costs. Our approach
addresses a gap in SVRP research, which traditionally relies on heuristic
methods, by leveraging machine learning. The model outperforms the Ant-Colony
Optimization algorithm, achieving a 1.73% reduction in travel costs. It
uniquely integrates external information, demonstrating robustness in diverse
environments, making it a valuable benchmark for future SVRP studies and
industry application.
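The abstract above describes an attention-based policy trained with reinforcement learning to minimize stochastic routing costs. As a rough illustration of that idea (not the authors' exact model), the sketch below trains a toy attention decoder over node embeddings with REINFORCE on random instances; the node-feature layout (coordinates, demand, time-window bounds), the multiplicative cost noise, the soft lateness penalty, and all network sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionRoutingPolicy(nn.Module):
    """Toy attention decoder: embed nodes, repeatedly attend to pick the next customer."""
    def __init__(self, node_dim=5, hidden=128):
        super().__init__()
        self.embed = nn.Linear(node_dim, hidden)   # per-node encoder (illustrative size)
        self.query = nn.Linear(hidden, hidden)     # decoding query built from the current node
        self.scale = hidden ** 0.5

    def forward(self, nodes):
        # nodes: (batch, n, node_dim); node 0 is treated as the depot
        h = self.embed(nodes)                      # (B, n, H)
        B, n, _ = h.shape
        visited = torch.zeros(B, n, dtype=torch.bool)
        visited[:, 0] = True
        current, tour, logp = h[:, 0], [], torch.zeros(B)
        for _ in range(n - 1):
            q = self.query(current)
            scores = torch.einsum("bh,bnh->bn", q, h) / self.scale
            probs = torch.softmax(scores.masked_fill(visited, float("-inf")), dim=-1)
            nxt = torch.multinomial(probs, 1).squeeze(-1)          # sample the next customer
            logp = logp + torch.log(probs.gather(1, nxt[:, None]).squeeze(-1))
            visited.scatter_(1, nxt[:, None], True)
            current = h.gather(1, nxt[:, None, None].expand(-1, 1, h.size(-1))).squeeze(1)
            tour.append(nxt)
        return torch.stack(tour, dim=1), logp      # visiting order and its log-probability

def routing_cost(nodes, tour):
    """Noisy travel cost plus a soft lateness penalty (illustrative stand-in for SVRP costs)."""
    coords, tw_late = nodes[..., :2], nodes[..., 4]   # assumed layout: x, y, demand, tw_open, tw_close
    order = torch.cat([torch.zeros_like(tour[:, :1]), tour], dim=1)        # prepend the depot
    pos = coords.gather(1, order[..., None].expand(-1, -1, 2))
    legs = (pos[:, 1:] - pos[:, :-1]).norm(dim=-1)
    legs = legs * (1.0 + 0.1 * torch.randn_like(legs)).clamp(min=0.5)      # stochastic travel cost
    lateness = (legs.cumsum(dim=1) - tw_late.gather(1, tour)).clamp(min=0.0)
    return legs.sum(dim=1) + lateness.sum(dim=1)

# REINFORCE with an exponential-moving-average baseline on random toy instances.
policy = AttentionRoutingPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
baseline = None
for step in range(200):
    nodes = torch.rand(64, 11, 5)                  # depot + 10 customers per instance
    tour, logp = policy(nodes)
    cost = routing_cost(nodes, tour)
    baseline = cost.mean() if baseline is None else 0.9 * baseline + 0.1 * cost.mean()
    loss = ((cost - baseline.detach()) * logp).mean()   # policy-gradient surrogate loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The paper's formulation additionally conditions on external information and handles uncertain demands; the sketch only captures the attention-plus-policy-gradient core.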
Related papers
- Reinforcement Learning for Solving Stochastic Vehicle Routing Problem [0.09831489366502298]
This study addresses a gap in the utilization of Reinforcement Learning (RL) and Machine Learning (ML) techniques in solving the Stochastic Vehicle Routing Problem (SVRP)
We propose a novel end-to-end framework that comprehensively addresses the key sources of uncertainty in SVRP and utilizes an RL agent with a simple yet effective architecture and a tailored training method.
Our proposed model demonstrates superior performance compared to a widely adopted state-of-the-art metaheuristic, achieving a significant 3.43% reduction in travel costs.
arXiv Detail & Related papers (2023-11-13T19:46:22Z) - Optimizing Inventory Routing: A Decision-Focused Learning Approach using
Neural Networks [0.0]
We formulate and propose a decision-focused learning-based approach to solving real-world inventory routing problems (IRPs).
This approach directly integrates inventory prediction and routing optimization within an end-to-end system, potentially ensuring a robust supply chain strategy.
arXiv Detail & Related papers (2023-11-02T04:05:28Z) - TranDRL: A Transformer-Driven Deep Reinforcement Learning Enabled Prescriptive Maintenance Framework [58.474610046294856]
Industrial systems demand reliable predictive maintenance strategies to enhance operational efficiency and reduce downtime.
This paper introduces an integrated framework that leverages the capabilities of Transformer-based neural networks and deep reinforcement learning (DRL) algorithms to optimize system maintenance actions.
arXiv Detail & Related papers (2023-09-29T02:27:54Z) - Roulette-Wheel Selection-Based PSO Algorithm for Solving the Vehicle
Routing Problem with Time Windows [58.891409372784516]
This paper presents a novel form of the PSO methodology that uses the Roulette Wheel Method (RWPSO)
Experiments on the Solomon VRPTW benchmark datasets demonstrate that RWPSO is competitive with other state-of-the-art algorithms from the literature.
arXiv Detail & Related papers (2023-06-04T09:18:02Z) - TransPath: Learning Heuristics For Grid-Based Pathfinding via
Transformers [64.88759709443819]
We suggest learning instance-dependent proxies intended to notably increase the efficiency of the search.
The first proxy we suggest to learn is the correction factor, i.e. the ratio between the instance-independent cost-to-go estimate and the perfect one.
The second proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path.
arXiv Detail & Related papers (2022-12-22T14:26:11Z) - A deep learning Attention model to solve the Vehicle Routing Problem and
the Pick-up and Delivery Problem with Time Windows [0.0]
SNCF, the French public railway company, is experimenting with new types of transportation services by tackling vehicle routing problems.
We use an Attention-Decoder structure and design a novel insertion heuristic for the feasibility check of the CPDPTW.
Our model yields results that are better than the best known learning-based solutions on the CVRPTW.
arXiv Detail & Related papers (2022-12-20T16:25:55Z) - Actively Learning Costly Reward Functions for Reinforcement Learning [56.34005280792013]
We show that it is possible to train agents in complex real-world environments orders of magnitude faster.
By enabling the application of reinforcement learning methods to new domains, we show that we can find interesting and non-trivial solutions.
arXiv Detail & Related papers (2022-11-23T19:17:20Z) - Learning to Solve Soft-Constrained Vehicle Routing Problems with
Lagrangian Relaxation [0.4014524824655105]
Vehicle Routing Problems (VRPs) in real-world applications often come with various constraints.
We propose a Reinforcement Learning based method to solve soft-constrained VRPs.
We apply the method to three common types of VRPs: the Travelling Salesman Problem with Time Windows (TSPTW), the Capacitated VRP (CVRP), and the Capacitated VRP with Time Windows (CVRPTW)
arXiv Detail & Related papers (2022-07-20T12:51:06Z) - Transferable Deep Reinforcement Learning Framework for Autonomous
Vehicles with Joint Radar-Data Communications [69.24726496448713]
We propose an intelligent optimization framework based on the Markov Decision Process (MDP) to help the AV make optimal decisions.
We then develop an effective learning algorithm leveraging recent advances of deep reinforcement learning techniques to find the optimal policy for the AV.
We show that the proposed transferable deep reinforcement learning framework reduces the obstacle miss-detection probability of the AV by up to 67% compared to other conventional deep reinforcement learning approaches.
arXiv Detail & Related papers (2021-05-28T08:45:37Z) - Chance-Constrained Trajectory Optimization for Safe Exploration and
Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z) - Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent
Reinforcement Learning Approach [9.717648122961483]
Multi-vehicle routing problem with soft time windows (MVRPSTW) is an indispensable constituent in urban logistics systems.
Traditional methods incur the dilemma between computational efficiency and solution quality.
We propose a novel reinforcement learning algorithm called the Multi-Agent Attention Model that can solve the routing problem instantly, benefiting from lengthy offline training.
arXiv Detail & Related papers (2020-02-13T14:26:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.