Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
- URL: http://arxiv.org/abs/2311.07708v1
- Date: Mon, 13 Nov 2023 19:46:22 GMT
- Title: Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
- Authors: Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac
- Abstract summary: This study addresses a gap in the utilization of Reinforcement Learning (RL) and Machine Learning (ML) techniques in solving the Vehicle Routing Problem (SVRP)
We propose a novel end-to-end framework that comprehensively addresses the key sources of SVRP and utilizes an RL agent with a simple yet effective architecture and a tailored training method.
Our proposed model demonstrates superior performance compared to a widely adopted state-of-the-art meeuristic, achieving a significant 3.43% reduction in travel costs.
- Score: 0.09831489366502298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study addresses a gap in the utilization of Reinforcement Learning (RL)
and Machine Learning (ML) techniques in solving the Stochastic Vehicle Routing
Problem (SVRP) that involves the challenging task of optimizing vehicle routes
under uncertain conditions. We propose a novel end-to-end framework that
comprehensively addresses the key sources of stochasticity in SVRP and utilizes
an RL agent with a simple yet effective architecture and a tailored training
method. Through comparative analysis, our proposed model demonstrates superior
performance compared to a widely adopted state-of-the-art metaheuristic,
achieving a significant 3.43% reduction in travel costs. Furthermore, the model
exhibits robustness across diverse SVRP settings, highlighting its adaptability
and ability to learn optimal routing strategies in varying environments. The
publicly available implementation of our framework serves as a valuable
resource for future research endeavors aimed at advancing RL-based solutions
for SVRP.
Related papers
- Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
with Time Windows [0.09831489366502298]
This paper introduces a reinforcement learning approach to optimize the Vehicle Routing Problem with Time Windows (SVRP)
We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows.
An attention-based neural network trained through reinforcement learning is employed to minimize routing costs.
arXiv Detail & Related papers (2024-02-15T07:35:29Z) - Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
arXiv Detail & Related papers (2023-05-29T15:00:09Z) - Learning to Solve Soft-Constrained Vehicle Routing Problems with
Lagrangian Relaxation [0.4014524824655105]
Vehicle Routing Problems (VRPs) in real-world applications often come with various constraints.
We propose a Reinforcement Learning based method to solve soft-constrained VRPs.
We apply the method on three common types of VRPs, the Travelling Salesman Problem with Time Windows (TSPTW), the Capacitated VRP (CVRP) and the Capacitated VRP with Time Windows (CVRPTW)
arXiv Detail & Related papers (2022-07-20T12:51:06Z) - Sample-Efficient, Exploration-Based Policy Optimisation for Routing
Problems [2.6782615615913348]
This paper presents a new reinforcement learning approach that is based on entropy.
In addition, we design an off-policy-based reinforcement learning technique that maximises the expected return.
We show that our model can generalise to various route problems.
arXiv Detail & Related papers (2022-05-31T09:51:48Z) - Pessimistic Model Selection for Offline Deep Reinforcement Learning [56.282483586473816]
Deep Reinforcement Learning (DRL) has demonstrated great potentials in solving sequential decision making problems in many applications.
One main barrier is the over-fitting issue that leads to poor generalizability of the policy learned by DRL.
We propose a pessimistic model selection (PMS) approach for offline DRL with a theoretical guarantee.
arXiv Detail & Related papers (2021-11-29T06:29:49Z) - An Overview and Experimental Study of Learning-based Optimization
Algorithms for Vehicle Routing Problem [49.04543375851723]
Vehicle routing problem (VRP) is a typical discrete optimization problem.
Many studies consider learning-based optimization algorithms to solve VRP.
This paper reviews recent advances in this field and divides relevant approaches into end-to-end approaches and step-by-step approaches.
arXiv Detail & Related papers (2021-07-15T02:13:03Z) - Transferable Deep Reinforcement Learning Framework for Autonomous
Vehicles with Joint Radar-Data Communications [69.24726496448713]
We propose an intelligent optimization framework based on the Markov Decision Process (MDP) to help the AV make optimal decisions.
We then develop an effective learning algorithm leveraging recent advances of deep reinforcement learning techniques to find the optimal policy for the AV.
We show that the proposed transferable deep reinforcement learning framework reduces the obstacle miss detection probability by the AV up to 67% compared to other conventional deep reinforcement learning approaches.
arXiv Detail & Related papers (2021-05-28T08:45:37Z) - Combining Pessimism with Optimism for Robust and Efficient Model-Based
Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z) - Learning Vehicle Routing Problems using Policy Optimisation [4.093722933440819]
State-of-the-art approaches learn a policy using reinforcement learning, and the learnt policy acts as a pseudo solver.
These approaches have demonstrated good performance in some cases, but given the large search space typical of routing problem, they can converge too quickly to poor policy.
We propose entropy regularised reinforcement learning (ERRL) that supports exploration by providing more policies.
arXiv Detail & Related papers (2020-12-24T14:18:56Z) - An Online Method for A Class of Distributionally Robust Optimization
with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z) - Reinforcement Learning to Optimize the Logistics Distribution Routes of
Unmanned Aerial Vehicle [0.0]
This paper proposes an improved method to achieve path planning for UAVs in complex surroundings: multiple no-fly zones.
The results show the feasibility and efficiency of the model applying in this kind of complicated situation.
arXiv Detail & Related papers (2020-04-21T09:42:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.