Learning to Solve Soft-Constrained Vehicle Routing Problems with
Lagrangian Relaxation
- URL: http://arxiv.org/abs/2207.09860v2
- Date: Thu, 21 Jul 2022 13:11:05 GMT
- Title: Learning to Solve Soft-Constrained Vehicle Routing Problems with
Lagrangian Relaxation
- Authors: Qiaoyue Tang, Yangzhe Kong, Lemeng Pan, Choonmeng Lee
- Abstract summary: Vehicle Routing Problems (VRPs) in real-world applications often come with various constraints.
We propose a Reinforcement Learning based method to solve soft-constrained VRPs.
We apply the method to three common types of VRP: the Travelling Salesman Problem with Time Windows (TSPTW), the Capacitated VRP (CVRP), and the Capacitated VRP with Time Windows (CVRPTW).
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Vehicle Routing Problems (VRPs) in real-world applications often come with
various constraints, which bring additional computational challenges to
exact solution methods and heuristic search approaches. The recent idea of learning
heuristic move patterns from sample data has become increasingly promising for
reducing solution development costs. However, using learning-based approaches to
address more types of constrained VRPs remains a challenge. The difficulty lies
in controlling constraint violations while searching for optimal solutions.
To overcome this challenge, we propose a Reinforcement Learning based method to
solve soft-constrained VRPs by incorporating the Lagrangian relaxation
technique and using constrained policy optimization. We apply the method to
three common types of VRP, the Travelling Salesman Problem with Time Windows
(TSPTW), the Capacitated VRP (CVRP), and the Capacitated VRP with Time Windows
(CVRPTW), to show the generalizability of the proposed method. Comparisons with
existing RL-based methods and open-source heuristic solvers demonstrate its
competitive performance in finding solutions with a good balance of travel
distance, constraint violations, and inference speed.
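The core mechanism — relaxing soft constraints into a reward penalty whose weight is adapted by dual ascent on a Lagrange multiplier — can be sketched independently of any particular VRP. The following Python fragment is a minimal illustration, not the paper's algorithm; the update rule, learning rate, and numbers are illustrative assumptions:

```python
def lagrangian_step(reward, violation, lam, lr=0.1):
    """One training step of Lagrangian relaxation:
    the policy would be updated with the penalized reward r - lam * violation,
    while the multiplier follows projected dual ascent (lam stays >= 0)."""
    penalized_reward = reward - lam * violation
    new_lam = max(0.0, lam + lr * violation)
    return penalized_reward, new_lam

# Toy trajectory: a constant time-window violation of 2.0 units makes the
# multiplier, and hence the penalty, grow step by step, pushing later
# policy updates toward feasibility.
lam = 0.0
history = []
for _ in range(3):
    penalized, lam = lagrangian_step(reward=-10.0, violation=2.0, lam=lam)
    history.append((round(penalized, 2), round(lam, 2)))
# history == [(-10.0, 0.2), (-10.4, 0.4), (-10.8, 0.6)]
```

Once the policy becomes feasible (`violation == 0`), the multiplier stops growing, so the penalty does not dominate the travel-distance objective indefinitely.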
Related papers
- Learn to Solve Vehicle Routing Problems ASAP: A Neural Optimization Approach for Time-Constrained Vehicle Routing Problems with Finite Vehicle Fleet [0.0]
We propose an NCO approach to solve a time-constrained capacitated VRP with a finite vehicle fleet size.
The method is able to find adequate and cost-efficient solutions, showing both flexibility and robust generalization.
arXiv Detail & Related papers (2024-11-07T15:16:36Z) - Looking Ahead to Avoid Being Late: Solving Hard-Constrained Traveling
Salesman Problem [36.88003015541172]
We propose a novel learning-based method that uses looking-ahead information as the feature to improve the legality of TSP with Time Windows (TSPTW) solutions.
With comprehensive experiments on diverse datasets, MUSLA outperforms existing baselines and shows generalizability potential.
arXiv Detail & Related papers (2024-03-08T13:49:21Z) - Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
with Time Windows [0.09831489366502298]
This paper introduces a reinforcement learning approach to optimize the Stochastic Vehicle Routing Problem with Time Windows (SVRP).
We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows.
An attention-based neural network trained through reinforcement learning is employed to minimize routing costs.
arXiv Detail & Related papers (2024-02-15T07:35:29Z) - Reinforcement Learning for Solving Stochastic Vehicle Routing Problem [0.09831489366502298]
This study addresses a gap in the utilization of Reinforcement Learning (RL) and Machine Learning (ML) techniques in solving the Stochastic Vehicle Routing Problem (SVRP).
We propose a novel end-to-end framework that comprehensively addresses the key sources of SVRP and utilizes an RL agent with a simple yet effective architecture and a tailored training method.
Our proposed model demonstrates superior performance compared to a widely adopted state-of-the-art metaheuristic, achieving a significant 3.43% reduction in travel costs.
arXiv Detail & Related papers (2023-11-13T19:46:22Z) - Roulette-Wheel Selection-Based PSO Algorithm for Solving the Vehicle
Routing Problem with Time Windows [58.891409372784516]
This paper presents a novel form of the Particle Swarm Optimization (PSO) methodology that uses the Roulette Wheel Method (RWPSO).
Experiments on the Solomon VRPTW benchmark datasets demonstrate that RWPSO is competitive with other state-of-the-art algorithms from the literature.
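Roulette-wheel selection itself is a standard fitness-proportionate sampling rule. A minimal, self-contained Python sketch — not the paper's implementation, and assuming strictly positive fitness values — might look like:

```python
import random

def roulette_wheel_select(fitness, rng=random):
    """Return an index sampled with probability proportional to its fitness.
    Assumes all fitness values are strictly positive."""
    pick = rng.uniform(0.0, sum(fitness))
    cumulative = 0.0
    for i, f in enumerate(fitness):
        cumulative += f
        if pick <= cumulative:
            return i
    return len(fitness) - 1  # guard against floating-point round-off

# Example: with fitness weights 1 : 3 : 6, index 2 should win about
# 60% of the draws.
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(10_000):
    counts[roulette_wheel_select([1.0, 3.0, 6.0], rng)] += 1
```

In a PSO variant, the fitness values would come from the particles being selected for recombination or guidance; here they are just hypothetical weights.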
arXiv Detail & Related papers (2023-06-04T09:18:02Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Sample-Efficient, Exploration-Based Policy Optimisation for Routing
Problems [2.6782615615913348]
This paper presents a new reinforcement learning approach based on entropy-driven exploration.
In addition, we design an off-policy reinforcement learning technique that maximises the expected return.
We show that our model can generalise to various routing problems.
arXiv Detail & Related papers (2022-05-31T09:51:48Z) - An Overview and Experimental Study of Learning-based Optimization
Algorithms for Vehicle Routing Problem [49.04543375851723]
Vehicle routing problem (VRP) is a typical discrete optimization problem.
Many studies consider learning-based optimization algorithms to solve VRP.
This paper reviews recent advances in this field and divides relevant approaches into end-to-end approaches and step-by-step approaches.
arXiv Detail & Related papers (2021-07-15T02:13:03Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (DPDP) to combine the strengths of learned neural heuristics with those of dynamic programming algorithms.
DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
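DPDP's actual method uses a graph neural network's edge scores to drive a beam search over DP states; the underlying idea of pruning a DP by an edge restriction can still be illustrated with a tiny Held-Karp TSP solver whose transitions are limited to an `allowed` edge set, standing in (hypothetically) for the policy's top-scored edges:

```python
def restricted_tsp_dp(dist, allowed):
    """Held-Karp dynamic program for the TSP (start and end at node 0),
    with transitions restricted to the edge set `allowed`. Returns the
    best tour cost reachable under that restriction, or None if no
    complete tour survives the pruning."""
    n = len(dist)
    dp = {(1, 0): 0.0}  # (visited-mask, last-node) -> best cost so far
    for mask in range(1, 1 << n):
        for last in range(n):
            cost = dp.get((mask, last))
            if cost is None:
                continue
            for nxt in range(n):
                # Skip already-visited nodes and pruned edges.
                if mask & (1 << nxt) or (last, nxt) not in allowed:
                    continue
                key = (mask | (1 << nxt), nxt)
                new_cost = cost + dist[last][nxt]
                if new_cost < dp.get(key, float("inf")):
                    dp[key] = new_cost
    full = (1 << n) - 1
    closing = [dp[(full, last)] + dist[last][0]
               for last in range(n)
               if (full, last) in dp and (last, 0) in allowed]
    return min(closing) if closing else None
```

Here `allowed` is a plain set of node pairs, which is enough to see how restricting transitions shrinks the DP state space; a smaller allowed set means fewer `(mask, last)` states are ever created.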
arXiv Detail & Related papers (2021-02-23T15:33:57Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.