Learning to Solve Soft-Constrained Vehicle Routing Problems with
Lagrangian Relaxation
- URL: http://arxiv.org/abs/2207.09860v2
- Date: Thu, 21 Jul 2022 13:11:05 GMT
- Title: Learning to Solve Soft-Constrained Vehicle Routing Problems with
Lagrangian Relaxation
- Authors: Qiaoyue Tang, Yangzhe Kong, Lemeng Pan, Choonmeng Lee
- Abstract summary: Vehicle Routing Problems (VRPs) in real-world applications often come with various constraints.
We propose a Reinforcement Learning based method to solve soft-constrained VRPs.
We apply the method to three common types of VRP: the Travelling Salesman Problem with Time Windows (TSPTW), the Capacitated VRP (CVRP), and the Capacitated VRP with Time Windows (CVRPTW).
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Vehicle Routing Problems (VRPs) in real-world applications often come with
various constraints, which bring additional computational challenges to
exact solution methods and heuristic search approaches. The recent idea of learning
heuristic move patterns from sample data has become increasingly promising for
reducing solution development costs. However, using learning-based approaches to
address more types of constrained VRPs remains a challenge. The difficulty lies
in controlling constraint violations while searching for optimal solutions.
To overcome this challenge, we propose a Reinforcement Learning based method to
solve soft-constrained VRPs by incorporating the Lagrangian relaxation
technique and using constrained policy optimization. We apply the method to
three common types of VRP, the Travelling Salesman Problem with Time Windows
(TSPTW), the Capacitated VRP (CVRP), and the Capacitated VRP with Time Windows
(CVRPTW), to show the generalizability of the proposed method. Comparisons with
existing RL-based methods and open-source heuristic solvers demonstrate its
competitive performance in finding solutions with a good balance of travel
distance, constraint violations, and inference speed.
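The core mechanism — relaxing soft constraints into a reward penalty whose weight is adapted by dual ascent on a Lagrange multiplier — can be sketched independently of any particular VRP. The following Python fragment is a minimal illustration, not the paper's algorithm; the update rule, learning rate, and numbers are illustrative assumptions:

```python
def lagrangian_step(reward, violation, lam, lr=0.1):
    """One training step of Lagrangian relaxation:
    the policy would be updated with the penalized reward r - lam * violation,
    while the multiplier follows projected dual ascent (lam stays >= 0)."""
    penalized_reward = reward - lam * violation
    new_lam = max(0.0, lam + lr * violation)
    return penalized_reward, new_lam

# Toy trajectory: a constant time-window violation of 2.0 units makes the
# multiplier, and hence the penalty, grow step by step, pushing later
# policy updates toward feasibility.
lam = 0.0
history = []
for _ in range(3):
    penalized, lam = lagrangian_step(reward=-10.0, violation=2.0, lam=lam)
    history.append((round(penalized, 2), round(lam, 2)))
# history == [(-10.0, 0.2), (-10.4, 0.4), (-10.8, 0.6)]
```

Once the policy becomes feasible (`violation == 0`), the multiplier stops growing, so the penalty does not dominate the travel-distance objective indefinitely.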
Related papers
- Learn to Solve Vehicle Routing Problems ASAP: A Neural Optimization Approach for Time-Constrained Vehicle Routing Problems with Finite Vehicle Fleet [0.0]
We propose an NCO approach to solve a time-constrained capacitated VRP with a finite vehicle fleet size.
The method is able to find adequate and cost-efficient solutions, showing both flexibility and robust generalization.
arXiv Detail & Related papers (2024-11-07T15:16:36Z) - Looking Ahead to Avoid Being Late: Solving Hard-Constrained Traveling
Salesman Problem [36.88003015541172]
We propose a novel learning-based method that uses looking-ahead information as the feature to improve the legality of TSP with Time Windows (TSPTW) solutions.
With comprehensive experiments on diverse datasets, MUSLA outperforms existing baselines and shows generalizability potential.
arXiv Detail & Related papers (2024-03-08T13:49:21Z) - Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
with Time Windows [0.09831489366502298]
This paper introduces a reinforcement learning approach to optimize the Stochastic Vehicle Routing Problem with Time Windows (SVRP).
We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows.
An attention-based neural network trained through reinforcement learning is employed to minimize routing costs.
arXiv Detail & Related papers (2024-02-15T07:35:29Z) - Reinforcement Learning for Solving Stochastic Vehicle Routing Problem [0.09831489366502298]
This study addresses a gap in the utilization of Reinforcement Learning (RL) and Machine Learning (ML) techniques in solving the Stochastic Vehicle Routing Problem (SVRP).
We propose a novel end-to-end framework that comprehensively addresses the key sources of SVRP and utilizes an RL agent with a simple yet effective architecture and a tailored training method.
Our proposed model demonstrates superior performance compared to a widely adopted state-of-the-art metaheuristic, achieving a significant 3.43% reduction in travel costs.
arXiv Detail & Related papers (2023-11-13T19:46:22Z) - Roulette-Wheel Selection-Based PSO Algorithm for Solving the Vehicle
Routing Problem with Time Windows [58.891409372784516]
This paper presents a novel form of the Particle Swarm Optimization (PSO) methodology that uses the Roulette Wheel Method (RWPSO).
Experiments on the Solomon VRPTW benchmark datasets demonstrate that RWPSO is competitive with other state-of-the-art algorithms from the literature.
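Roulette-wheel selection itself is a standard fitness-proportionate sampling rule. A minimal, self-contained Python sketch — not the paper's implementation, and assuming strictly positive fitness values — might look like:

```python
import random

def roulette_wheel_select(fitness, rng=random):
    """Return an index sampled with probability proportional to its fitness.
    Assumes all fitness values are strictly positive."""
    pick = rng.uniform(0.0, sum(fitness))
    cumulative = 0.0
    for i, f in enumerate(fitness):
        cumulative += f
        if pick <= cumulative:
            return i
    return len(fitness) - 1  # guard against floating-point round-off

# Example: with fitness weights 1 : 3 : 6, index 2 should win about
# 60% of the draws.
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(10_000):
    counts[roulette_wheel_select([1.0, 3.0, 6.0], rng)] += 1
```

In a PSO variant, the fitness values would come from the particles being selected for recombination or guidance; here they are just hypothetical weights.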
arXiv Detail & Related papers (2023-06-04T09:18:02Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Sample-Efficient, Exploration-Based Policy Optimisation for Routing
Problems [2.6782615615913348]
This paper presents a new reinforcement learning approach based on entropy-driven exploration.
In addition, we design an off-policy reinforcement learning technique that maximises the expected return.
We show that our model can generalise to various routing problems.
arXiv Detail & Related papers (2022-05-31T09:51:48Z) - An Overview and Experimental Study of Learning-based Optimization
Algorithms for Vehicle Routing Problem [49.04543375851723]
Vehicle routing problem (VRP) is a typical discrete optimization problem.
Many studies consider learning-based optimization algorithms to solve VRP.
This paper reviews recent advances in this field and divides relevant approaches into end-to-end approaches and step-by-step approaches.
arXiv Detail & Related papers (2021-07-15T02:13:03Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (DPDP) to combine the strengths of learned neural heuristics with those of dynamic programming algorithms.
DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
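DPDP's actual method uses a graph neural network's edge scores to drive a beam search over DP states; the underlying idea of pruning a DP by an edge restriction can still be illustrated with a tiny Held-Karp TSP solver whose transitions are limited to an `allowed` edge set, standing in (hypothetically) for the policy's top-scored edges:

```python
def restricted_tsp_dp(dist, allowed):
    """Held-Karp dynamic program for the TSP (start and end at node 0),
    with transitions restricted to the edge set `allowed`. Returns the
    best tour cost reachable under that restriction, or None if no
    complete tour survives the pruning."""
    n = len(dist)
    dp = {(1, 0): 0.0}  # (visited-mask, last-node) -> best cost so far
    for mask in range(1, 1 << n):
        for last in range(n):
            cost = dp.get((mask, last))
            if cost is None:
                continue
            for nxt in range(n):
                # Skip already-visited nodes and pruned edges.
                if mask & (1 << nxt) or (last, nxt) not in allowed:
                    continue
                key = (mask | (1 << nxt), nxt)
                new_cost = cost + dist[last][nxt]
                if new_cost < dp.get(key, float("inf")):
                    dp[key] = new_cost
    full = (1 << n) - 1
    closing = [dp[(full, last)] + dist[last][0]
               for last in range(n)
               if (full, last) in dp and (last, 0) in allowed]
    return min(closing) if closing else None
```

Here `allowed` is a plain set of node pairs, which is enough to see how restricting transitions shrinks the DP state space; a smaller allowed set means fewer `(mask, last)` states are ever created.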
arXiv Detail & Related papers (2021-02-23T15:33:57Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.