Solving the vehicle routing problem with deep reinforcement learning
- URL: http://arxiv.org/abs/2208.00202v1
- Date: Sat, 30 Jul 2022 12:34:26 GMT
- Title: Solving the vehicle routing problem with deep reinforcement learning
- Authors: Simone Foa and Corrado Coppola and Giorgio Grani and Laura Palagi
- Abstract summary: This paper focuses on the application of RL for the Vehicle Routing Problem (VRP), a famous problem that belongs to the class of NP-Hard problems.
In a second phase, the neural architecture behind the Actor and Critic has been established, choosing to adopt a neural architecture based on the Convolutional neural networks.
Experiments performed on a wide range of instances show that the algorithm has good generalization capabilities and can reach good solutions in a short time.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, the applications of the methodologies of Reinforcement Learning
(RL) to NP-Hard Combinatorial optimization problems have become a popular
topic. This is essentially due to the nature of the traditional combinatorial
algorithms, often based on a trial-and-error process. RL aims at automating
this process. At this regard, this paper focuses on the application of RL for
the Vehicle Routing Problem (VRP), a famous combinatorial problem that belongs
to the class of NP-Hard problems. In this work, first, the problem is modeled
as a Markov Decision Process (MDP) and then the PPO method (which belongs to
the Actor-Critic class of Reinforcement learning methods) is applied. In a
second phase, the neural architecture behind the Actor and Critic has been
established, choosing to adopt a neural architecture based on the Convolutional
neural networks, both for the Actor and the Critic. This choice resulted in
effectively addressing problems of different sizes. Experiments performed on a
wide range of instances show that the algorithm has good generalization
capabilities and can reach good solutions in a short time. Comparisons between
the algorithm proposed and the state-of-the-art solver OR-TOOLS show that the
latter still outperforms the Reinforcement learning algorithm. However, there
are future research perspectives, that aim to upgrade the current performance
of the algorithm proposed.
Related papers
- An Efficient Learning-based Solver Comparable to Metaheuristics for the
Capacitated Arc Routing Problem [67.92544792239086]
We introduce an NN-based solver to significantly narrow the gap with advanced metaheuristics.
First, we propose direction-aware facilitating attention model (DaAM) to incorporate directionality into the embedding process.
Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy.
arXiv Detail & Related papers (2024-03-11T02:17:42Z) - Neural Algorithmic Reasoning Without Intermediate Supervision [21.852775399735005]
We focus on learning neural algorithmic reasoning only from the input-output pairs without appealing to the intermediate supervision.
We build a self-supervised objective that can regularise intermediate computations of the model without access to the algorithm trajectory.
We demonstrate that our approach is competitive to its trajectory-supervised counterpart on tasks from the CLRSic Algorithmic Reasoning Benchmark.
arXiv Detail & Related papers (2023-06-23T09:57:44Z) - Neural Algorithmic Reasoning for Combinatorial Optimisation [20.36694807847833]
We propose leveraging recent advancements in neural reasoning to improve the learning of CO problems.
We suggest pre-training our neural model on relevant algorithms before training it on CO instances.
Our results demonstrate that by using this learning setup, we achieve superior performance compared to non-algorithmically informed deep learning models.
arXiv Detail & Related papers (2023-05-18T13:59:02Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Reinforcement Learning to Solve NP-hard Problems: an Application to the
CVRP [0.0]
We evaluate the use of Reinforcement Learning (RL) to solve a classic optimization problem.
We compare two of the most promising RL approaches with traditional solving techniques on a set of benchmark instances.
We find that despite not returning the best solution, the RL approach has many advantages over traditional solvers.
arXiv Detail & Related papers (2022-01-14T11:16:17Z) - An Overview and Experimental Study of Learning-based Optimization
Algorithms for Vehicle Routing Problem [49.04543375851723]
Vehicle routing problem (VRP) is a typical discrete optimization problem.
Many studies consider learning-based optimization algorithms to solve VRP.
This paper reviews recent advances in this field and divides relevant approaches into end-to-end approaches and step-by-step approaches.
arXiv Detail & Related papers (2021-07-15T02:13:03Z) - A Two-stage Framework and Reinforcement Learning-based Optimization
Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework, in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined together.
The scheduling problem is solved in two stages, including a finite Markov decision process (MDP) and a mixed-integer programming process, respectively.
Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
arXiv Detail & Related papers (2021-03-10T03:16:12Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (D PDP) to combine the strengths of learned neurals with those of dynamic programming algorithms.
D PDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
arXiv Detail & Related papers (2021-02-23T15:33:57Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - Geometric Deep Reinforcement Learning for Dynamic DAG Scheduling [8.14784681248878]
In this paper, we propose a reinforcement learning approach to solve a realistic scheduling problem.
We apply it to an algorithm commonly executed in the high performance computing community, the Cholesky factorization.
Our algorithm uses graph neural networks in combination with an actor-critic algorithm (A2C) to build an adaptive representation of the problem on the fly.
arXiv Detail & Related papers (2020-11-09T10:57:21Z) - Reinforcement Learning for Combinatorial Optimization: A Survey [12.323976053967066]
Many traditional algorithms for solving optimization problems involve using hand-crafteds that sequentially construct a solution.
Reinforcement learning (RL) proposes a good alternative to automate the search of theses by training an agent in a supervised or self-supervised manner.
arXiv Detail & Related papers (2020-03-07T16:19:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.