Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent
Reinforcement Learning Approach
- URL: http://arxiv.org/abs/2002.05513v2
- Date: Tue, 27 Oct 2020 09:21:32 GMT
- Title: Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent
Reinforcement Learning Approach
- Authors: Ke Zhang, Meng Li, Zhengchao Zhang, Xi Lin, Fang He
- Abstract summary: Multi-vehicle routing problem with soft time windows (MVRPSTW) is an indispensable constituent in urban logistics systems.
Traditional methods incur the dilemma between computational efficiency and solution quality.
We propose a novel reinforcement learning algorithm called the Multi-Agent Attention Model that can solve routing problem instantly benefit from lengthy offline training.
- Score: 9.717648122961483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-vehicle routing problem with soft time windows (MVRPSTW) is an
indispensable constituent in urban logistics distribution systems. Over the
past decade, numerous methods for MVRPSTW have been proposed, but most are
based on heuristic rules that require a large amount of computation time. With
the current rapid increase of logistics demands, traditional methods incur the
dilemma between computational efficiency and solution quality. To efficiently
solve the problem, we propose a novel reinforcement learning algorithm called
the Multi-Agent Attention Model that can solve routing problem instantly
benefit from lengthy offline training. Specifically, the vehicle routing
problem is regarded as a vehicle tour generation process, and an
encoder-decoder framework with attention layers is proposed to generate tours
of multiple vehicles iteratively. Furthermore, a multi-agent reinforcement
learning method with an unsupervised auxiliary network is developed for the
model training. By evaluated on four synthetic networks with different scales,
the results demonstrate that the proposed method consistently outperforms
Google OR-Tools and traditional methods with little computation time. In
addition, we validate the robustness of the well-trained model by varying the
number of customers and the capacities of vehicles.
Related papers
- A Reinforcement Learning Approach for Dynamic Rebalancing in
Bike-Sharing System [11.237099288412558]
Bike-Sharing Systems provide eco-friendly urban mobility, contributing to the alleviation of traffic congestion and healthier lifestyles.
Devising effective rebalancing strategies using vehicles to redistribute bikes among stations is therefore of uttermost importance for operators.
This paper introduces atemporal reinforcement learning algorithm for the dynamic rebalancing problem with multiple vehicles.
arXiv Detail & Related papers (2024-02-05T23:46:42Z) - Roulette-Wheel Selection-Based PSO Algorithm for Solving the Vehicle
Routing Problem with Time Windows [58.891409372784516]
This paper presents a novel form of the PSO methodology that uses the Roulette Wheel Method (RWPSO)
Experiments using the Solomon VRPTW benchmark datasets on the RWPSO demonstrate that RWPSO is competitive with other state-of-the-art algorithms from the literature.
arXiv Detail & Related papers (2023-06-04T09:18:02Z) - Traj-MAE: Masked Autoencoders for Trajectory Prediction [69.7885837428344]
Trajectory prediction has been a crucial task in building a reliable autonomous driving system by anticipating possible dangers.
We propose an efficient masked autoencoder for trajectory prediction (Traj-MAE) that better represents the complicated behaviors of agents in the driving environment.
Our experimental results in both multi-agent and single-agent settings demonstrate that Traj-MAE achieves competitive results with state-of-the-art methods.
arXiv Detail & Related papers (2023-03-12T16:23:27Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - A deep learning Attention model to solve the Vehicle Routing Problem and
the Pick-up and Delivery Problem with Time Windows [0.0]
SNCF, the French public train company, is experimenting to develop new types of transportation services by tackling vehicle routing problems.
We use an Attention-Decoder structure and design a novel insertion for the feasibility check of the CPDPTW.
Our models yields results that are better than best known learning solutions on the CVRPTW.
arXiv Detail & Related papers (2022-12-20T16:25:55Z) - Solving the capacitated vehicle routing problem with timing windows
using rollouts and MAX-SAT [4.873362301533824]
Vehicle routing is a well known class of NP-hard optimisation problems.
Recent work in reinforcement learning has been a promising alternative approach.
This paper proposes a hybrid approach that combines reinforcement learning, policy rollouts, and a satisfiability.
arXiv Detail & Related papers (2022-06-14T06:27:09Z) - Scalable Vehicle Re-Identification via Self-Supervision [66.2562538902156]
Vehicle Re-Identification is one of the key elements in city-scale vehicle analytics systems.
Many state-of-the-art solutions for vehicle re-id mostly focus on improving the accuracy on existing re-id benchmarks and often ignore computational complexity.
We propose a simple yet effective hybrid solution empowered by self-supervised training which only uses a single network during inference time.
arXiv Detail & Related papers (2022-05-16T12:14:42Z) - Combining Reinforcement Learning and Optimal Transport for the Traveling
Salesman Problem [18.735056206844202]
We show that we can construct a model capable of learning without supervision and inferences significantly faster than current autoregressive approaches.
We also empirically evaluate the benefits of including optimal transport algorithms within deep learning models to enforce assignment constraints during end-to-end training.
arXiv Detail & Related papers (2022-03-02T07:21:56Z) - Supervised Permutation Invariant Networks for Solving the CVRP with
Bounded Fleet Size [3.5235974685889397]
Learning to solve optimization problems, such as the vehicle routing problem, offers great computational advantages.
We propose a powerful supervised deep learning framework that constructs a complete tour plan from scratch while respecting an apriori fixed number of vehicles.
In combination with an efficient post-processing scheme, our supervised approach is not only much faster and easier to train but also competitive results.
arXiv Detail & Related papers (2022-01-05T10:32:18Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (D PDP) to combine the strengths of learned neurals with those of dynamic programming algorithms.
D PDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
arXiv Detail & Related papers (2021-02-23T15:33:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.