Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2207.01443v1
- Date: Mon, 4 Jul 2022 14:31:47 GMT
- Title: Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning
- Authors: Christian Löwens, Muhammad Inaam Ashraf, Alexander Gembus, Genesis Cuizon, Jonas K. Falkner, Lars Schmidt-Thieme
- Abstract summary: This work presents solutions to the Traveling Salesperson Problem with precedence constraints (TSPPC) using Deep Reinforcement Learning (DRL).
Common to these approaches is the use of graph models based on multi-head attention layers.
- Score: 59.14935871979047
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work presents solutions to the Traveling Salesperson Problem with
precedence constraints (TSPPC) using Deep Reinforcement Learning (DRL) by
adapting recent approaches that work well for regular TSPs. Common to these
approaches is the use of graph models based on multi-head attention (MHA)
layers. One idea for solving the pickup and delivery problem (PDP) is using
heterogeneous attentions to embed the different possible roles each node can
take. In this work, we generalize this concept of heterogeneous attentions to
the TSPPC. Furthermore, we adapt recent ideas to sparsify attentions for better
scalability. Overall, we contribute to the research community through the
application and evaluation of recent DRL methods in solving the TSPPC.
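As a concrete illustration of the mechanism in the abstract, here is a minimal PyTorch sketch (not the authors' implementation; the function names are illustrative) of how a precedence matrix can be turned into an attention mask so that a pointer-style decoder only assigns probability to feasible next nodes:

# Minimal sketch (not the authors' code): masking infeasible nodes under
# precedence constraints before the attention-based decoder picks the next node.
import torch

def precedence_mask(prec, visited):
    """prec[i, j] = True  <=>  node i must be visited before node j.
    visited: bool tensor of shape (n,). Returns a bool mask of selectable nodes."""
    # A node is selectable iff it is unvisited and all of its predecessors
    # have already been visited.
    unmet = (prec & ~visited.unsqueeze(1)).any(dim=0)  # j still has an unvisited predecessor
    return ~visited & ~unmet

def masked_attention_logits(query, keys, prec, visited, clip=10.0):
    """Single-head pointer-style scores over nodes; infeasible nodes get -inf."""
    scores = keys @ query / keys.shape[-1] ** 0.5  # (n,)
    scores = clip * torch.tanh(scores)             # tanh clipping, common in TSP decoders
    feasible = precedence_mask(prec, visited)
    return scores.masked_fill(~feasible, float("-inf"))

# Toy usage: 4 nodes, node 0 must precede node 2.
n, d = 4, 8
prec = torch.zeros(n, n, dtype=torch.bool)
prec[0, 2] = True
visited = torch.zeros(n, dtype=torch.bool)
probs = torch.softmax(masked_attention_logits(torch.randn(d), torch.randn(n, d), prec, visited), dim=0)
print(probs)  # node 2 has probability 0 until node 0 is visited

Sparsifying the attention for scalability, as the abstract mentions, would additionally zero out scores for selected node pairs (e.g., distant ones) before the softmax.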
Related papers
- Learn to Tour: Operator Design For Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem [12.34897099691387]
This paper develops a learning method for a special class of traveling salesman problems (TSP).
It finds the shortest tour along a sequence of one-to-one pickup-and-delivery nodes.
In the PDTSP, precedence constraints must be satisfied: each pickup node has to be visited before its corresponding delivery node.
arXiv Detail & Related papers (2024-04-17T15:05:51Z)
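To make the precedence requirement concrete, here is a small illustrative sketch (plain Python, not the paper's learned operators) of checking PDTSP feasibility and naively mapping an infeasible tour back to a feasible one:

# Illustrative sketch (not the paper's learned operator): checking PDTSP
# feasibility and repairing a tour by moving each delivery after its pickup.
def is_feasible(tour, pairs):
    """pairs maps pickup -> delivery. True iff every pickup precedes its delivery."""
    pos = {node: i for i, node in enumerate(tour)}
    return all(pos[p] < pos[d] for p, d in pairs.items())

def repair(tour, pairs):
    """Naive feasibility mapping: if a delivery precedes its pickup,
    move it to just after the pickup."""
    tour = list(tour)
    for p, d in pairs.items():
        if tour.index(d) < tour.index(p):
            tour.remove(d)
            tour.insert(tour.index(p) + 1, d)
    return tour

pairs = {1: 2, 3: 4}             # pickups 1, 3 with deliveries 2, 4
tour = [0, 2, 1, 3, 4]           # delivery 2 precedes pickup 1 -> infeasible
print(is_feasible(tour, pairs))  # False
print(repair(tour, pairs))       # [0, 1, 2, 3, 4]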
- Leveraging Constraint Programming in a Deep Learning Approach for Dynamically Solving the Flexible Job-Shop Scheduling Problem [1.3927943269211593]
This paper aims to integrate constraint programming (CP) within a deep learning (DL) based methodology, leveraging the benefits of both.
We introduce a method that involves training a DL model using optimal solutions generated by CP, ensuring the model learns from high-quality data.
Our hybrid approach has been extensively tested on three public FJSSP benchmarks, demonstrating superior performance over five state-of-the-art DRL approaches.
arXiv Detail & Related papers (2024-03-14T10:16:57Z)
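A hedged sketch of this hybrid pattern (the solver call and the network below are placeholders, not the paper's architecture): a CP solver labels states with optimal decisions, and the model is trained by imitation on those labels:

# Sketch of training a DL model on CP-generated optimal solutions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 8))  # 8 candidate actions
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def cp_optimal_action(state):
    # Placeholder for a constraint-programming solver (e.g., CP-SAT) that
    # returns the optimal dispatching decision for this scheduling state.
    return int(state.sum().abs().item() * 1000) % 8

for step in range(100):
    states = torch.randn(32, 16)  # toy scheduling states
    labels = torch.tensor([cp_optimal_action(s) for s in states])
    loss = loss_fn(model(states), labels)  # imitate CP's decisions
    opt.zero_grad(); loss.backward(); opt.step()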
- Decoupled Prioritized Resampling for Offline RL [120.49021589395005]
We propose Offline Prioritized Experience Replay (OPER) for offline reinforcement learning.
OPER features a class of priority functions designed to prioritize highly-rewarding transitions, making them more frequently visited during training.
We show that this class of priority functions induces an improved behavior policy, and when constrained to this improved policy, a policy-constrained offline RL algorithm is likely to yield a better solution.
arXiv Detail & Related papers (2023-06-08T17:56:46Z)
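The resampling idea can be sketched in a few lines of NumPy (illustrative only; OPER's actual priority functions are more refined than raw returns):

# Transitions with higher return get proportionally higher sampling
# probability during offline training.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(size=10_000)  # per-transition return estimates

# Shift to be positive, then normalize into sampling probabilities.
priority = returns - returns.min() + 1e-3
probs = priority / priority.sum()

batch_idx = rng.choice(len(returns), size=256, p=probs, replace=True)
print(returns[batch_idx].mean(), ">", returns.mean())  # resampled batch skews high-return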
- Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem [67.32731657297377]
The Traveling Salesman Problem (TSP) is a classic routing optimization problem originally arising in transportation and logistics.
Recently, Deep Reinforcement Learning has been increasingly employed to solve TSP due to its high inference efficiency.
We propose a novel end-to-end DRL approach, referred to as Pointerformer, based on multi-pointer Transformer.
arXiv Detail & Related papers (2023-04-19T03:48:32Z)
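A generic pointer-decoding step, sketched in PyTorch (an illustration of the multi-pointer idea, not Pointerformer's exact network): several pointer heads score the cities and their logits are aggregated before sampling:

import torch

def multi_pointer_step(q_heads, keys, visited, clip=10.0):
    """q_heads: (h, d), one query per pointer head; keys: (n, d) city embeddings."""
    scores = q_heads @ keys.T / keys.shape[-1] ** 0.5  # (h, n)
    logits = clip * torch.tanh(scores).mean(dim=0)     # aggregate the h pointers
    return logits.masked_fill(visited, float("-inf"))  # never revisit a city

h, n, d = 4, 10, 16
visited = torch.zeros(n, dtype=torch.bool); visited[0] = True  # tour started at city 0
probs = torch.softmax(multi_pointer_step(torch.randn(h, d), torch.randn(n, d), visited), dim=0)
next_city = torch.multinomial(probs, 1).item()
print(next_city)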
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
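For reference, maximizing both entropy and return corresponds to the standard soft actor-critic actor loss, sketched here generically (not MARLIN's code):

import torch

def sac_actor_loss(log_prob, q_value, alpha=0.2):
    """log_prob: log pi(a|s) of sampled actions; q_value: critic estimate Q(s, a).
    Minimizing this maximizes expected return plus an entropy bonus."""
    return (alpha * log_prob - q_value).mean()

loss = sac_actor_loss(torch.randn(64), torch.randn(64))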
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- On the Use of Quality Diversity Algorithms for The Traveling Thief Problem [11.590506672325668]
In real-world optimisation, it is common for several interacting sub-problems to form the main problem.
In this paper, we investigate the inter-dependency of the traveling salesperson problem (TSP) and the knapsack problem (KP) by means of quality diversity (QD) approaches.
arXiv Detail & Related papers (2021-12-16T05:08:39Z)
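One common QD scheme is a MAP-Elites-style archive; the following toy sketch (illustrative, with made-up descriptors, not the paper's setup) keeps the best solution per behavior cell:

import random

random.seed(0)
archive = {}  # behavior cell -> (fitness, solution)

def descriptor(sol):
    # Hypothetical descriptors: tour-length bucket and knapsack-weight bucket.
    return (int(sol["tour_len"] // 10), int(sol["weight"] // 10))

for _ in range(1000):
    sol = {"tour_len": random.uniform(0, 100), "weight": random.uniform(0, 100)}
    fitness = random.uniform(0, 1)  # stand-in for the TTP objective value
    cell = descriptor(sol)
    if cell not in archive or fitness > archive[cell][0]:
        archive[cell] = (fitness, sol)  # keep the elite of each cell

print(len(archive), "cells filled")  # a diverse set of high-quality elites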
- Learning Collaborative Policies to Solve NP-hard Routing Problems [13.13675711285772]
This paper proposes a novel hierarchical problem-solving strategy called learning collaborative policies (LCP).
It can effectively find the near-optimum solution using two iterative DRL policies: the seeder and reviser.
Extensive experiments demonstrate that the proposed two-policy collaboration scheme improves over the single-policy DRL framework on various NP-hard routing problems.
arXiv Detail & Related papers (2021-10-26T19:46:21Z)
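The two-policy pattern can be caricatured with simple stand-ins (random sampling as the seeder and a 2-opt pass as the reviser; LCP's actual policies are learned):

import random

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(20)]
dist = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
length = lambda t: sum(dist(pts[t[i]], pts[t[(i + 1) % len(t)]]) for i in range(len(t)))

def seeder(k=50):   # stand-in: sample k random tours, keep the best (exploration)
    return min((random.sample(range(len(pts)), len(pts)) for _ in range(k)), key=length)

def reviser(tour):  # stand-in: one 2-opt improvement pass over segments
    for i in range(1, len(tour) - 1):
        for j in range(i + 1, len(tour)):
            cand = tour[:i] + tour[i:j][::-1] + tour[j:]
            if length(cand) < length(tour):
                tour = cand
    return tour

tour = seeder()
for _ in range(3):  # iterate the seeder/reviser collaboration
    tour = reviser(tour)
print(round(length(tour), 3))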
- Heterogeneous Attentions for Solving Pickup and Delivery Problem via Deep Reinforcement Learning [14.627657852087994]
We leverage a novel neural network integrated with a heterogeneous attention mechanism to empower the policy in deep reinforcement learning to automatically select the nodes.
In particular, the heterogeneous attention mechanism prescribes attentions for each role of the nodes while taking into account the precedence constraint.
Our method outperforms state-of-the-art heuristic and deep learning models, and generalizes well to different distributions and problem sizes.
arXiv Detail & Related papers (2021-10-06T10:16:07Z)
- Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (DPDP) to combine the strengths of learned neural heuristics with those of dynamic programming algorithms.
DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
arXiv Detail & Related papers (2021-02-23T15:33:57Z)
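A simplified sketch of the DPDP idea (not the paper's implementation): a beam-restricted DP over partial tours, where a stand-in "heatmap" of edge scores plays the role of the neural network's edge predictions:

import random

random.seed(0)
n, beam_size = 9, 50
coords = [(random.random(), random.random()) for _ in range(n)]
d = lambda i, j: ((coords[i][0] - coords[j][0]) ** 2 + (coords[i][1] - coords[j][1]) ** 2) ** 0.5
heat = [[random.random() for _ in range(n)] for _ in range(n)]  # stand-in for NN edge scores

# Each beam state: (visited set, last node, tour cost, accumulated heatmap score)
beam = [(frozenset([0]), 0, 0.0, 0.0)]
for _ in range(n - 1):
    expansions = [
        (vis | {j}, j, cost + d(last, j), score + heat[last][j])
        for vis, last, cost, score in beam
        for j in range(n) if j not in vis
    ]
    # Restrict the DP state space: keep only the states the heatmap favors.
    beam = sorted(expansions, key=lambda s: -s[3])[:beam_size]

best = min(beam, key=lambda s: s[2] + d(s[1], 0))  # close the tour back to node 0
print(round(best[2] + d(best[1], 0), 3))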
This list is automatically generated from the titles and abstracts of the papers on this site.