Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2207.01443v1
- Date: Mon, 4 Jul 2022 14:31:47 GMT
- Title: Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning
- Authors: Christian Löwens, Muhammad Inaam Ashraf, Alexander Gembus, Genesis Cuizon, Jonas K. Falkner, Lars Schmidt-Thieme
- Abstract summary: This work presents solutions to the Traveling Salesperson Problem with precedence constraints (TSPPC) using Deep Reinforcement Learning (DRL).
Common to these approaches is the use of graph models based on multi-head attention layers.
- Score: 59.14935871979047
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work presents solutions to the Traveling Salesperson Problem with
precedence constraints (TSPPC) using Deep Reinforcement Learning (DRL) by
adapting recent approaches that work well for regular TSPs. Common to these
approaches is the use of graph models based on multi-head attention (MHA)
layers. One idea for solving the pickup and delivery problem (PDP) is using
heterogeneous attentions to embed the different possible roles each node can
take. In this work, we generalize this concept of heterogeneous attentions to
the TSPPC. Furthermore, we adapt recent ideas to sparsify attentions for better
scalability. Overall, we contribute to the research community through the
application and evaluation of recent DRL methods in solving the TSPPC.
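As a concrete illustration of the mechanism in the abstract, here is a minimal PyTorch sketch (not the authors' implementation; the function names are illustrative) of how a precedence matrix can be turned into an attention mask so that a pointer-style decoder only assigns probability to feasible next nodes:

# Minimal sketch (not the authors' code): masking infeasible nodes under
# precedence constraints before the attention-based decoder picks the next node.
import torch

def precedence_mask(prec, visited):
    """prec[i, j] = True  <=>  node i must be visited before node j.
    visited: bool tensor of shape (n,). Returns a bool mask of selectable nodes."""
    # A node is selectable iff it is unvisited and all of its predecessors
    # have already been visited.
    unmet = (prec & ~visited.unsqueeze(1)).any(dim=0)  # j still has an unvisited predecessor
    return ~visited & ~unmet

def masked_attention_logits(query, keys, prec, visited, clip=10.0):
    """Single-head pointer-style scores over nodes; infeasible nodes get -inf."""
    scores = keys @ query / keys.shape[-1] ** 0.5  # (n,)
    scores = clip * torch.tanh(scores)             # tanh clipping, common in TSP decoders
    feasible = precedence_mask(prec, visited)
    return scores.masked_fill(~feasible, float("-inf"))

# Toy usage: 4 nodes, node 0 must precede node 2.
n, d = 4, 8
prec = torch.zeros(n, n, dtype=torch.bool)
prec[0, 2] = True
visited = torch.zeros(n, dtype=torch.bool)
probs = torch.softmax(masked_attention_logits(torch.randn(d), torch.randn(n, d), prec, visited), dim=0)
print(probs)  # node 2 has probability 0 until node 0 is visited

Sparsifying the attention for scalability, as the abstract mentions, would additionally zero out scores for selected node pairs (e.g., distant ones) before the softmax.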
Related papers
- Learn to Tour: Operator Design For Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem [12.34897099691387]
This paper develops a learning method for a special class of traveling salesman problems (TSP).
It finds the shortest tour along a sequence of one-to-one pickup-and-delivery nodes.
In the PDTSP, precedence constraints must be satisfied: each pickup node has to be visited before its corresponding delivery node.
arXiv Detail & Related papers (2024-04-17T15:05:51Z)
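To make the precedence requirement concrete, here is a small illustrative sketch (plain Python, not the paper's learned operators) of checking PDTSP feasibility and naively mapping an infeasible tour back to a feasible one:

# Illustrative sketch (not the paper's learned operator): checking PDTSP
# feasibility and repairing a tour by moving each delivery after its pickup.
def is_feasible(tour, pairs):
    """pairs maps pickup -> delivery. True iff every pickup precedes its delivery."""
    pos = {node: i for i, node in enumerate(tour)}
    return all(pos[p] < pos[d] for p, d in pairs.items())

def repair(tour, pairs):
    """Naive feasibility mapping: if a delivery precedes its pickup,
    move it to just after the pickup."""
    tour = list(tour)
    for p, d in pairs.items():
        if tour.index(d) < tour.index(p):
            tour.remove(d)
            tour.insert(tour.index(p) + 1, d)
    return tour

pairs = {1: 2, 3: 4}             # pickups 1, 3 with deliveries 2, 4
tour = [0, 2, 1, 3, 4]           # delivery 2 precedes pickup 1 -> infeasible
print(is_feasible(tour, pairs))  # False
print(repair(tour, pairs))       # [0, 1, 2, 3, 4]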
- Leveraging Constraint Programming in a Deep Learning Approach for Dynamically Solving the Flexible Job-Shop Scheduling Problem [1.3927943269211593]
This paper aims to integrate constraint programming (CP) within a deep learning (DL) based methodology, leveraging the benefits of both.
We introduce a method that involves training a DL model using optimal solutions generated by CP, ensuring the model learns from high-quality data.
Our hybrid approach has been extensively tested on three public FJSSP benchmarks, demonstrating superior performance over five state-of-the-art DRL approaches.
arXiv Detail & Related papers (2024-03-14T10:16:57Z)
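A hedged sketch of this hybrid pattern (the solver call and the network below are placeholders, not the paper's architecture): a CP solver labels states with optimal decisions, and the model is trained by imitation on those labels:

# Sketch of training a DL model on CP-generated optimal solutions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 8))  # 8 candidate actions
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def cp_optimal_action(state):
    # Placeholder for a constraint-programming solver (e.g., CP-SAT) that
    # returns the optimal dispatching decision for this scheduling state.
    return int(state.sum().abs().item() * 1000) % 8

for step in range(100):
    states = torch.randn(32, 16)  # toy scheduling states
    labels = torch.tensor([cp_optimal_action(s) for s in states])
    loss = loss_fn(model(states), labels)  # imitate CP's decisions
    opt.zero_grad(); loss.backward(); opt.step()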
- Decoupled Prioritized Resampling for Offline RL [120.49021589395005]
We propose Offline Prioritized Experience Replay (OPER) for offline reinforcement learning.
OPER features a class of priority functions designed to prioritize highly-rewarding transitions, making them more frequently visited during training.
We show that this class of priority functions induces an improved behavior policy, and when constrained to this improved policy, a policy-constrained offline RL algorithm is likely to yield a better solution.
arXiv Detail & Related papers (2023-06-08T17:56:46Z)
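The resampling idea can be sketched in a few lines of NumPy (illustrative only; OPER's actual priority functions are more refined than raw returns):

# Transitions with higher return get proportionally higher sampling
# probability during offline training.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(size=10_000)  # per-transition return estimates

# Shift to be positive, then normalize into sampling probabilities.
priority = returns - returns.min() + 1e-3
probs = priority / priority.sum()

batch_idx = rng.choice(len(returns), size=256, p=probs, replace=True)
print(returns[batch_idx].mean(), ">", returns.mean())  # resampled batch skews high-return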
- Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem [67.32731657297377]
The Traveling Salesman Problem (TSP) is a classic routing optimization problem originally arising in transportation and logistics.
Recently, Deep Reinforcement Learning has been increasingly employed to solve TSP due to its high inference efficiency.
We propose a novel end-to-end DRL approach, referred to as Pointerformer, based on multi-pointer Transformer.
arXiv Detail & Related papers (2023-04-19T03:48:32Z)
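A generic pointer-decoding step, sketched in PyTorch (an illustration of the multi-pointer idea, not Pointerformer's exact network): several pointer heads score the cities and their logits are aggregated before sampling:

import torch

def multi_pointer_step(q_heads, keys, visited, clip=10.0):
    """q_heads: (h, d), one query per pointer head; keys: (n, d) city embeddings."""
    scores = q_heads @ keys.T / keys.shape[-1] ** 0.5  # (h, n)
    logits = clip * torch.tanh(scores).mean(dim=0)     # aggregate the h pointers
    return logits.masked_fill(visited, float("-inf"))  # never revisit a city

h, n, d = 4, 10, 16
visited = torch.zeros(n, dtype=torch.bool); visited[0] = True  # tour started at city 0
probs = torch.softmax(multi_pointer_step(torch.randn(h, d), torch.randn(n, d), visited), dim=0)
next_city = torch.multinomial(probs, 1).item()
print(next_city)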
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
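For reference, maximizing both entropy and return corresponds to the standard soft actor-critic actor loss, sketched here generically (not MARLIN's code):

import torch

def sac_actor_loss(log_prob, q_value, alpha=0.2):
    """log_prob: log pi(a|s) of sampled actions; q_value: critic estimate Q(s, a).
    Minimizing this maximizes expected return plus an entropy bonus."""
    return (alpha * log_prob - q_value).mean()

loss = sac_actor_loss(torch.randn(64), torch.randn(64))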
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- On the Use of Quality Diversity Algorithms for The Traveling Thief Problem [11.590506672325668]
In real-world optimisation, it is common for several interacting sub-problems to form the main problem.
In this paper, we investigate the inter-dependency of the traveling salesperson problem (TSP) and the knapsack problem (KP) by means of quality diversity (QD) approaches.
arXiv Detail & Related papers (2021-12-16T05:08:39Z)
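One common QD scheme is a MAP-Elites-style archive; the following toy sketch (illustrative, with made-up descriptors, not the paper's setup) keeps the best solution per behavior cell:

import random

random.seed(0)
archive = {}  # behavior cell -> (fitness, solution)

def descriptor(sol):
    # Hypothetical descriptors: tour-length bucket and knapsack-weight bucket.
    return (int(sol["tour_len"] // 10), int(sol["weight"] // 10))

for _ in range(1000):
    sol = {"tour_len": random.uniform(0, 100), "weight": random.uniform(0, 100)}
    fitness = random.uniform(0, 1)  # stand-in for the TTP objective value
    cell = descriptor(sol)
    if cell not in archive or fitness > archive[cell][0]:
        archive[cell] = (fitness, sol)  # keep the elite of each cell

print(len(archive), "cells filled")  # a diverse set of high-quality elites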
- Learning Collaborative Policies to Solve NP-hard Routing Problems [13.13675711285772]
This paper proposes a novel hierarchical problem-solving strategy called learning collaborative policies (LCP).
It can effectively find the near-optimum solution using two iterative DRL policies: the seeder and reviser.
Extensive experiments demonstrate that the proposed two-policy collaboration scheme improves over the single-policy DRL framework on various NP-hard routing problems.
arXiv Detail & Related papers (2021-10-26T19:46:21Z)
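The two-policy pattern can be caricatured with simple stand-ins (random sampling as the seeder and a 2-opt pass as the reviser; LCP's actual policies are learned):

import random

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(20)]
dist = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
length = lambda t: sum(dist(pts[t[i]], pts[t[(i + 1) % len(t)]]) for i in range(len(t)))

def seeder(k=50):   # stand-in: sample k random tours, keep the best (exploration)
    return min((random.sample(range(len(pts)), len(pts)) for _ in range(k)), key=length)

def reviser(tour):  # stand-in: one 2-opt improvement pass over segments
    for i in range(1, len(tour) - 1):
        for j in range(i + 1, len(tour)):
            cand = tour[:i] + tour[i:j][::-1] + tour[j:]
            if length(cand) < length(tour):
                tour = cand
    return tour

tour = seeder()
for _ in range(3):  # iterate the seeder/reviser collaboration
    tour = reviser(tour)
print(round(length(tour), 3))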
- Heterogeneous Attentions for Solving Pickup and Delivery Problem via Deep Reinforcement Learning [14.627657852087994]
We leverage a novel neural network integrated with a heterogeneous attention mechanism to empower the policy in deep reinforcement learning to automatically select the nodes.
In particular, the heterogeneous attention mechanism prescribes attentions for each role of the nodes while taking into account the precedence constraint.
Our method outperforms state-of-the-art heuristic and deep learning models, and generalizes well to different distributions and problem sizes.
arXiv Detail & Related papers (2021-10-06T10:16:07Z)
- Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (DPDP) to combine the strengths of learned neural heuristics with those of dynamic programming algorithms.
DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
arXiv Detail & Related papers (2021-02-23T15:33:57Z)
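A simplified sketch of the DPDP idea (not the paper's implementation): a beam-restricted DP over partial tours, where a stand-in "heatmap" of edge scores plays the role of the neural network's edge predictions:

import random

random.seed(0)
n, beam_size = 9, 50
coords = [(random.random(), random.random()) for _ in range(n)]
d = lambda i, j: ((coords[i][0] - coords[j][0]) ** 2 + (coords[i][1] - coords[j][1]) ** 2) ** 0.5
heat = [[random.random() for _ in range(n)] for _ in range(n)]  # stand-in for NN edge scores

# Each beam state: (visited set, last node, tour cost, accumulated heatmap score)
beam = [(frozenset([0]), 0, 0.0, 0.0)]
for _ in range(n - 1):
    expansions = [
        (vis | {j}, j, cost + d(last, j), score + heat[last][j])
        for vis, last, cost, score in beam
        for j in range(n) if j not in vis
    ]
    # Restrict the DP state space: keep only the states the heatmap favors.
    beam = sorted(expansions, key=lambda s: -s[3])[:beam_size]

best = min(beam, key=lambda s: s[2] + d(s[1], 0))  # close the tour back to node 0
print(round(best[2] + d(best[1], 0), 3))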
This list is automatically generated from the titles and abstracts of the papers on this site.