Related papers: H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem

H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem

URL: http://arxiv.org/abs/2304.09395v1
Date: Wed, 19 Apr 2023 03:10:30 GMT
Title: H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem
Authors: Xuanhao Pan, Yan Jin, Yuandong Ding, Mingxiao Feng, Li Zhao, Lei Song, Jiang Bian
Abstract summary: We propose an end-to-end learning framework based on hierarchical reinforcement learning, called H-TSP, for addressing the large-scale Travelling Salesman Problem (TSP) We show that H-TSP can achieve comparable results as SOTA search-based approaches, and we reduce the time consumption up to two orders of magnitude (3.32s vs. 395.85s) Although there are still gaps to SOTA results with respect to solution quality, we believe H-TSP will be useful for practical applications, particularly those that are time-sensitive e.g., on-call routing and ride
Score: 11.310986634385742
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose an end-to-end learning framework based on hierarchical reinforcement learning, called H-TSP, for addressing the large-scale Travelling Salesman Problem (TSP). The proposed H-TSP constructs a solution of a TSP instance starting from the scratch relying on two components: the upper-level policy chooses a small subset of nodes (up to 200 in our experiment) from all nodes that are to be traversed, while the lower-level policy takes the chosen nodes as input and outputs a tour connecting them to the existing partial route (initially only containing the depot). After jointly training the upper-level and lower-level policies, our approach can directly generate solutions for the given TSP instances without relying on any time-consuming search procedures. To demonstrate effectiveness of the proposed approach, we have conducted extensive experiments on randomly generated TSP instances with different numbers of nodes. We show that H-TSP can achieve comparable results (gap 3.42% vs. 7.32%) as SOTA search-based approaches, and more importantly, we reduce the time consumption up to two orders of magnitude (3.32s vs. 395.85s). To the best of our knowledge, H-TSP is the first end-to-end deep reinforcement learning approach that can scale to TSP instances of up to 10000 nodes. Although there are still gaps to SOTA results with respect to solution quality, we believe that H-TSP will be useful for practical applications, particularly those that are time-sensitive e.g., on-call routing and ride hailing service.

Related papers

How to Train Your LLM Web Agent: A Statistical Diagnosis [102.04125085041473]
We present the first statistically grounded study on compute allocation for LLM web-agent post-training.<n>Our approach uses a two-stage pipeline, training a Llama 3.1 8B student to imitate a Llama 3.3 70B teacher via supervised fine-tuning (SFT) and on-policy reinforcement learning.<n>Our results show that combining SFT with on-policy RL consistently outperforms either approach alone on both WorkArena and MiniWob++.
arXiv Detail & Related papers (2025-07-05T17:12:33Z)
Cascaded Large-Scale TSP Solving with Unified Neural Guidance: Bridging Local and Population-based Search [23.528483405321975]
UNiCS is a novel unified neural-guided cascaded solver for the traveling salesman problem. It consistently outperforms state-of-the-art methods, with its runtime advantage consistent across various budgets.
arXiv Detail & Related papers (2025-01-24T06:56:27Z)
Start from Zero: Triple Set Prediction for Automatic Knowledge Graph Completion [49.19814695500355]
We propose a graph-level automatic KG completion task called Triple Set Prediction (TSP) TSP assumes none of the elements in the missing triples is given. To tackle the huge candidate triples for prediction, we propose a novel and efficient subgraph-based method GPHT.
arXiv Detail & Related papers (2024-06-26T08:26:32Z)
Learn to Tour: Operator Design For Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem [12.34897099691387]
This paper develops a learning method for a special class of traveling salesman problems (TSP) It finds the shortest tour along a sequence of one-to-one pickup-and-delivery nodes. In PDTSP, precedence constraints need to be satisfied that each pickup node must be visited before its corresponding delivery node.
arXiv Detail & Related papers (2024-04-17T15:05:51Z)
Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge [85.17343729885003]
We introduce Hint-before-Solving Prompting (HSP), which guides the model to generate hints for solving the problem. HSP can effectively improve the accuracy of reasoning tasks. We build the HSPMATH dataset based on HSP and fine-tuned Llemma-7B, reaching 64.3 accuracy.
arXiv Detail & Related papers (2024-02-22T05:58:03Z)
Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem [67.32731657297377]
Traveling Salesman Problem (TSP) is a classic routing optimization problem originally arising in the domain of transportation and logistics. Recently, Deep Reinforcement Learning has been increasingly employed to solve TSP due to its high inference efficiency. We propose a novel end-to-end DRL approach, referred to as Pointerformer, based on multi-pointer Transformer.
arXiv Detail & Related papers (2023-04-19T03:48:32Z)
DPTDR: Deep Prompt Tuning for Dense Passage Retrieval [53.217524851268216]
Deep prompt tuning (DPT) has gained great success in most natural language processing(NLP) tasks. However, it is not well-investigated in dense retrieval where fine-tuning(FT) still dominates. We propose two model-agnostic and task-agnostic strategies for DPT-based retrievers, namely retrieval-oriented intermediate pretraining and unified negative mining.
arXiv Detail & Related papers (2022-08-24T12:55:00Z)
Improving Generalization of Deep Reinforcement Learning-based TSP Solvers [19.29028564568974]
We propose a novel approach named MAGIC that includes a deep learning architecture and a DRL training method. Our architecture, which integrates a multilayer perceptron, a graph neural network, and an attention model, defines a policy that sequentially generates a traveling salesman solution. Our training method includes several innovations: (1) we interleave DRL policy updates with local search (using a new local search technique), (2) we use a novel simple baseline, and (3) we apply gradient learning.
arXiv Detail & Related papers (2021-10-06T15:16:19Z)
Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z)
Learning to Optimise General TSP Instances [2.6782615615913348]
The Travelling Salesman Problem (TSP) is a classical optimisation problem. Deep learning has been successfully extended to meta-learning, where previous solving efforts assist in learning how to optimise future instances. This paper introduces a new learning-based approach to solve a variety of different and common TSP problems.
arXiv Detail & Related papers (2020-10-23T07:37:16Z)
Solving the Clustered Traveling Salesman Problem via TSP methods [16.304413942851397]
The Clustered Traveling Salesman Problem (CTSP) is a variant of the popular Traveling Salesman Problem (TSP) In this work, we explore a transformation approach that solves the CTSP by converting it to the well-studied TSP.
arXiv Detail & Related papers (2020-07-10T08:56:06Z)
Towards Feature-free TSP Solver Selection: A Deep Learning Approach [35.05032046810422]
The Travelling Salesman Problem (TSP) is a classical NP-hard problem and has broad applications in many disciplines and industries. In this paper, we propose a deep learning framework, CTAS, for TSP solver selection in an end-to-end manner.
arXiv Detail & Related papers (2020-06-01T04:48:36Z)
MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate. We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network. Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.