Related papers: Bandit based Dynamic Candidate Edge Selection in Solving Traveling Salesman Problems

Bandit based Dynamic Candidate Edge Selection in Solving Traveling Salesman Problems

URL: http://arxiv.org/abs/2505.15862v2
Date: Thu, 05 Jun 2025 06:16:06 GMT
Title: Bandit based Dynamic Candidate Edge Selection in Solving Traveling Salesman Problems
Authors: Long Wang, Jiongzhi Zheng, Zhengda Xiong, ChuMin Li, Kun He,
Abstract summary: We introduce a bandit-based method for the Lin-Kernighan-Helsgaun (LKH) algorithm for the Traveling Salesman Problem (TSP)<n>We incorporate multi-armed bandit models to dynamically select the most suitable candidate edges in each iteration, enabling LKH to make smarter choices and lead to improved solutions.
Score: 13.620917861669607
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Algorithms designed for routing problems typically rely on high-quality candidate edges to guide their search, aiming to reduce the search space and enhance the search efficiency. However, many existing algorithms, like the classical Lin-Kernighan-Helsgaun (LKH) algorithm for the Traveling Salesman Problem (TSP), often use predetermined candidate edges that remain static throughout local searches. This rigidity could cause the algorithm to get trapped in local optima, limiting its potential to find better solutions. To address this issue, we propose expanding the candidate sets to include other promising edges, providing them an opportunity for selection. Specifically, we incorporate multi-armed bandit models to dynamically select the most suitable candidate edges in each iteration, enabling LKH to make smarter choices and lead to improved solutions. Extensive experiments on multiple TSP benchmarks show the excellent performance of our method. Moreover, we employ this bandit-based method to LKH-3, an extension of LKH tailored for solving various TSP variant problems, and our method also significantly enhances LKH-3's performance across typical TSP variants.

Related papers

Multi-armed Bandit and Backbone boost Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problems [9.035853156510507]
Lin-Kernighan-Helsguan (LKH) is a classic local search algorithm for the Traveling Salesman Problem (TSP)<n>LKH introduces an $alpha$-value to replace the traditional distance metric for evaluating the edge quality.<n>We propose a novel way to extract backbone information during the TSP local search process, which is dynamic and can be updated once a local optimal solution is found.
arXiv Detail & Related papers (2025-01-07T16:45:41Z)
A History-Guided Regional Partitioning Evolutionary Optimization for Solving the Flexible Job Shop Problem with Limited Multi-load Automated Guided Vehicles [6.3926046314748834]
This study proposes a history-guided regional partitioning algorithm (HRPEO) for the flexible job shop scheduling problem with limited multi-load AGVs. The results show that the HRPEO has a better advantage in solving FJSPMA.
arXiv Detail & Related papers (2024-09-27T13:33:19Z)
Dancing to the State of the Art? How Candidate Lists Influence LKH for Solving the Traveling Salesperson Problem [1.6874375111244329]
Solving the Traveling Salesperson Problem (TSP) remains a persistent challenge. Heuristic solvers address the demand for finding high-quality solutions efficiently. Among these solvers, the Lin-Kernighan-Helsgaun (LKH) stands out, as it complements the performance of genetic algorithms.
arXiv Detail & Related papers (2024-07-04T13:38:19Z)
Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization [52.80408805368928]
We introduce a novel greedy-style subset selection algorithm for batch acquisition. Our experiments on the red fluorescent proteins show that our proposed method achieves the baseline performance in 1.69x fewer queries.
arXiv Detail & Related papers (2024-06-21T05:57:08Z)
Stop Relying on No-Choice and Do not Repeat the Moves: Optimal, Efficient and Practical Algorithms for Assortment Optimization [38.57171985309975]
We develop efficient algorithms for the problem of regret in assortment selection with emphPlackett Luce (PL) based user choices. Our methods are practical, provably optimal, and devoid of the aforementioned limitations of the existing methods.
arXiv Detail & Related papers (2024-02-29T07:17:04Z)
Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods. Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem. As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms. Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability. We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
Online Control of Adaptive Large Neighborhood Search using Deep Reinforcement Learning [4.374837991804085]
We introduce a Deep Reinforcement Learning based approach called DR-ALNS that selects operators, adjusts parameters, and controls the acceptance criterion throughout the search. We evaluate the proposed method on a problem with orienteering weights and time windows, as presented in an IJCAI competition. The results show that our approach outperforms vanilla ALNS, ALNS tuned with Bayesian optimization, and two state-of-the-art DRL approaches.
arXiv Detail & Related papers (2022-11-01T21:33:46Z)
Reinforced Lin-Kernighan-Helsgaun Algorithms for the Traveling Salesman Problems [17.564543997481508]
The Lin-Kernighan-Helsgaun (LKH) algorithm is one of the state-of-the-art local search algorithms for the Traveling Salesman Problem (TSP) LKH and LKH-3 associate a candidate set to each city to improve the algorithm efficiency, and have two different methods, called $alpha$-measure and POPMUSIC, to decide the candidate sets. In this work, we first propose a Variable Strategy Reinforced LKH (VSR-LKH) algorithm, which incorporates three reinforcement learning methods (Q-learning, Sarsa, and Monte Carlo)
arXiv Detail & Related papers (2022-07-08T13:03:29Z)
A Metaheuristic Algorithm for Large Maximum Weight Independent Set Problems [58.348679046591265]
Given a node-weighted graph, find a set of independent (mutually nonadjacent) nodes whose node-weight sum is maximum. Some of the graphs airsing in this application are large, having hundreds of thousands of nodes and hundreds of millions of edges. We develop a new local search algorithm, which is a metaheuristic in the greedy randomized adaptive search framework.
arXiv Detail & Related papers (2022-03-28T21:34:16Z)
Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms. For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime. In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem. We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z)
Exploration in two-stage recommender systems [79.50534282841618]
Two-stage recommender systems are widely adopted in industry due to their scalability and maintainability. A key challenge of this setup is that optimal performance of each stage in isolation does not imply optimal global performance. We propose a method of synchronising the exploration strategies between the ranker and the nominators.
arXiv Detail & Related papers (2020-09-01T16:52:51Z)
Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms. We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.