Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP
- URL: http://arxiv.org/abs/2411.09238v1
- Date: Thu, 14 Nov 2024 07:13:08 GMT
- Title: Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP
- Authors: Xuanhao Pan, Chenguang Wang, Chaolong Ying, Ye Xue, Tianshu Yu,
- Abstract summary: "heatmap + Monte Carlo Tree Search (MCTS)" paradigm has recently gained traction for learning-based solutions.
This paper revisits the "heatmap + Monte Carlo Tree Search (MCTS)" paradigm that has recently gained traction for learning-based solutions.
Our findings demonstrate that a rudimentary and parameter-free heatmap, derived from the intrinsic $k$-nearest nature of the Travelling Salesman Problem, can rival or even surpass the performance of complicated heatmaps.
- Score: 11.388824026057735
- License:
- Abstract: The Travelling Salesman Problem (TSP) remains a fundamental challenge in combinatorial optimization, inspiring diverse algorithmic strategies. This paper revisits the "heatmap + Monte Carlo Tree Search (MCTS)" paradigm that has recently gained traction for learning-based TSP solutions. Within this framework, heatmaps encode the likelihood of edges forming part of the optimal tour, and MCTS refines this probabilistic guidance to discover optimal solutions. Contemporary approaches have predominantly emphasized the refinement of heatmap generation through sophisticated learning models, inadvertently sidelining the critical role of MCTS. Our extensive empirical analysis reveals two pivotal insights: 1) The configuration of MCTS strategies profoundly influences the solution quality, demanding meticulous tuning to leverage their full potential; 2) Our findings demonstrate that a rudimentary and parameter-free heatmap, derived from the intrinsic $k$-nearest nature of TSP, can rival or even surpass the performance of complicated heatmaps, with strong generalizability across various scales. Empirical evaluations across various TSP scales underscore the efficacy of our approach, achieving competitive results. These observations challenge the prevailing focus on heatmap sophistication, advocating a reevaluation of the paradigm to harness both components synergistically. Our code is available at: https://github.com/LOGO-CUHKSZ/rethink_mcts_tsp.
Related papers
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z) - Optimized Monte Carlo Tree Search for Enhanced Decision Making in the FrozenLake Environment [0.0]
Monte Carlo Tree Search (MCTS) is a powerful algorithm for solving complex decision-making problems.
This paper presents an optimized MCTS implementation applied to the FrozenLake environment, a classic reinforcement learning task.
arXiv Detail & Related papers (2024-09-25T05:04:53Z) - Position: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems [14.277867417951347]
Recent advancements in solving large-scale traveling salesman problems utilize the heatmap-guided Monte Carlo tree search (MCTS) paradigm.
We demonstrate that a simple baseline method can outperform complex ML approaches in heatmap generation.
Our findings show the paradigm's inferiority to the LKH-3 despite the paradigm's reliance on problem-specific, hand-crafted strategies.
arXiv Detail & Related papers (2024-06-02T16:11:38Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Generalized Parametric Contrastive Learning [60.62901294843829]
Generalized Parametric Contrastive Learning (GPaCo/PaCo) works well on both imbalanced and balanced data.
Experiments on long-tailed benchmarks manifest the new state-of-the-art for long-tailed recognition.
arXiv Detail & Related papers (2022-09-26T03:49:28Z) - Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency [105.17746223041954]
Reinforcement learning in partially observed Markov decision processes (POMDPs) faces two challenges.
It often takes the full history to predict the future, which induces a sample complexity that scales exponentially with the horizon.
We propose a reinforcement learning algorithm named Embed to Control (ETC), which learns the representation at two levels while optimizing the policy.
arXiv Detail & Related papers (2022-05-26T16:34:46Z) - A Unified Perspective on Value Backup and Exploration in Monte-Carlo
Tree Search [41.11958980731047]
We propose two methods for improving the convergence rate and exploration based on a newly introduced backup operator and entropy regularization.
We show that this theoretical formulation unifies different approaches, including our newly introduced ones, under the same mathematical framework.
In practice, our unified perspective offers a flexible way to balance between exploration and exploitation by tuning the single $alpha$ parameter according to the problem at hand.
arXiv Detail & Related papers (2022-02-11T15:30:08Z) - Generalize a Small Pre-trained Model to Arbitrarily Large TSP Instances [55.64521598173897]
This paper tries to train a small-scale model, which could be repetitively used to build heat maps for the traveling salesman problem (TSP)
Heat maps are fed into a reinforcement learning approach (Monte Carlo tree search) to guide the search of high-quality solutions.
Experimental results show that, this new approach clearly outperforms the existing machine learning based TSP algorithms.
arXiv Detail & Related papers (2020-12-19T11:06:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.