Solve Traveling Salesman Problem by Monte Carlo Tree Search and Deep
Neural Network
- URL: http://arxiv.org/abs/2005.06879v1
- Date: Thu, 14 May 2020 11:36:40 GMT
- Title: Solve Traveling Salesman Problem by Monte Carlo Tree Search and Deep
Neural Network
- Authors: Zhihao Xing, Shikui Tu, Lei Xu
- Abstract summary: We present a self-learning approach that combines deep reinforcement learning and Monte Carlo tree search to solve the traveling salesman problem.
Experimental results show that the proposed method performs favorably against other methods in small-to-medium problem settings.
It shows performance comparable to the state-of-the-art in the large problem setting.
- Score: 8.19063619210761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a self-learning approach that combines deep reinforcement learning
and Monte Carlo tree search to solve the traveling salesman problem. The
proposed approach has two advantages. First, it adopts deep reinforcement
learning to compute the value functions for decisions, which removes the need
for hand-crafted features and labelled data. Second, it uses Monte Carlo tree
search to select the best policy by comparing different value functions, which
increases its generalization ability. Experimental results show that the
proposed method performs favorably against other methods in small-to-medium
problem settings, and it shows performance comparable to the state-of-the-art
in the large problem setting.
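The paper's implementation details are not reproduced in this listing, but the general idea it describes — tree search over partial tours, guided by value estimates — can be sketched roughly as follows. This is a minimal illustration, not the authors' method: random rollouts stand in for the learned value network, and all function names are hypothetical.

```python
import math
import random

def tour_length(tour, dist):
    """Total length of a closed tour under distance matrix `dist`."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def rollout(partial, remaining, dist):
    # Stand-in for a learned value function: complete the partial tour
    # with a random ordering and return the resulting total length.
    rest = list(remaining)
    random.shuffle(rest)
    return tour_length(partial + rest, dist)

def mcts_next_city(partial, remaining, dist, n_sim=200, c=1.4):
    """One MCTS decision: choose the next city via UCT over one-step children."""
    stats = {a: [0, 0.0] for a in remaining}  # city -> [visits, total reward]
    for _ in range(n_sim):
        total = sum(v[0] for v in stats.values()) or 1
        def uct(a):
            n, w = stats[a]
            if n == 0:
                return float("inf")  # always try unvisited children first
            return w / n + c * math.sqrt(math.log(total) / n)
        a = max(remaining, key=uct)
        length = rollout(partial + [a], remaining - {a}, dist)
        stats[a][0] += 1
        stats[a][1] += -length  # negate: shorter tours mean higher reward
    return max(remaining, key=lambda a: stats[a][0])  # most-visited child

def solve_tsp(dist, n_sim=200):
    """Build a tour city by city, running one MCTS decision per step."""
    n = len(dist)
    tour, remaining = [0], set(range(1, n))
    while remaining:
        city = mcts_next_city(tour, remaining, dist, n_sim)
        tour.append(city)
        remaining.remove(city)
    return tour
```

In the paper's setting, the random rollout would be replaced by a trained value network, which is what removes the need for hand-crafted features.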
Related papers
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- UCB-driven Utility Function Search for Multi-objective Reinforcement Learning [75.11267478778295]
In Multi-objective Reinforcement Learning (MORL), agents are tasked with optimising decision-making behaviours.
We focus on the case of linear utility functions parameterised by weight vectors w.
We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process.
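That paper's algorithm is not detailed in this listing, but the Upper Confidence Bound mechanism the summary alludes to can be illustrated generically: treat each candidate weight vector as a bandit arm and pick arms by the UCB1 score. This is a hedged sketch of UCB1 itself, not the paper's method; the bandit framing and all names here are assumptions.

```python
import math
import random

def ucb1_select(stats, c=2.0):
    """Pick the arm (e.g. a candidate weight vector) with the highest UCB1 score.
    stats: dict mapping arm -> (pull_count, total_reward)."""
    total = sum(n for n, _ in stats.values())
    best, best_score = None, -math.inf
    for arm, (n, r) in stats.items():
        # Unpulled arms get infinite score so each arm is tried at least once.
        score = math.inf if n == 0 else r / n + math.sqrt(c * math.log(total) / n)
        if score > best_score:
            best, best_score = arm, score
    return best

def run_bandit(arms, reward_fn, rounds=500):
    # Each pull stands in for evaluating the utility of one weight vector;
    # UCB1 balances exploring untried vectors against exploiting good ones.
    stats = {a: (0, 0.0) for a in arms}
    for _ in range(rounds):
        a = ucb1_select(stats)
        n, r = stats[a]
        stats[a] = (n + 1, r + reward_fn(a))
    return max(stats, key=lambda a: stats[a][0])  # most-pulled arm wins
```

The exploration bonus shrinks as an arm accumulates pulls, so sampling concentrates on the most promising weight vectors over time.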
arXiv Detail & Related papers (2024-05-01T09:34:42Z)
- Maneuver Decision-Making Through Proximal Policy Optimization And Monte Carlo Tree Search [0.0]
Maneuver decision-making can be regarded as a Markov decision process and can be addressed by reinforcement learning.
Agents use random actions in the early stages of training, which makes it difficult to get rewards and learn how to make effective decisions.
A method based on proximal policy optimization and Monte Carlo tree search is proposed.
arXiv Detail & Related papers (2023-08-28T14:48:49Z)
- Feature Acquisition using Monte Carlo Tree Search [18.76745359031975]
Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performance of ML models.
Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences.
In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as an MDP and applying Monte Carlo Tree Search, 2) calculating the intermediary rewards for each acquisition step based on model improvements and acquisition costs, and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search.
arXiv Detail & Related papers (2022-12-21T20:53:44Z)
- McXai: Local model-agnostic explanation as two games [5.2229999775211216]
This work introduces a reinforcement learning-based approach called Monte Carlo tree search for eXplainable Artificial Intelligence (McXai) to explain the decisions of any black-box classification model (classifier).
Our experiments show that the features found by our method are more informative with respect to classifications than those found by classical approaches like LIME and SHAP.
arXiv Detail & Related papers (2022-01-04T09:02:48Z)
- Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
- BCFNet: A Balanced Collaborative Filtering Network with Attention Mechanism [106.43103176833371]
Collaborative Filtering (CF) based recommendation methods have been widely studied.
We propose a novel recommendation model named Balanced Collaborative Filtering Network (BCFNet).
In addition, an attention mechanism is designed to better capture the hidden information within implicit feedback and strengthen the learning ability of the neural network.
arXiv Detail & Related papers (2021-03-10T14:59:23Z)
- Costly Features Classification using Monte Carlo Tree Search [5.188762991286163]
We consider the problem of costly feature classification, where we sequentially select a subset of features to strike a balance between classification error and feature cost.
In this paper, we first cast the task as a Markov decision process (MDP) and use the Advantage Actor Critic algorithm to solve it.
arXiv Detail & Related papers (2021-02-14T05:18:33Z)
- Exploring search space trees using an adapted version of Monte Carlo tree search for combinatorial optimization problems [0.6882042556551609]
This approach makes use of an algorithm to explore the search space tree of a problem instance.
The algorithm is based on Monte Carlo tree search, a popular algorithm in game playing that is used to explore game trees.
arXiv Detail & Related papers (2020-10-22T08:33:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.