Monte Carlo Search Algorithms Discovering Monte Carlo Tree Search Exploration Terms
- URL: http://arxiv.org/abs/2404.09304v1
- Date: Sun, 14 Apr 2024 17:06:20 GMT
- Title: Monte Carlo Search Algorithms Discovering Monte Carlo Tree Search Exploration Terms
- Authors: Tristan Cazenave,
- Abstract summary: The optimized Monte Carlo Tree Search algorithms are PUCT and SHUSS.
For small search budgets of 32 evaluations the discovered root exploration terms make both algorithms competitive.
- Score: 4.561007128508218
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Monte Carlo Tree Search and Monte Carlo Search have good results for many combinatorial problems. In this paper we propose to use Monte Carlo Search to design mathematical expressions that are used as exploration terms for Monte Carlo Tree Search algorithms. The optimized Monte Carlo Tree Search algorithms are PUCT and SHUSS. We automatically design the PUCT and the SHUSS root exploration terms. For small search budgets of 32 evaluations the discovered root exploration terms make both algorithms competitive with usual PUCT.
Related papers
- RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation [65.5353313491402]
We introduce RethinkMCTS, which employs the Monte Carlo Tree Search (MCTS) algorithm to conduct thought-level searches before generating code.
We construct verbal feedback from fine-turbo code execution feedback to refine erroneous thoughts during the search.
We demonstrate that RethinkMCTS outperforms previous search-based and feedback-based code generation baselines.
arXiv Detail & Related papers (2024-09-15T02:07:28Z) - Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization [18.25487451605638]
We derive a tree search algorithm based on policy optimization with state occupancy measure regularization, which we call it Volume-MCTS
We show that count-based exploration and sampling-based motion planning can be derived as approximate solutions to this state occupancy measure regularized objective.
We test our method on several robot navigation problems, and find that Volume-MCTS outperforms AlphaZero and displays significantly better long-horizon exploration properties.
arXiv Detail & Related papers (2024-07-07T22:58:52Z) - LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z) - Monte-Carlo Tree Search for Multi-Agent Pathfinding: Preliminary Results [60.4817465598352]
We introduce an original variant of Monte-Carlo Tree Search (MCTS) tailored to multi-agent pathfinding.
Specifically, we use individual paths to assist the agents with the the goal-reaching behavior.
We also use a dedicated decomposition technique to reduce the branching factor of the tree search procedure.
arXiv Detail & Related papers (2023-07-25T12:33:53Z) - CrossBeam: Learning to Search in Bottom-Up Program Synthesis [51.37514793318815]
We propose training a neural model to learn a hands-on search policy for bottom-up synthesis.
Our approach, called CrossBeam, uses the neural model to choose how to combine previously-explored programs into new programs.
We observe that CrossBeam learns to search efficiently, exploring much smaller portions of the program space compared to the state-of-the-art.
arXiv Detail & Related papers (2022-03-20T04:41:05Z) - Batch Monte Carlo Tree Search [9.114710429587479]
We build on this property to propose Monte Carlo Tree Search algorithms using batched inferences.
The transposition table contains the results of the inferences while the search tree contains the statistics of Monte Carlo Tree Search.
We also propose to analyze multiple algorithms that improve the search: the $mu$ FPU, the Virtual Mean, the Iteration and the Second Move Lasts.
arXiv Detail & Related papers (2021-04-09T09:54:21Z) - Prioritized Architecture Sampling with Monto-Carlo Tree Search [54.72096546595955]
One-shot neural architecture search (NAS) methods significantly reduce the search cost by considering the whole search space as one network.
In this paper, we introduce a sampling strategy based on Monte Carlo tree search (MCTS) with the search space modeled as a Monte Carlo tree (MCT)
For a fair comparison, we construct an open-source NAS benchmark of a macro search space evaluated on CIFAR-10, namely NAS-Bench-Macro.
arXiv Detail & Related papers (2021-03-22T15:09:29Z) - Monte-Carlo Graph Search for AlphaZero [15.567057178736402]
We introduce a new, improved search algorithm for AlphaZero that generalizes the search tree to a directed acyclic graph.
In our evaluations, we use the CrazyAra engine on chess and crazyhouse as examples to show that these changes bring significant improvements to AlphaZero.
arXiv Detail & Related papers (2020-12-20T22:51:38Z) - Monte-Carlo Tree Search as Regularized Policy Optimization [47.541849128047865]
We show that AlphaZero's search algorithms are an approximation to the solution of a specific regularized policy optimization problem.
We propose a variant of AlphaZero which uses the exact solution to this policy optimization problem, and show experimentally that it reliably outperforms the original algorithm in multiple domains.
arXiv Detail & Related papers (2020-07-24T13:01:34Z) - Competing in a Complex Hidden Role Game with Information Set Monte Carlo
Tree Search [0.0]
The Information Set Monte Carlo Tree Search (ISMCTS) family of algorithms outperforms previous algorithms using Monte Carlo methods in imperfect information games.
This paper is applied to Secret Hitler, a popular social deduction board game that combines traditional hidden role mechanics with the randomness of a card deck.
arXiv Detail & Related papers (2020-05-14T17:21:10Z) - Monte Carlo Game Solver [4.38602607138044]
It uses online learning of playout policies and Monte Carlo Tree Search.
The learned policy and the information in the Monte Carlo tree are used to order moves in game solvers.
arXiv Detail & Related papers (2020-01-15T00:20:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.