Dual Monte Carlo Tree Search
- URL: http://arxiv.org/abs/2103.11517v1
- Date: Sun, 21 Mar 2021 23:34:11 GMT
- Title: Dual Monte Carlo Tree Search
- Authors: Prashank Kadam, Ruiyang Xu, Karl Lieberherr
- Abstract summary: We show that Dual MCTS performs better than one of the most widely used neural MCTS algorithms, AlphaZero, for various symmetric and asymmetric games.
Dual MCTS uses two different search trees, a single deep neural network, and a new update technique for the search trees using a combination of the PUCB, a sliding-window, and the epsilon-greedy algorithm.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AlphaZero, using a combination of Deep Neural Networks and Monte Carlo Tree
Search (MCTS), has successfully trained reinforcement learning agents in a
tabula-rasa way. The neural MCTS algorithm has been successful in finding
near-optimal strategies for games through self-play. However, the AlphaZero
algorithm has a significant drawback; it takes a long time to converge and
requires high computational power due to complex neural networks for solving
games like Chess, Go, Shogi, etc. Owing to this, it is very difficult to pursue
neural MCTS research without cutting-edge hardware, which is a roadblock for
many aspiring neural MCTS researchers. In this paper, we propose a new neural
MCTS algorithm, called Dual MCTS, which helps overcome these drawbacks. Dual
MCTS uses two different search trees, a single deep neural network, and a new
update technique for the search trees using a combination of the PUCB, a
sliding-window, and the epsilon-greedy algorithm. This technique is applicable
to any MCTS-based algorithm to reduce the number of updates to the tree. We
show that Dual MCTS performs better than one of the most widely used neural
MCTS algorithms, AlphaZero, for various symmetric and asymmetric games.
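The abstract names the ingredients of the tree-update technique (PUCB selection, a sliding window, and epsilon-greedy) but not their exact combination. As a rough illustration only, the sketch below shows PUCB-style child selection with an epsilon-greedy fallback; the data layout, the constants, and the role of the sliding window are assumptions, not the paper's formulation.

```python
import math
import random

def pucb_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    # PUCB: mean value plus a prior-weighted exploration bonus (AlphaZero-style).
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_child(children, epsilon=0.05):
    # Epsilon-greedy over PUCB scores: mostly greedy, occasionally random.
    # `children` is a list of dicts with keys 'q', 'prior', 'visits';
    # this layout is illustrative, not the paper's actual data structure.
    parent_visits = sum(ch["visits"] for ch in children) + 1
    if random.random() < epsilon:
        return random.randrange(len(children))
    scores = [pucb_score(ch["q"], ch["prior"], parent_visits, ch["visits"])
              for ch in children]
    return max(range(len(children)), key=scores.__getitem__)
```

One plausible reading of the sliding-window component is that backpropagation only touches a bounded window of recently visited nodes, which would reduce per-simulation update cost; the paper itself should be consulted for the actual rule.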
Related papers
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
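The summary mentions dynamic node selection under a node-level exploration budget but gives no formula. Purely as a toy illustration, a per-node budget could shrink with the value model's confidence and with depth; the rule below, including the function name `node_budget` and its constants, is an assumption rather than LiteSearch's actual method.

```python
import math

def node_budget(value_estimate, depth, base_budget=8, min_budget=1):
    # Toy rule: spend fewer expansions on nodes the value model already
    # rates confidently (far from 0.5) and on deeper nodes.
    confidence = abs(value_estimate - 0.5) * 2      # 0 = uncertain, 1 = confident
    budget = base_budget * (1 - confidence) / (1 + depth)
    return max(min_budget, math.ceil(budget))
```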
- Monte-Carlo Tree Search for Multi-Agent Pathfinding: Preliminary Results [60.4817465598352]
We introduce an original variant of Monte-Carlo Tree Search (MCTS) tailored to multi-agent pathfinding.
Specifically, we use individual paths to assist the agents with goal-reaching behavior.
We also use a dedicated decomposition technique to reduce the branching factor of the tree search procedure.
- Proof Number Based Monte-Carlo Tree Search [1.93674821880689]
This paper proposes a new game-search algorithm, PN-MCTS, which combines Monte-Carlo Tree Search (MCTS) and Proof-Number Search (PNS).
We define three areas where the additional knowledge provided by the proof and disproof numbers gathered in MCTS trees might be used.
Experiments show that PN-MCTS is able to outperform MCTS in all tested game domains, achieving win rates up to 96.2% for Lines of Action.
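The snippet does not spell out how proof and disproof numbers enter the MCTS selection step. One natural reading, shown below as a sketch only, is to add a bonus to the UCT score for children whose proof number is small (i.e. close to being proven wins); the parameter names and the weighting are assumptions.

```python
import math

def pn_uct_score(wins, visits, proof_number, parent_visits, c=1.4, pn_weight=0.5):
    # UCT augmented with a proof-number bias (illustrative only).
    # A small proof number means the node is close to being proven a win,
    # so it receives a slightly larger bonus.
    if visits == 0:
        return float("inf")
    exploit = wins / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore + pn_weight / (1 + proof_number)
```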
arXiv Detail & Related papers (2023-03-16T16:27:07Z) - Alphazzle: Jigsaw Puzzle Solver with Deep Monte-Carlo Tree Search [30.43614740245788]
We introduce Alphazzle, a reassembly algorithm based on single-player Monte Carlo Tree Search (MCTS).
A major difference with DRL algorithms lies in the unavailability of game reward for MCTS.
We perform an in-depth ablation study that shows the importance of MCTS and the neural networks working together.
- Learning Contextual Bandits Through Perturbed Rewards [107.6210145983805]
We show that a $\tilde{O}(\tilde{d}\sqrt{T})$ regret upper bound is still achievable under standard regularity conditions.
We perturb the rewards when updating the neural network to eliminate the need of explicit exploration.
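The mechanism described here is to randomize the regression targets rather than to act optimistically. A minimal sketch of that idea follows; the noise scale `sigma`, its lack of a schedule, and the function name are assumptions made for illustration.

```python
import numpy as np

def perturbed_targets(rewards, sigma=1.0, rng=None):
    # Add Gaussian noise to observed rewards before fitting the network,
    # so that acting greedily on the fitted model still explores.
    rng = rng or np.random.default_rng()
    return np.asarray(rewards, dtype=float) + sigma * rng.standard_normal(len(rewards))
```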
- On the Evolution of the MCTS Upper Confidence Bounds for Trees by Means of Evolutionary Algorithms in the Game of Carcassonne [0.0]
Monte Carlo Tree Search (MCTS) is a sampling-based best-first method for finding optimal decisions.
We use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal to substitute the Upper Confidence Bounds for Trees (UCT) mathematical expression.
We show that the ES-MCTS controller is able to outperform all 10 of these intelligent controllers, including robust UCT controllers.
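For context, the baseline expression the evolved formulas aim to replace is the standard UCT rule: the mean reward of a child plus a log-visit exploration bonus. A reference implementation of plain UCT, not of the evolved expressions, is:

```python
import math

def uct(mean_reward, parent_visits, child_visits, c=math.sqrt(2)):
    # Standard UCT: exploitation term plus log-visit exploration bonus.
    if child_visits == 0:
        return float("inf")
    return mean_reward + c * math.sqrt(math.log(parent_visits) / child_visits)
```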
- Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search [66.34387649910046]
Monte Carlo tree search (MCTS) has achieved state-of-the-art results in many domains such as Go and Atari games.
We propose to achieve this goal by predicting the uncertainty of the current search status and using the result to decide whether to stop searching.
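The paper learns when to stop rather than hand-coding a rule. As a hedged illustration only, a thresholded version of the same idea might look like the sketch below; the threshold and the function signature are assumptions.

```python
def should_stop(predicted_uncertainty, simulations_done, max_simulations, threshold=0.1):
    # Toy dynamic-stopping rule: end the search once the predicted uncertainty
    # about the best move falls below a threshold, or the budget runs out.
    return predicted_uncertainty < threshold or simulations_done >= max_simulations
```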
- On Effective Parallelization of Monte Carlo Tree Search [51.15940034629022]
Monte Carlo Tree Search (MCTS) is computationally expensive as it requires a substantial number of rollouts to construct the search tree.
How to design effective parallel MCTS algorithms has not been systematically studied and remains poorly understood.
We demonstrate how the proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.
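The summary does not state the proposed conditions, but analyses of parallel MCTS typically reason about mechanisms such as the virtual loss used in tree parallelization. The sketch below shows that standard device, not this paper's proposal; the class layout is illustrative.

```python
import threading

class NodeStats:
    # Per-node statistics with a virtual loss, the common device that keeps
    # concurrent workers from piling onto the same path. Illustrative only.
    def __init__(self):
        self.lock = threading.Lock()
        self.visits = 0
        self.value_sum = 0.0
        self.virtual_loss = 0

    def apply_virtual_loss(self, amount=1):
        with self.lock:
            self.virtual_loss += amount  # temporarily discourage other workers

    def backup(self, value, amount=1):
        with self.lock:
            self.virtual_loss -= amount  # remove the placeholder loss
            self.visits += 1
            self.value_sum += value

    def mean_value(self):
        with self.lock:
            # pending virtual losses count as pessimistic (zero-value) visits
            n = self.visits + self.virtual_loss
            return self.value_sum / n if n else 0.0
```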
- StarCraft II Build Order Optimization using Deep Reinforcement Learning and Monte-Carlo Tree Search [0.0]
This study examines the use of an agent based on the Monte-Carlo Tree Search algorithm for optimizing the build order in StarCraft II.
It discusses how its performance can be improved even further by combining it with a deep reinforcement learning neural network.
- Single-Agent Optimization Through Policy Iteration Using Monte-Carlo Tree Search [8.22379888383833]
The combination of Monte-Carlo Tree Search (MCTS) and deep reinforcement learning is state-of-the-art in two-player perfect-information games.
We describe a search algorithm that uses a variant of MCTS which we enhanced by 1) a novel action value normalization mechanism for games with potentially unbounded rewards, 2) defining a virtual loss function that enables effective search parallelization, and 3) a policy network, trained by generations of self-play, to guide the search.
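Of the three enhancements, the action value normalization can be illustrated compactly: one standard approach for unbounded rewards is to rescale Q-values by the running minimum and maximum observed in the current tree. The sketch below follows that common scheme and may not match the paper's exact mechanism.

```python
class MinMaxStats:
    # Running min/max over action values seen during the current search,
    # used to rescale Q-values into [0, 1] when rewards are unbounded.
    def __init__(self):
        self.minimum = float("inf")
        self.maximum = float("-inf")

    def update(self, value):
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def normalize(self, value):
        if self.maximum > self.minimum:
            return (value - self.minimum) / (self.maximum - self.minimum)
        return value  # no spread observed yet; leave the value unscaled
```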
- AutoML-Zero: Evolving Machine Learning Algorithms From Scratch [76.83052807776276]
We show that it is possible to automatically discover complete machine learning algorithms just using basic mathematical operations as building blocks.
We demonstrate this by introducing a novel framework that significantly reduces human bias through a generic search space.
We believe these preliminary successes in discovering machine learning algorithms from scratch indicate a promising new direction in the field.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.