On Effective Parallelization of Monte Carlo Tree Search
- URL: http://arxiv.org/abs/2006.08785v2
- Date: Sun, 4 Oct 2020 21:13:55 GMT
- Title: On Effective Parallelization of Monte Carlo Tree Search
- Authors: Anji Liu, Yitao Liang, Ji Liu, Guy Van den Broeck, and Jianshu Chen
- Abstract summary: Monte Carlo Tree Search (MCTS) is computationally expensive as it requires a substantial number of rollouts to construct the search tree.
How to design effective parallel MCTS algorithms has not been systematically studied and remains poorly understood.
We demonstrate how the proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.
- Score: 51.15940034629022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite its groundbreaking success in Go and computer games, Monte Carlo Tree
Search (MCTS) is computationally expensive as it requires a substantial number
of rollouts to construct the search tree, which calls for effective
parallelization. However, how to design effective parallel MCTS algorithms has
not been systematically studied and remains poorly understood. In this paper,
we seek to lay its first theoretical foundation, by examining the potential
performance loss caused by parallelization when achieving a desired speedup. In
particular, we discover the necessary conditions of achieving a desirable
parallelization performance, and highlight two of their practical benefits.
First, by examining whether existing parallel MCTS algorithms satisfy these
conditions, we identify key design principles that should be inherited by
future algorithms, for example tracking the unobserved samples (used in WU-UCT
(Liu et al., 2020)). We theoretically establish this essential design
facilitates $\mathcal{O} ( \ln n + M / \sqrt{\ln n} )$ cumulative regret when
the maximum tree depth is 2, where $n$ is the number of rollouts and $M$ is the
number of workers. A regret of this form is highly desirable, as compared to
$\mathcal{O} ( \ln n )$ regret incurred by a sequential counterpart, its excess
part approaches zero as $n$ increases. Second, and more importantly, we
demonstrate how the proposed necessary conditions can be adopted to design more
effective parallel MCTS algorithms. To illustrate this, we propose a new
parallel MCTS algorithm, called BU-UCT, by following our theoretical
guidelines. The newly proposed algorithm, albeit preliminary, outperforms four
competitive baselines on 11 out of 15 Atari games. We hope our theoretical
results could inspire future work of more effective parallel MCTS.
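The design principle highlighted in the abstract, tracking unobserved samples as in WU-UCT, can be sketched as a modified UCB selection rule in which rollouts that have been dispatched to workers but have not yet returned still count toward visit totals, shrinking the exploration bonus of nodes already being explored in-flight. The sketch below is illustrative only; the field names and the exploration constant `c` are assumptions, not the paper's exact notation:

```python
import math

def wu_ucb_score(child, total_visits, total_unobserved, c=1.414):
    """UCB score that also counts rollouts dispatched to workers but not
    yet returned ("unobserved samples"), in the spirit of WU-UCT.
    `child` is a dict with 'visits', 'unobserved', and 'value_sum' fields."""
    n = child["visits"] + child["unobserved"]  # completed + in-flight rollouts
    if n == 0:
        return float("inf")                    # force initial exploration
    exploit = child["value_sum"] / max(child["visits"], 1)
    explore = c * math.sqrt(math.log(total_visits + total_unobserved) / n)
    return exploit + explore
```

A node with in-flight rollouts thus receives a smaller exploration bonus than an otherwise identical node without them, discouraging the workers from redundantly piling onto the same branch.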
Related papers
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z) - Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning [15.46907000938726]
We present the first study on provably efficient randomized exploration in cooperative multi-agent reinforcement learning (MARL).
We propose a unified algorithm framework for randomized exploration in parallel Markov Decision Processes (MDPs), and two Thompson Sampling (TS)-type algorithms, CoopTS-PHE and CoopTS-LMC.
We evaluate our proposed method on multiple parallel RL environments, including a deep exploration problem (i.e., $N$-chain), a video game, and a real-world problem in energy systems.
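The Thompson Sampling core that CoopTS-PHE and CoopTS-LMC build on can be illustrated on a Bernoulli bandit: sample a mean from each arm's posterior, act greedily on the samples, and update. This is a generic single-agent TS step under assumed Beta posteriors, not the paper's perturbed-history or Langevin variants:

```python
import random

def thompson_step(alpha, beta_, rewards):
    """One round of Thompson Sampling on a Bernoulli bandit.
    alpha/beta_ hold each arm's Beta posterior parameters; `rewards`
    maps a chosen arm index to an observed {0, 1} reward."""
    samples = [random.betavariate(a, b) for a, b in zip(alpha, beta_)]
    arm = max(range(len(samples)), key=samples.__getitem__)  # greedy on samples
    r = rewards(arm)
    alpha[arm] += r            # Beta posterior update
    beta_[arm] += 1 - r
    return arm
```

Run for enough rounds, the posterior of the better arm concentrates and that arm is pulled almost exclusively; the cooperative variants in the paper coordinate such sampling across parallel MDPs.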
arXiv Detail & Related papers (2024-04-16T17:01:38Z) - No Compromise in Solution Quality: Speeding Up Belief-dependent Continuous POMDPs via Adaptive Multilevel Simplification [6.300736240833814]
Continuous POMDPs with general belief-dependent rewards are notoriously difficult to solve online.
We present a complete provable theory of adaptive multilevel simplification for the setting of a given externally constructed belief tree.
We present three algorithms to accelerate continuous POMDP online planning with belief-dependent rewards.
arXiv Detail & Related papers (2023-10-16T10:59:22Z) - Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions [89.89612827542972]
This paper proposes a variant of Monte-Carlo tree search (MCTS) that spends more search time on harder states and less search time on simpler states adaptively.
We evaluate the performance and computations on $9 \times 9$ Go board games and Atari games.
Experiments show that our method achieves performance comparable to the original search algorithm while requiring less than $50\%$ search time on average.
arXiv Detail & Related papers (2022-10-23T06:39:20Z) - Best of Both Worlds: Practical and Theoretically Optimal Submodular Maximization in Parallel [17.462968952951883]
The main algorithm is assembled from two components, which may be of independent interest.
A variant of LINEARSEQ is shown to have adaptive complexity of $O(\log(n))$, smaller than that of any previous algorithm in the literature.
arXiv Detail & Related papers (2021-11-15T17:10:40Z) - Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called under-bagging $k$-NN for imbalanced classification problems.
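The under-bagging idea can be sketched concretely: each bag keeps all minority-class points plus an equal-sized random subsample of the majority class, runs plain $k$-NN on the balanced bag, and the bags vote. A toy 1-D version, illustrative only (the paper's contribution is the statistical analysis of this estimator, which this sketch does not reproduce):

```python
import random
from collections import Counter

def under_bagged_knn_predict(X, y, query, k=3, n_bags=5, rng=None):
    """Under-bagging k-NN sketch: balanced bags of all minority points plus
    a matching random subsample of majority points, majority vote over bags.
    X is a list of scalars; y a list of class labels."""
    rng = rng or random.Random(0)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    min_idx = [i for i, lbl in enumerate(y) if lbl == minority]
    maj_idx = [i for i, lbl in enumerate(y) if lbl != minority]
    votes = []
    for _ in range(n_bags):
        bag = min_idx + rng.sample(maj_idx, len(min_idx))   # balanced bag
        nearest = sorted(bag, key=lambda i: abs(X[i] - query))[:k]
        votes.append(Counter(y[i] for i in nearest).most_common(1)[0][0])
    return Counter(votes).most_common(1)[0][0]
```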
arXiv Detail & Related papers (2021-09-01T14:10:38Z) - Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player against smooth algorithms deployed by an adversary.
Our algorithm is guaranteed to make monotonic progress having no limit cycles, and to find an appropriate number of gradient ascents.
arXiv Detail & Related papers (2021-06-02T22:03:36Z) - Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering [6.4805900740861]
We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN$*$.
We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time).
Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13--55.89x, and existing parallel algorithms by at least an order of magnitude.
arXiv Detail & Related papers (2021-04-02T16:05:00Z) - Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms.
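The gradient-expectation connection referred to here is the standard exponential-family identity: for a Gibbs distribution over trees $p(t) \propto \exp(\sum_{e \in t} \theta_e)$, an edge's marginal probability equals $\partial \log Z / \partial \theta_e$. A brute-force toy check by enumeration plus finite differences (not the paper's efficient algorithms, which obtain the gradient via the Matrix-Tree theorem):

```python
import math

def edge_marginals_via_gradient(edge_scores, trees, eps=1e-6):
    """Marginal probability of each edge under p(t) ∝ exp(sum of edge scores),
    computed as a finite-difference gradient of log Z. `trees` enumerates the
    support as sets of edge keys; feasible only for tiny toy spaces."""
    def log_z(scores):
        return math.log(sum(math.exp(sum(scores[e] for e in t)) for t in trees))
    marginals = {}
    for e in edge_scores:
        bumped = dict(edge_scores)
        bumped[e] += eps
        marginals[e] = (log_z(bumped) - log_z(edge_scores)) / eps
    return marginals
```

On the triangle graph with uniform scores, each edge appears in 2 of the 3 spanning trees, so each marginal is $2/3$ and the marginals sum to 2 (every tree has 2 edges).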
arXiv Detail & Related papers (2020-08-29T14:58:26Z) - Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design [7.992550355579791]
We propose a novel massively parallel Monte-Carlo Tree Search (MP-MCTS) algorithm that works efficiently for 1,000 worker scale, and apply it to molecular design.
MP-MCTS maintains the search quality at larger scale, and by running MP-MCTS on 256 CPU cores for only 10 minutes, we obtained candidate molecules having similar score to non-parallel MCTS running for 42 hours.
arXiv Detail & Related papers (2020-06-18T13:23:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.