Accelerating Monte Carlo Tree Search with Probability Tree State
Abstraction
- URL: http://arxiv.org/abs/2310.06513v1
- Date: Tue, 10 Oct 2023 10:55:12 GMT
- Title: Accelerating Monte Carlo Tree Search with Probability Tree State
Abstraction
- Authors: Yangqing Fu, Ming Sun, Buqing Nie, Yue Gao
- Abstract summary: We propose a novel probability tree state abstraction (PTSA) algorithm to improve the search efficiency of Monte Carlo Tree Search (MCTS)
A general tree state abstraction with path transitivity is defined. In addition, the probability tree state abstraction is proposed for fewer mistakes during the aggregation step.
Experimental results on different tasks demonstrate that our method can accelerate the training process of state-of-the-art algorithms with 10%-45% search space reduction.
- Score: 11.49169644917995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monte Carlo Tree Search (MCTS) algorithms such as AlphaGo and MuZero have
achieved superhuman performance in many challenging tasks. However, the
computational complexity of MCTS-based algorithms is influenced by the size of
the search space. To address this issue, we propose a novel probability tree
state abstraction (PTSA) algorithm to improve the search efficiency of MCTS. A
general tree state abstraction with path transitivity is defined. In addition,
the probability tree state abstraction is proposed for fewer mistakes during
the aggregation step. Furthermore, the theoretical guarantees of the
transitivity and aggregation error bound are justified. To evaluate the
effectiveness of the PTSA algorithm, we integrate it with state-of-the-art
MCTS-based algorithms, such as Sampled MuZero and Gumbel MuZero. Experimental
results on different tasks demonstrate that our method can accelerate the
training process of state-of-the-art algorithms with 10%-45% search space
reduction.
Related papers
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z) - An optimized quantum minimum searching algorithm with sure-success
probability and its experiment simulation with Cirq [4.987556615430226]
We propose an optimized quantum minimum searching algorithm with sure-success probability.
The performance evaluation shows that our algorithm has higher accuracy and efficiency than DHA algorithm.
arXiv Detail & Related papers (2023-09-25T14:07:27Z) - Provably Efficient Representation Learning with Tractable Planning in
Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs)
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU)
We then show how to adapt this algorithm to also work in the broader class of $gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z) - A Metaheuristic Algorithm for Large Maximum Weight Independent Set
Problems [58.348679046591265]
Given a node-weighted graph, find a set of independent (mutually nonadjacent) nodes whose node-weight sum is maximum.
Some of the graphs airsing in this application are large, having hundreds of thousands of nodes and hundreds of millions of edges.
We develop a new local search algorithm, which is a metaheuristic in the greedy randomized adaptive search framework.
arXiv Detail & Related papers (2022-03-28T21:34:16Z) - CITS: Coherent Ising Tree Search Algorithm Towards Solving Combinatorial
Optimization Problems [0.0]
This paper proposes a search algorithm by expanding search space from a Markov chain to a depth limited tree based on SA.
At each iteration, the algorithm will select the best near-optimal solution within the feasible search space by exploring along the tree in the sense of look ahead'
Our results show that above the primals SA and CIM, our high-level tree search strategy is able to provide solutions within fewer epochs for Ising formulated NP-optimization problems.
arXiv Detail & Related papers (2022-03-09T10:07:26Z) - Practical, Provably-Correct Interactive Learning in the Realizable
Setting: The Power of True Believers [12.09273192079783]
We consider interactive learning in the realizable setting and develop a general framework to handle problems ranging from best arm identification to active classification.
We design novel computationally efficient algorithms for the realizable setting that match the minimax lower bound up to logarithmic factors.
arXiv Detail & Related papers (2021-11-09T02:33:36Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms.
arXiv Detail & Related papers (2020-08-29T14:58:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.