Evolving the MCTS Upper Confidence Bounds for Trees Using a
Semantic-inspired Evolutionary Algorithm in the Game of Carcassonne
- URL: http://arxiv.org/abs/2208.13589v1
- Date: Mon, 29 Aug 2022 13:31:06 GMT
- Title: Evolving the MCTS Upper Confidence Bounds for Trees Using a
Semantic-inspired Evolutionary Algorithm in the Game of Carcassonne
- Authors: Edgar Galv\'an, Gavin Simpson, and Fred Valdez Ameneyro
- Abstract summary: We propose a Semantic-inspired Evolutionary Algorithm in Monte Carlo Tree Search (MCTS)
We use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal to substitute the Upper Confidence Bounds for Trees formula.
We show how SIEA-MCTS is able to successfully evolve mathematical expressions that yield better or competitive results compared to UCT without the need of tuning these evolved expressions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monte Carlo Tree Search (MCTS) is a sampling best-first method to search for
optimal decisions. The success of MCTS depends heavily on how the tree is built
and the selection process plays a fundamental role in this. One particular
selection mechanism that has proved to be reliable is based on the Upper
Confidence Bounds for Trees (UCT). The UCT attempts to balance exploration and
exploitation by considering the values stored in the statistical tree of the
MCTS. However, some tuning of the MCTS UCT is necessary for this to work well.
In this work, we use Evolutionary Algorithms (EAs) to evolve mathematical
expressions with the goal to substitute the UCT formula and use the evolved
expressions in MCTS. More specifically, we evolve expressions by means of our
proposed Semantic-inspired Evolutionary Algorithm in MCTS approach (SIEA-MCTS).
This is inspired by semantics in Genetic Programming (GP), where the use of
fitness cases is seen as a requirement to be adopted in GP. Fitness cases are
normally used to determine the fitness of individuals and can be used to
compute the semantic similarity (or dissimilarity) of individuals. However,
fitness cases are not available in MCTS. We extend this notion by using
multiple reward values from MCTS that allow us to determine both the fitness of
an individual and its semantics. By doing so, we show how SIEA-MCTS is able to
successfully evolve mathematical expressions that yield better or competitive
results compared to UCT without the need of tuning these evolved expressions.
We compare the performance of the proposed SIEA-MCTS against MCTS algorithms,
MCTS Rapid Action Value Estimation algorithms, three variants of the *-minimax
family of algorithms, a random controller and two more EA approaches. We
consistently show how SIEA-MCTS outperforms most of these intelligent
controllers in the challenging game of Carcassonne.
Related papers
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z) - Optimized Monte Carlo Tree Search for Enhanced Decision Making in the FrozenLake Environment [0.0]
Monte Carlo Tree Search (MCTS) is a powerful algorithm for solving complex decision-making problems.
This paper presents an optimized MCTS implementation applied to the FrozenLake environment, a classic reinforcement learning task.
arXiv Detail & Related papers (2024-09-25T05:04:53Z) - An Analysis on the Effects of Evolving the Monte Carlo Tree Search Upper
Confidence for Trees Selection Policy on Unimodal, Multimodal and Deceptive
Landscapes [0.0]
The Monte Carlo Tree Search (MCTS) is a best-first sampling method employed in the search for optimal decisions.
A selection policy that works particularly well in MCTS is the Upper Confidence Bounds for Trees, referred to as UCT.
This work explores the use of five functions of different natures, ranging from unimodal to multimodal and deceptive functions.
arXiv Detail & Related papers (2023-11-21T20:40:34Z) - Bayesian Decision Trees Inspired from Evolutionary Algorithms [64.80360020499555]
We propose a replacement of the Markov Chain Monte Carlo (MCMC) with an inherently parallel algorithm, the Sequential Monte Carlo (SMC)
Experiments show that SMC combined with the Evolutionary Algorithms (EA) can produce more accurate results compared to MCMC in 100 times fewer iterations.
arXiv Detail & Related papers (2023-05-30T06:17:35Z) - Towards Understanding the Effects of Evolving the MCTS UCT Selection
Policy [0.0]
Upper Confidence Bounds for Trees (UCT) is widely used in Monte Carlo Tree Search (MCTS)
We show how the evolution of the UCT might be beneficial in multimodal and deceptive scenarios.
arXiv Detail & Related papers (2023-02-07T09:50:55Z) - A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive
Coding Networks [65.34977803841007]
Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience.
We show how by simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one.
arXiv Detail & Related papers (2022-11-16T00:11:04Z) - Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and
Stability [67.8426046908398]
Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world.
This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions.
arXiv Detail & Related papers (2022-04-08T20:46:16Z) - On the Evolution of the MCTS Upper Confidence Bounds for Trees by Means
of Evolutionary Algorithms in the Game of Carcassonne [0.0]
Monte Carlo Tree Search (MCTS) is a sampling best-first method to search for optimal decisions.
We use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal to substitute the Upper Confidence Bounds for Trees (UCT) mathematical expression.
We show how the ES-MCTS controller, is able to outperform all these 10 intelligent controllers, including robust UCT controllers.
arXiv Detail & Related papers (2021-12-17T18:06:21Z) - Monte Carlo Tree Search: A Review of Recent Modifications and
Applications [0.17205106391379024]
Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems.
The method relies on intelligent tree search that balances exploration and exploitation.
The method has become a state-of-the-art technique for games, however, in more complex games.
arXiv Detail & Related papers (2021-03-08T17:44:15Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK)
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z) - On Effective Parallelization of Monte Carlo Tree Search [51.15940034629022]
Monte Carlo Tree Search (MCTS) is computationally expensive as it requires a substantial number of rollouts to construct the search tree.
How to design effective parallel MCTS algorithms has not been systematically studied and remains poorly understood.
We demonstrate how proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.
arXiv Detail & Related papers (2020-06-15T21:36:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.