Related papers: Evolving the MCTS Upper Confidence Bounds for Trees Using a Semantic-inspired Evolutionary Algorithm in the Game of Carcassonne

Evolving the MCTS Upper Confidence Bounds for Trees Using a Semantic-inspired Evolutionary Algorithm in the Game of Carcassonne

URL: http://arxiv.org/abs/2208.13589v1
Date: Mon, 29 Aug 2022 13:31:06 GMT
Title: Evolving the MCTS Upper Confidence Bounds for Trees Using a Semantic-inspired Evolutionary Algorithm in the Game of Carcassonne
Authors: Edgar Galv\'an, Gavin Simpson, and Fred Valdez Ameneyro
Abstract summary: We propose a Semantic-inspired Evolutionary Algorithm in Monte Carlo Tree Search (MCTS) We use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal to substitute the Upper Confidence Bounds for Trees formula. We show how SIEA-MCTS is able to successfully evolve mathematical expressions that yield better or competitive results compared to UCT without the need of tuning these evolved expressions.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Monte Carlo Tree Search (MCTS) is a sampling best-first method to search for optimal decisions. The success of MCTS depends heavily on how the tree is built and the selection process plays a fundamental role in this. One particular selection mechanism that has proved to be reliable is based on the Upper Confidence Bounds for Trees (UCT). The UCT attempts to balance exploration and exploitation by considering the values stored in the statistical tree of the MCTS. However, some tuning of the MCTS UCT is necessary for this to work well. In this work, we use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal to substitute the UCT formula and use the evolved expressions in MCTS. More specifically, we evolve expressions by means of our proposed Semantic-inspired Evolutionary Algorithm in MCTS approach (SIEA-MCTS). This is inspired by semantics in Genetic Programming (GP), where the use of fitness cases is seen as a requirement to be adopted in GP. Fitness cases are normally used to determine the fitness of individuals and can be used to compute the semantic similarity (or dissimilarity) of individuals. However, fitness cases are not available in MCTS. We extend this notion by using multiple reward values from MCTS that allow us to determine both the fitness of an individual and its semantics. By doing so, we show how SIEA-MCTS is able to successfully evolve mathematical expressions that yield better or competitive results compared to UCT without the need of tuning these evolved expressions. We compare the performance of the proposed SIEA-MCTS against MCTS algorithms, MCTS Rapid Action Value Estimation algorithms, three variants of the *-minimax family of algorithms, a random controller and two more EA approaches. We consistently show how SIEA-MCTS outperforms most of these intelligent controllers in the challenging game of Carcassonne.

Related papers

Sample, Don't Search: Rethinking Test-Time Alignment for Language Models [55.2480439325792]
We introduce QAlign, a new test-time alignment approach. As we scale test-time compute, QAlign converges to sampling from the optimal aligned distribution for each individual prompt. By adopting recent advances in Markov chain Monte Carlo for text generation, our method enables better-aligned outputs without modifying the underlying model or even requiring logit access.
arXiv Detail & Related papers (2025-04-04T00:41:40Z)
MetaScale: Test-Time Scaling with Evolving Meta-Thoughts [51.35594569020857]
Experimental results demonstrate that MetaScale consistently outperforms standard inference approaches. METASCALE scales more effectively with increasing sampling budgets and produces more structured, expert-level responses.
arXiv Detail & Related papers (2025-03-17T17:59:54Z)
A Top-down Graph-based Tool for Modeling Classical Semantic Maps: A Crosslinguistic Case Study of Supplementary Adverbs [50.982315553104975]
Semantic map models (SMMs) construct a network-like conceptual space from cross-linguistic instances or forms. Most SMMs are manually built by human experts using bottom-up procedures. We propose a novel graph-based algorithm that automatically generates conceptual spaces and SMMs in a top-down manner.
arXiv Detail & Related papers (2024-12-02T12:06:41Z)
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path. The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
Optimized Monte Carlo Tree Search for Enhanced Decision Making in the FrozenLake Environment [0.0]
Monte Carlo Tree Search (MCTS) is a powerful algorithm for solving complex decision-making problems. This paper presents an optimized MCTS implementation applied to the FrozenLake environment, a classic reinforcement learning task.
arXiv Detail & Related papers (2024-09-25T05:04:53Z)
An Analysis on the Effects of Evolving the Monte Carlo Tree Search Upper Confidence for Trees Selection Policy on Unimodal, Multimodal and Deceptive Landscapes [0.0]
The Monte Carlo Tree Search (MCTS) is a best-first sampling method employed in the search for optimal decisions. A selection policy that works particularly well in MCTS is the Upper Confidence Bounds for Trees, referred to as UCT. This work explores the use of five functions of different natures, ranging from unimodal to multimodal and deceptive functions.
arXiv Detail & Related papers (2023-11-21T20:40:34Z)
Bayesian Decision Trees Inspired from Evolutionary Algorithms [64.80360020499555]
We propose a replacement of the Markov Chain Monte Carlo (MCMC) with an inherently parallel algorithm, the Sequential Monte Carlo (SMC) Experiments show that SMC combined with the Evolutionary Algorithms (EA) can produce more accurate results compared to MCMC in 100 times fewer iterations.
arXiv Detail & Related papers (2023-05-30T06:17:35Z)
Towards Understanding the Effects of Evolving the MCTS UCT Selection Policy [0.0]
Upper Confidence Bounds for Trees (UCT) is widely used in Monte Carlo Tree Search (MCTS) We show how the evolution of the UCT might be beneficial in multimodal and deceptive scenarios.
arXiv Detail & Related papers (2023-02-07T09:50:55Z)
A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks [65.34977803841007]
Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience. We show how by simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one.
arXiv Detail & Related papers (2022-11-16T00:11:04Z)
Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and Stability [67.8426046908398]
Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world. This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions.
arXiv Detail & Related papers (2022-04-08T20:46:16Z)
On the Evolution of the MCTS Upper Confidence Bounds for Trees by Means of Evolutionary Algorithms in the Game of Carcassonne [0.0]
Monte Carlo Tree Search (MCTS) is a sampling best-first method to search for optimal decisions. We use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal to substitute the Upper Confidence Bounds for Trees (UCT) mathematical expression. We show how the ES-MCTS controller, is able to outperform all these 10 intelligent controllers, including robust UCT controllers.
arXiv Detail & Related papers (2021-12-17T18:06:21Z)
Monte Carlo Tree Search: A Review of Recent Modifications and Applications [0.17205106391379024]
Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems. The method relies on intelligent tree search that balances exploration and exploitation. The method has become a state-of-the-art technique for games, however, in more complex games.
arXiv Detail & Related papers (2021-03-08T17:44:15Z)
Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK) Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework. We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
On Effective Parallelization of Monte Carlo Tree Search [51.15940034629022]
Monte Carlo Tree Search (MCTS) is computationally expensive as it requires a substantial number of rollouts to construct the search tree. How to design effective parallel MCTS algorithms has not been systematically studied and remains poorly understood. We demonstrate how proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.
arXiv Detail & Related papers (2020-06-15T21:36:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.