On the Evolution of the MCTS Upper Confidence Bounds for Trees by Means
of Evolutionary Algorithms in the Game of Carcassonne
- URL: http://arxiv.org/abs/2112.09697v1
- Date: Fri, 17 Dec 2021 18:06:21 GMT
- Title: On the Evolution of the MCTS Upper Confidence Bounds for Trees by Means
of Evolutionary Algorithms in the Game of Carcassonne
- Authors: Edgar Galván and Gavin Simpson
- Abstract summary: Monte Carlo Tree Search (MCTS) is a sampling-based best-first method for searching for optimal decisions.
We use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal of substituting the Upper Confidence Bounds for Trees (UCT) mathematical expression.
We show how the ES-MCTS controller is able to outperform all these 10 intelligent controllers, including robust UCT controllers.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monte Carlo Tree Search (MCTS) is a sampling-based best-first method for
searching for optimal decisions. The popularity of MCTS is based on its extraordinary
results in the challenging two-player game of Go, a game considered much harder than
Chess and that until very recently was considered infeasible for Artificial
Intelligence methods. The success of MCTS depends heavily on how the tree is
built, and the selection process plays a fundamental role in this. One
particular selection mechanism that has proved to be reliable is based on
Upper Confidence Bounds for Trees, commonly referred to as UCT. UCT attempts
to balance exploration and exploitation by considering the values stored
in the statistical tree of the MCTS. However, some tuning of the MCTS UCT is
necessary for this to work well. In this work, we use Evolutionary Algorithms
(EAs) to evolve mathematical expressions with the goal of substituting the UCT
mathematical expression. We compare our proposed approach, called Evolution
Strategy in MCTS (ES-MCTS), against five variants of the MCTS UCT, three
variants of the star-minimax family of algorithms, as well as a random
controller in the Game of Carcassonne. We also use a variant of our proposed
EA-based controller, dubbed ES partially integrated in MCTS. We show how the
ES-MCTS controller is able to outperform all these 10 intelligent controllers,
including robust MCTS UCT controllers.
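For context, the standard UCT selection rule that the evolved expressions are intended to replace can be sketched as below. This is a minimal Python sketch, assuming commonly used UCT statistics; the Node fields and function names are illustrative and not taken from the paper.

    import math
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        # Illustrative per-node statistics; names are not taken from the paper.
        value_sum: float = 0.0   # sum of rollout rewards backed up through this node
        visits: int = 0          # number of times this node has been visited
        children: list = field(default_factory=list)

    def uct_score(child: Node, parent_visits: int, c: float = math.sqrt(2)) -> float:
        """Mean reward (exploitation) plus an exploration bonus that grows for
        rarely visited children; c is the constant that normally needs tuning."""
        if child.visits == 0:
            return float("inf")  # unvisited children are expanded first
        exploitation = child.value_sum / child.visits
        exploration = c * math.sqrt(math.log(parent_visits) / child.visits)
        return exploitation + exploration

    def select_child(parent: Node) -> Node:
        # MCTS selection step: descend to the child with the highest UCT score.
        return max(parent.children, key=lambda ch: uct_score(ch, parent.visits))

The paper's EAs evolve alternative mathematical expressions to use in place of the exploitation-plus-exploration formula above, with the aim of avoiding the manual tuning the standard formula requires.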
Related papers
- Can Large Language Models Play Games? A Case Study of A Self-Play
Approach [61.15761840203145]
Large Language Models (LLMs) harness extensive data from the Internet, storing a broad spectrum of prior knowledge.
Monte-Carlo Tree Search (MCTS) is a search algorithm that provides reliable decision-making solutions.
This work introduces an innovative approach that bolsters LLMs with MCTS self-play to efficiently resolve turn-based zero-sum games.
arXiv Detail & Related papers (2024-03-08T19:16:29Z)
- An Analysis on the Effects of Evolving the Monte Carlo Tree Search Upper
Confidence for Trees Selection Policy on Unimodal, Multimodal and Deceptive
Landscapes [0.0]
The Monte Carlo Tree Search (MCTS) is a best-first sampling method employed in the search for optimal decisions.
A selection policy that works particularly well in MCTS is the Upper Confidence Bounds for Trees, referred to as UCT.
This work explores the use of five functions of different natures, ranging from unimodal to multimodal and deceptive functions.
arXiv Detail & Related papers (2023-11-21T20:40:34Z)
- Bayesian Decision Trees Inspired from Evolutionary Algorithms [64.80360020499555]
We propose replacing Markov Chain Monte Carlo (MCMC) with an inherently parallel algorithm, Sequential Monte Carlo (SMC).
Experiments show that SMC combined with Evolutionary Algorithms (EAs) can produce more accurate results than MCMC in 100 times fewer iterations.
arXiv Detail & Related papers (2023-05-30T06:17:35Z)
- Proof Number Based Monte-Carlo Tree Search [1.93674821880689]
This paper proposes a new game-search algorithm, PN-MCTS, which combines Monte-Carlo Tree Search (MCTS) and Proof-Number Search (PNS).
We define three areas where the additional knowledge provided by the proof and disproof numbers gathered in MCTS trees might be used.
Experiments show that PN-MCTS is able to outperform MCTS in all tested game domains, achieving win rates up to 96.2% for Lines of Action.
arXiv Detail & Related papers (2023-03-16T16:27:07Z)
- Towards Understanding the Effects of Evolving the MCTS UCT Selection
Policy [0.0]
Upper Confidence Bounds for Trees (UCT) is widely used in Monte Carlo Tree Search (MCTS).
We show how the evolution of the UCT might be beneficial in multimodal and deceptive scenarios.
arXiv Detail & Related papers (2023-02-07T09:50:55Z)
- Evolving the MCTS Upper Confidence Bounds for Trees Using a
Semantic-inspired Evolutionary Algorithm in the Game of Carcassonne [0.0]
We propose a Semantic-inspired Evolutionary Algorithm in Monte Carlo Tree Search (MCTS).
We use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal of substituting the Upper Confidence Bounds for Trees formula.
We show how SIEA-MCTS is able to successfully evolve mathematical expressions that yield better or competitive results compared to UCT, without the need to tune these evolved expressions.
arXiv Detail & Related papers (2022-08-29T13:31:06Z)
- Provably Efficient Fictitious Play Policy Optimization for Zero-Sum
Markov Games with Structured Transitions [145.54544979467872]
We propose and analyze new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions.
We prove tight $\widetilde{\mathcal{O}}(\sqrt{K})$ regret bounds after $K$ episodes in a two-agent competitive game scenario.
Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization.
arXiv Detail & Related papers (2022-07-25T18:29:16Z)
- Monte Carlo Tree Search for high precision manufacturing [55.60116686945561]
Common obstacles to applying MCTS in such settings are that no efficient simulator of the process is available, or that the complex rules of the process are difficult to handle.
We make use of an expert-based simulator and adapt the MCTS default policy to deal with the manufacturing process.
arXiv Detail & Related papers (2021-07-28T14:56:17Z)
- Improve Agents without Retraining: Parallel Tree Search with Off-Policy
Correction [63.595545216327245]
We tackle two major challenges with Tree Search (TS).
We first discover and analyze a counter-intuitive phenomenon: action selection through TS and a pre-trained value function often leads to lower performance compared to the original pre-trained agent.
We introduce Batch-BFS: a GPU breadth-first search that advances all nodes in each depth of the tree simultaneously.
arXiv Detail & Related papers (2021-07-04T19:32:24Z)
- Playing Carcassonne with Monte Carlo Tree Search [0.0]
We explore the use of the vanilla Monte Carlo Tree Search (MCTS) and the Rapid Action Value Estimation (MCTS-RAVE) in the game of Carcassonne.
We compare the strengths of the MCTS-based methods with the Star2.5 algorithm, previously reported to yield competitive results in the game of Carcassonne.
arXiv Detail & Related papers (2020-09-27T22:35:53Z)
- On Effective Parallelization of Monte Carlo Tree Search [51.15940034629022]
Monte Carlo Tree Search (MCTS) is computationally expensive as it requires a substantial number of rollouts to construct the search tree.
How to design effective parallel MCTS algorithms has not been systematically studied and remains poorly understood.
We demonstrate how proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.
arXiv Detail & Related papers (2020-06-15T21:36:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.