Related papers: Amplifying Exploration in Monte-Carlo Tree Search by Focusing on the Unknown

Amplifying Exploration in Monte-Carlo Tree Search by Focusing on the Unknown

URL: http://arxiv.org/abs/2402.08511v1
Date: Tue, 13 Feb 2024 15:05:54 GMT
Title: Amplifying Exploration in Monte-Carlo Tree Search by Focusing on the Unknown
Authors: Cedric Derstroff, Jannis Brugger, Jannis Bl\"uml, Mira Mezini, Stefan Kramer, Kristian Kersting
Abstract summary: Monte-Carlo tree search (MCTS) strategically allocates computational resources to focus on promising segments of the search tree. Our proposed methodology, denoted as AmEx-MCTS, solves this problem by introducing a novel MCTS formulation. Our empirical evaluation demonstrates the superior performance of AmEx-MCTS, surpassing classical MCTS and related approaches by a substantial margin.
Score: 19.664506834858244
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Monte-Carlo tree search (MCTS) is an effective anytime algorithm with a vast amount of applications. It strategically allocates computational resources to focus on promising segments of the search tree, making it a very attractive search algorithm in large search spaces. However, it often expends its limited resources on reevaluating previously explored regions when they remain the most promising path. Our proposed methodology, denoted as AmEx-MCTS, solves this problem by introducing a novel MCTS formulation. Central to AmEx-MCTS is the decoupling of value updates, visit count updates, and the selected path during the tree search, thereby enabling the exclusion of already explored subtrees or leaves. This segregation preserves the utility of visit counts for both exploration-exploitation balancing and quality metrics within MCTS. The resultant augmentation facilitates in a considerably broader search using identical computational resources, preserving the essential characteristics of MCTS. The expanded coverage not only yields more precise estimations but also proves instrumental in larger and more complex problems. Our empirical evaluation demonstrates the superior performance of AmEx-MCTS, surpassing classical MCTS and related approaches by a substantial margin.

Related papers

Exploring Explainable Multi-player MCTS-minimax Hybrids in Board Game Using Process Mining [3.5042452314350716]
This paper presents our ongoing investigation into potential explanations for the decision-making and behavior of Monte-Carlo Tree Search (MCTS) A weakness of MCTS is that it constructs a highly selective tree and, as a result, can miss crucial moves and fall into tactical traps. We integrate shallow minimax search into the rollout phase of multi-player MCTS and use process mining technique to explain agents' strategies in 3v3 checkers.
arXiv Detail & Related papers (2025-03-30T05:48:53Z)
Holistically Guided Monte Carlo Tree Search for Intricate Information Seeking [118.3983437282541]
We introduce an LLM-based search assistant that adopts a new information seeking paradigm with holistically guided Monte Carlo tree search (HG-MCTS) We reformulate the task as a progressive information collection process with a knowledge memory and unite an adaptive checklist with multi-perspective reward modeling in MCTS. Our multi-perspective reward modeling offers both exploration and retrieval rewards, along with progress feedback that tracks completed and remaining sub-goals.
arXiv Detail & Related papers (2025-02-07T08:36:39Z)
Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search [95.06503095273395]
o1-like reasoning approach is challenging, and researchers have been making various attempts to advance this open area of research. We present a preliminary exploration into enhancing the reasoning abilities of LLMs through reward-guided tree search algorithms.
arXiv Detail & Related papers (2024-11-18T16:15:17Z)
RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation [65.5353313491402]
We introduce RethinkMCTS, which employs the Monte Carlo Tree Search (MCTS) algorithm to conduct thought-level searches before generating code. We construct verbal feedback from fine-turbo code execution feedback to refine erroneous thoughts during the search. We demonstrate that RethinkMCTS outperforms previous search-based and feedback-based code generation baselines.
arXiv Detail & Related papers (2024-09-15T02:07:28Z)
Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization [18.25487451605638]
We derive a tree search algorithm based on policy optimization with state occupancy measure regularization, which we call it Volume-MCTS We show that count-based exploration and sampling-based motion planning can be derived as approximate solutions to this state occupancy measure regularized objective. We test our method on several robot navigation problems, and find that Volume-MCTS outperforms AlphaZero and displays significantly better long-horizon exploration properties.
arXiv Detail & Related papers (2024-07-07T22:58:52Z)
LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget. Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z)
Autonomous Tree-search Ability of Large Language Models [58.68735916408101]
Large Language Models have excelled in remarkable reasoning capabilities with advanced prompting techniques. Recent works propose to utilize external programs to define search logic, such that LLMs can perform passive tree search to solve more challenging reasoning tasks. We propose a new concept called autonomous tree-search ability of LLM, which can automatically generate a response containing search trajectories for the correct answer.
arXiv Detail & Related papers (2023-10-14T14:14:38Z)
Monte-Carlo Tree Search for Multi-Agent Pathfinding: Preliminary Results [60.4817465598352]
We introduce an original variant of Monte-Carlo Tree Search (MCTS) tailored to multi-agent pathfinding. Specifically, we use individual paths to assist the agents with the the goal-reaching behavior. We also use a dedicated decomposition technique to reduce the branching factor of the tree search procedure.
arXiv Detail & Related papers (2023-07-25T12:33:53Z)
Epistemic Monte Carlo Tree Search [5.624791703748109]
We introduce Epistemic MCTS (EMCTS) to account for the uncertainty in search and harness the search for deep exploration. In the challenging sparse-reward task of writing code in the Assembly language SUBLEQ, AZ paired with our method achieves significantly higher sample efficiency over baseline AZ.
arXiv Detail & Related papers (2022-10-21T09:59:15Z)
Continuous Monte Carlo Graph Search [61.11769232283621]
Continuous Monte Carlo Graph Search ( CMCGS) is an extension of Monte Carlo Tree Search (MCTS) to online planning. CMCGS takes advantage of the insight that, during planning, sharing the same action policy between several states can yield high performance. It can be scaled up through parallelization, and it outperforms the Cross-Entropy Method (CEM) in continuous control with learned dynamics models.
arXiv Detail & Related papers (2022-10-04T07:34:06Z)
Unlucky Explorer: A Complete non-Overlapping Map Exploration [0.949996206597248]
We introduce the Maze Dash puzzle as an exploration problem where the agent must find a Hamiltonian Path visiting all the cells. An optimization has been applied to the proposed Monte-Carlo Tree Search (MCTS) algorithm to obtain a promising result. Our comparison indicates that the MCTS-based approach is an up-and-coming method that could cope with the test cases with small and medium sizes.
arXiv Detail & Related papers (2020-05-28T17:19:24Z)
The Second Type of Uncertainty in Monte Carlo Tree Search [2.564530030795554]
Monte Carlo Tree Search (MCTS) efficiently balances exploration and exploitation in tree search based on count-derived uncertainty. We first show how, due to the lack of this second uncertainty type, MCTS may completely fail in well-known sparse exploration problems. We then introduce a new algorithm, which estimates the size of the subtree below an action, and leverages this information in the UCB formula to better direct exploration.
arXiv Detail & Related papers (2020-05-19T09:10:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.