Related papers: Beyond Games: A Systematic Review of Neural Monte Carlo Tree Search Applications

Beyond Games: A Systematic Review of Neural Monte Carlo Tree Search Applications

URL: http://arxiv.org/abs/2303.08060v1
Date: Tue, 14 Mar 2023 16:52:31 GMT
Title: Beyond Games: A Systematic Review of Neural Monte Carlo Tree Search Applications
Authors: Marco Kemmerling, Daniel L\"utticke, Robert H. Schmitt
Abstract summary: We review 129 articles detailing the application of neural Monte Carlo tree search methods in domains other than games. Our goal is to systematically assess how such methods are structured in practice and if their success can be extended to other domains.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The advent of AlphaGo and its successors marked the beginning of a new paradigm in playing games using artificial intelligence. This was achieved by combining Monte Carlo tree search, a planning procedure, and deep learning. While the impact on the domain of games has been undeniable, it is less clear how useful similar approaches are in applications beyond games and how they need to be adapted from the original methodology. We review 129 peer-reviewed articles detailing the application of neural Monte Carlo tree search methods in domains other than games. Our goal is to systematically assess how such methods are structured in practice and if their success can be extended to other domains. We find applications in a variety of domains, many distinct ways of guiding the tree search using learned policy and value functions, and various training methods. Our review maps the current landscape of algorithms in the family of neural monte carlo tree search as they are applied to practical problems, which is a first step towards a more principled way of designing such algorithms for specific problems and their requirements.

Related papers

Subgoal-Guided Policy Heuristic Search with Learned Subgoals [15.570621284198015]
Policy tree search is a family of tree search algorithms that use a policy to guide the search.<n>This paper introduces a novel method for learning subgoal-based policies for policy tree search algorithms.
arXiv Detail & Related papers (2025-06-08T18:45:43Z)
Improve Value Estimation of Q Function and Reshape Reward with Monte Carlo Tree Search [0.4450107621124637]
Reinforcement learning has achieved remarkable success in perfect information games such as Go and Atari. Research in reinforcement learning for imperfect information games has been relatively limited due to the more complex game structures and randomness. In this paper, we focus on Uno, an imperfect information game, and aim to address these problems by reducing Q value overestimation and reshaping reward function.
arXiv Detail & Related papers (2024-10-15T14:31:54Z)
LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget. Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z)
CrossBeam: Learning to Search in Bottom-Up Program Synthesis [51.37514793318815]
We propose training a neural model to learn a hands-on search policy for bottom-up synthesis. Our approach, called CrossBeam, uses the neural model to choose how to combine previously-explored programs into new programs. We observe that CrossBeam learns to search efficiently, exploring much smaller portions of the program space compared to the state-of-the-art.
arXiv Detail & Related papers (2022-03-20T04:41:05Z)
An Approach for Combining Multimodal Fusion and Neural Architecture Search Applied to Knowledge Tracing [6.540879944736641]
We propose a sequential model based optimization approach that combines multimodal fusion and neural architecture search within one framework. We evaluate our methods on two public real datasets showing the discovered model is able to achieve superior performance.
arXiv Detail & Related papers (2021-11-08T13:43:46Z)
Meta Navigator: Search for a Good Adaptation Policy for Few-shot Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data. Research literature on few-shot learning exhibits great diversity, while different algorithms often excel at different few-shot learning scenarios. We present Meta Navigator, a framework that attempts to solve the limitation in few-shot learning by seeking a higher-level strategy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z)
Monte Carlo Tree Search: A Review of Recent Modifications and Applications [0.17205106391379024]
Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems. The method relies on intelligent tree search that balances exploration and exploitation. The method has become a state-of-the-art technique for games, however, in more complex games.
arXiv Detail & Related papers (2021-03-08T17:44:15Z)
Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
Monte-Carlo Graph Search for AlphaZero [15.567057178736402]
We introduce a new, improved search algorithm for AlphaZero that generalizes the search tree to a directed acyclic graph. In our evaluations, we use the CrazyAra engine on chess and crazyhouse as examples to show that these changes bring significant improvements to AlphaZero.
arXiv Detail & Related papers (2020-12-20T22:51:38Z)
MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
Learning to Stop While Learning to Predict [85.7136203122784]
Many algorithm-inspired deep models are restricted to a fixed-depth'' for all inputs. Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances. In this paper, we tackle this varying depth problem using a steerable architecture. We show that the learned deep model along with the stopping policy improves the performances on a diverse set of tasks.
arXiv Detail & Related papers (2020-06-09T07:22:01Z)
Single-Agent Optimization Through Policy Iteration Using Monte-Carlo Tree Search [8.22379888383833]
Combination of Monte-Carlo Tree Search (MCTS) and deep reinforcement learning is state-of-the-art in two-player perfect-information games. We describe a search algorithm that uses a variant of MCTS which we enhanced by 1) a novel action value normalization mechanism for games with potentially unbounded rewards, 2) defining a virtual loss function that enables effective search parallelization, and 3) a policy network, trained by generations of self-play, to guide the search.
arXiv Detail & Related papers (2020-05-22T18:02:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.