Beyond Games: A Systematic Review of Neural Monte Carlo Tree Search
Applications
- URL: http://arxiv.org/abs/2303.08060v1
- Date: Tue, 14 Mar 2023 16:52:31 GMT
- Title: Beyond Games: A Systematic Review of Neural Monte Carlo Tree Search
Applications
- Authors: Marco Kemmerling, Daniel L\"utticke, Robert H. Schmitt
- Abstract summary: We review 129 articles detailing the application of neural Monte Carlo tree search methods in domains other than games.
Our goal is to systematically assess how such methods are structured in practice and if their success can be extended to other domains.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of AlphaGo and its successors marked the beginning of a new
paradigm in playing games using artificial intelligence. This was achieved by
combining Monte Carlo tree search, a planning procedure, and deep learning.
While the impact on the domain of games has been undeniable, it is less clear
how useful similar approaches are in applications beyond games and how they
need to be adapted from the original methodology. We review 129 peer-reviewed
articles detailing the application of neural Monte Carlo tree search methods in
domains other than games. Our goal is to systematically assess how such methods
are structured in practice and if their success can be extended to other
domains. We find applications in a variety of domains, many distinct ways of
guiding the tree search using learned policy and value functions, and various
training methods. Our review maps the current landscape of algorithms in the
family of neural monte carlo tree search as they are applied to practical
problems, which is a first step towards a more principled way of designing such
algorithms for specific problems and their requirements.
Related papers
- Improve Value Estimation of Q Function and Reshape Reward with Monte Carlo Tree Search [0.4450107621124637]
Reinforcement learning has achieved remarkable success in perfect information games such as Go and Atari.
Research in reinforcement learning for imperfect information games has been relatively limited due to the more complex game structures and randomness.
In this paper, we focus on Uno, an imperfect information game, and aim to address these problems by reducing Q value overestimation and reshaping reward function.
arXiv Detail & Related papers (2024-10-15T14:31:54Z) - LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z) - CrossBeam: Learning to Search in Bottom-Up Program Synthesis [51.37514793318815]
We propose training a neural model to learn a hands-on search policy for bottom-up synthesis.
Our approach, called CrossBeam, uses the neural model to choose how to combine previously-explored programs into new programs.
We observe that CrossBeam learns to search efficiently, exploring much smaller portions of the program space compared to the state-of-the-art.
arXiv Detail & Related papers (2022-03-20T04:41:05Z) - An Approach for Combining Multimodal Fusion and Neural Architecture
Search Applied to Knowledge Tracing [6.540879944736641]
We propose a sequential model based optimization approach that combines multimodal fusion and neural architecture search within one framework.
We evaluate our methods on two public real datasets showing the discovered model is able to achieve superior performance.
arXiv Detail & Related papers (2021-11-08T13:43:46Z) - Meta Navigator: Search for a Good Adaptation Policy for Few-shot
Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data.
Research literature on few-shot learning exhibits great diversity, while different algorithms often excel at different few-shot learning scenarios.
We present Meta Navigator, a framework that attempts to solve the limitation in few-shot learning by seeking a higher-level strategy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z) - Monte Carlo Tree Search: A Review of Recent Modifications and
Applications [0.17205106391379024]
Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems.
The method relies on intelligent tree search that balances exploration and exploitation.
The method has become a state-of-the-art technique for games, however, in more complex games.
arXiv Detail & Related papers (2021-03-08T17:44:15Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - Monte-Carlo Graph Search for AlphaZero [15.567057178736402]
We introduce a new, improved search algorithm for AlphaZero that generalizes the search tree to a directed acyclic graph.
In our evaluations, we use the CrazyAra engine on chess and crazyhouse as examples to show that these changes bring significant improvements to AlphaZero.
arXiv Detail & Related papers (2020-12-20T22:51:38Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z) - Learning to Stop While Learning to Predict [85.7136203122784]
Many algorithm-inspired deep models are restricted to a fixed-depth'' for all inputs.
Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances.
In this paper, we tackle this varying depth problem using a steerable architecture.
We show that the learned deep model along with the stopping policy improves the performances on a diverse set of tasks.
arXiv Detail & Related papers (2020-06-09T07:22:01Z) - Single-Agent Optimization Through Policy Iteration Using Monte-Carlo
Tree Search [8.22379888383833]
Combination of Monte-Carlo Tree Search (MCTS) and deep reinforcement learning is state-of-the-art in two-player perfect-information games.
We describe a search algorithm that uses a variant of MCTS which we enhanced by 1) a novel action value normalization mechanism for games with potentially unbounded rewards, 2) defining a virtual loss function that enables effective search parallelization, and 3) a policy network, trained by generations of self-play, to guide the search.
arXiv Detail & Related papers (2020-05-22T18:02:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.