Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions
- URL: http://arxiv.org/abs/2210.12628v1
- Date: Sun, 23 Oct 2022 06:39:20 GMT
- Title: Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions
- Authors: Weirui Ye, Pieter Abbeel, Yang Gao
- Abstract summary: This paper proposes a variant of Monte-Carlo tree search (MCTS) that adaptively spends more search time on harder states and less on simpler states.
We evaluate the performance and computational cost on $9 \times 9$ Go board games and Atari games.
Experiments show that our method achieves performance comparable to the original search algorithm while requiring less than $50\%$ of the search time on average.
- Score: 89.89612827542972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most important AI research questions is how to trade off
computation versus performance, since "perfect rationality" exists in theory but
is impossible to achieve in practice. Recently, Monte-Carlo tree search (MCTS) has
attracted considerable attention due to the significant performance improvement
in various challenging domains. However, the expensive time cost during search
severely restricts its scope for applications. This paper proposes the Virtual
MCTS (V-MCTS), a variant of MCTS that spends more search time on harder states
and less search time on simpler states adaptively. We give theoretical bounds
of the proposed method and evaluate the performance and computational cost on
$9 \times 9$ Go board games and Atari games. Experiments show that our method
achieves performance comparable to the original search algorithm while
requiring less than $50\%$ of the search time on average. We believe that this
approach is a viable alternative for tasks under limited time and resources.
The code is available at https://github.com/YeWR/V-MCTS.git.
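As a rough sketch of the adaptive mechanism suggested by the abstract: periodically "spend" the remaining simulation budget virtually, selecting children with a PUCT-style score over frozen statistics (no new network evaluations), and stop real search once the virtually completed visit distribution is close to the current one. The function names, the frozen-Q score, and the L1 stopping rule below are our assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def virtual_expand(visits, q_values, priors, remaining, c_puct=1.25):
    """Spend `remaining` simulations virtually: repeatedly give one
    visit to the child with the best PUCT-style score, keeping the
    Q-values frozen because no real evaluations are run."""
    v = visits.astype(np.float64)
    for _ in range(remaining):
        score = q_values + c_puct * priors * np.sqrt(v.sum() + 1.0) / (1.0 + v)
        v[np.argmax(score)] += 1.0
    return v

def should_stop(visits, q_values, priors, used, budget, threshold=0.1):
    """Stop once the visit distribution after virtually spending the
    rest of the budget stays within `threshold` (L1 distance) of the
    current one, i.e., more search would barely change the policy."""
    v_full = virtual_expand(visits, q_values, priors, budget - used)
    p_now = visits / max(visits.sum(), 1)
    p_full = v_full / v_full.sum()
    return float(np.abs(p_now - p_full).sum()) < threshold

# Example: check whether a root with these statistics may stop early.
visits = np.array([30, 5, 5])          # current visit counts
q_values = np.array([0.6, 0.4, 0.3])   # frozen mean values per child
priors = np.array([0.5, 0.3, 0.2])     # policy-network priors
print(should_stop(visits, q_values, priors, used=40, budget=80))
```

Under this reading, easy states pass the check after a few real simulations while hard states consume the full budget, which is how an average below $50\%$ of the search time can coexist with comparable strength.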
Related papers
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and a node-level exploration budget.
Experiments on the GSM8K and TabMWP datasets demonstrate that our approach achieves significantly lower computational costs than baseline methods.
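One way to picture "dynamic node selection with a node-level exploration budget" is a best-first search whose per-node expansion budget depends on the node's value estimate. The sketch below is that reading only; the function names, the budget rule, and the assumed value range in [0, 1] are our assumptions, not the LiteSearch algorithm.

```python
import heapq

def guided_search(root, expand, value, max_nodes=64, c=4):
    """Best-first search with a per-node expansion budget.

    expand(node, k) -> iterable of up to k children (assumed interface).
    value(node)     -> estimated quality in [0, 1] (assumed interface).
    """
    tie = 0                                   # tie-breaker so heapq never compares nodes
    frontier = [(-value(root), tie, root)]    # max-heap via negated value
    explored, best, best_v = 0, root, value(root)
    while frontier and explored < max_nodes:
        neg_v, _, node = heapq.heappop(frontier)
        if -neg_v > best_v:
            best, best_v = node, -neg_v
        # Illustrative budget rule: lower-value nodes receive more expansions.
        budget = max(1, round(c * (1.0 + neg_v)))
        for child in expand(node, budget):
            tie += 1
            heapq.heappush(frontier, (-value(child), tie, child))
            explored += 1
    return best
```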
arXiv Detail & Related papers (2024-06-29T05:14:04Z)
- Elastic Monte Carlo Tree Search with State Abstraction for Strategy Game Playing [58.720142291102135]
Strategy video games challenge AI agents with vast search spaces arising from complex game elements.
State abstraction is a popular technique that reduces the state space complexity.
We propose Elastic MCTS, an algorithm that uses state abstraction to play strategy games.
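A minimal sketch of the state-abstraction ingredient, under our own assumptions: concrete states that map to the same abstraction key pool their MCTS statistics, which shrinks the effective state space. The `abstract_key` feature function is a user-supplied assumption; per the "elastic" idea in the title, the grouping would later be relaxed so nodes regain individual statistics.

```python
from collections import defaultdict

class AbstractedStats:
    """Pools visit counts and value sums across concrete states that
    share an abstraction key (a sketch, not the paper's exact design)."""

    def __init__(self, abstract_key):
        self.abstract_key = abstract_key          # state -> hashable key
        self.visits = defaultdict(int)
        self.value_sum = defaultdict(float)

    def update(self, state, value):
        """Back up a simulation result into the abstract node."""
        k = self.abstract_key(state)
        self.visits[k] += 1
        self.value_sum[k] += value

    def q(self, state):
        """Mean value of the abstract node containing `state`."""
        k = self.abstract_key(state)
        n = self.visits[k]
        return self.value_sum[k] / n if n else 0.0

# Example with a hypothetical key: ignore unit positions, keep resources.
stats = AbstractedStats(lambda s: s["resources"])
stats.update({"resources": 3, "units": (1, 2)}, value=1.0)
print(stats.q({"resources": 3, "units": (9, 9)}))  # shares the estimate: 1.0
```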
arXiv Detail & Related papers (2022-05-30T14:18:45Z)
- Fast Line Search for Multi-Task Learning [0.0]
We propose a novel idea for line search algorithms in multi-task learning.
The idea is to use the latent representation space, instead of the parameter space, to find the step size.
We compare this idea with classical backtracking and gradient methods with a constant learning rate on the MNIST, CIFAR-10, and Cityscapes tasks.
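One way to read "use the latent representation space for finding the step size" is a backtracking loop whose acceptance test bounds how far a candidate update moves the network's latent features, rather than testing a loss decrease. The sketch below is that reading; `encode`, the flat-parameter helper, and the relative-change tolerance are all our assumptions, not the paper's API.

```python
import torch

def set_flat_params(model, flat):
    """Write a flat parameter vector back into a model."""
    offset = 0
    for p in model.parameters():
        n = p.numel()
        p.data.copy_(flat[offset:offset + n].view_as(p))
        offset += n

def latent_line_search(model, params, direction, encode, x,
                       alpha0=1.0, shrink=0.5, tol=0.1, max_iter=10):
    """Backtrack on the step size until the latent representation of a
    probe batch `x` moves by less than `tol` in relative terms."""
    with torch.no_grad():
        z0 = encode(x)                     # latent features before the step
        alpha = alpha0
        for _ in range(max_iter):
            set_flat_params(model, params + alpha * direction)
            z = encode(x)
            if (z - z0).norm() / (z0.norm() + 1e-12) < tol:
                return alpha               # latent shift is acceptable
            alpha *= shrink                # otherwise shrink and retry
    return alpha
```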
arXiv Detail & Related papers (2021-10-02T21:02:29Z)
- Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search [66.34387649910046]
Monte Carlo tree search (MCTS) has achieved state-of-the-art results in many domains such as Go and Atari games.
We propose to save simulations by predicting the uncertainty of the current search status and using the result to decide whether to stop searching.
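A hedged sketch of uncertainty-gated stopping: compress the current search status into a few summary features, ask a learned predictor whether the outcome is still uncertain, and keep simulating only while it is. The feature set, the `root` interface, and `uncertainty_net` are assumptions for illustration, not the paper's model.

```python
import numpy as np

def search_features(visits, q_values):
    """Summarize the search status: visit entropy, top-2 visit gap,
    and Q-value spread (an assumed feature set)."""
    p = visits / visits.sum()
    entropy = -(p * np.log(p + 1e-12)).sum()
    top2 = np.sort(visits)[-2:]                      # two largest counts
    gap = (top2[1] - top2[0]) / visits.sum()
    return np.array([entropy, gap, q_values.max() - q_values.min()])

def dynamic_simulation(root, simulate, uncertainty_net,
                       max_sims=200, warmup=10, tau=0.2):
    """Run simulations one at a time and stop once the predicted
    uncertainty drops below `tau`."""
    for i in range(max_sims):
        simulate(root)                               # one real MCTS simulation
        if i < warmup:
            continue                                 # always do a few simulations
        feats = search_features(root.child_visits(), root.child_q())
        if uncertainty_net(feats) < tau:
            break                                    # confident enough: stop early
    return int(np.argmax(root.child_visits()))
```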
arXiv Detail & Related papers (2020-12-14T19:49:25Z)
- Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism [107.48538091418412]
We study exploration in multi-armed bandits when we have access to a divisible resource that can be allocated in varying amounts to arm pulls.
We focus in particular on the allocation of distributed computing resources, where we may obtain results faster by allocating more resources per pull.
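The trade-off can be made concrete with a toy elimination loop in which a fixed, divisible compute capacity is split evenly across the surviving arms, so each pull receives more resource (and returns faster, though with sublinear speedup) as arms are eliminated. The `pull(arm, share)` interface and the elimination radius are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def adaptive_parallel_elimination(pull, n_arms, capacity=8.0, rounds=6):
    """Successive elimination where the per-pull resource share grows
    as the set of candidate arms shrinks (a sketch)."""
    arms = list(range(n_arms))
    means = np.zeros(n_arms)
    counts = np.zeros(n_arms)
    for r in range(1, rounds + 1):
        share = capacity / len(arms)       # more resource per pull each round
        for a in arms:
            reward = pull(a, share)        # assumed: returns one reward sample
            counts[a] += 1
            means[a] += (reward - means[a]) / counts[a]
        # Drop arms whose empirical mean is clearly below the best one.
        radius = np.sqrt(np.log(max(n_arms * r, 2)) / r)
        best = max(means[a] for a in arms)
        arms = [a for a in arms if means[a] >= best - 2.0 * radius]
        if len(arms) == 1:
            break
    return arms
```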
arXiv Detail & Related papers (2020-10-31T18:19:29Z)
- On Effective Parallelization of Monte Carlo Tree Search [51.15940034629022]
Monte Carlo Tree Search (MCTS) is computationally expensive as it requires a substantial number of rollouts to construct the search tree.
How to design effective parallel MCTS algorithms has not been systematically studied and remains poorly understood.
We demonstrate how the necessary conditions we propose can be adopted to design more effective parallel MCTS algorithms.
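For context, the classic device behind parallel MCTS is the virtual loss: a thread that descends through a node temporarily books extra losing visits so concurrent threads are steered toward different subtrees, and the bookkeeping is undone at backup time. The sketch below shows that standard mechanism only; it is not the set of necessary conditions derived in this paper.

```python
import math
import threading
from dataclasses import dataclass, field

@dataclass
class Node:
    children: list = field(default_factory=list)
    visits: int = 0
    value_sum: float = 0.0
    lock: threading.Lock = field(default_factory=threading.Lock)

def select_child(node, c=1.4, vloss=3):
    """Pick the UCT-best child and immediately apply a virtual loss of
    `vloss` visits with zero reward, lowering its apparent Q so other
    threads prefer different children."""
    with node.lock:
        total = sum(ch.visits for ch in node.children)
        def uct(ch):
            q = ch.value_sum / ch.visits if ch.visits else 0.0
            return q + c * math.sqrt(math.log(total + 1) / (ch.visits + 1))
        best = max(node.children, key=uct)
        best.visits += vloss              # virtual loss: visits without reward
    return best

def backup(path, reward, vloss=3):
    """Remove the virtual losses along the selected path and apply the
    real simulation result (`path` holds the children chosen above)."""
    for node in path:
        with node.lock:
            node.visits += 1 - vloss      # keep the one real visit
            node.value_sum += reward
```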
arXiv Detail & Related papers (2020-06-15T21:36:00Z)
- Unlucky Explorer: A Complete non-Overlapping Map Exploration [0.949996206597248]
We introduce the Maze Dash puzzle as an exploration problem where the agent must find a Hamiltonian Path visiting all the cells.
We apply an optimization to the proposed Monte-Carlo Tree Search (MCTS) algorithm to obtain promising results.
Our comparison indicates that the MCTS-based approach is a promising method that can handle small and medium-sized test cases.
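For intuition about the objective, here is a plain backtracking baseline for the underlying Hamiltonian-path question (visit every cell exactly once). It is exhaustive and scales poorly, which is exactly what motivates search methods such as the optimized MCTS above; the adjacency-list framing is a simplification of Maze Dash's sliding moves.

```python
def hamiltonian_path(adj, start):
    """Backtracking search for a path visiting every node exactly once.

    adj: list of adjacency lists over nodes 0..n-1 (assumed encoding).
    Returns the path as a list of nodes, or None if none exists."""
    n = len(adj)
    path, visited = [start], {start}

    def dfs(u):
        if len(path) == n:                 # every cell visited once
            return True
        for v in adj[u]:
            if v not in visited:
                visited.add(v)
                path.append(v)
                if dfs(v):
                    return True
                visited.remove(v)          # backtrack
                path.pop()
        return False

    return path if dfs(start) else None

# Example: a 2x2 grid, nodes 0-1 / 2-3, has a Hamiltonian path from 0.
print(hamiltonian_path([[1, 2], [0, 3], [0, 3], [1, 2]], start=0))
```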
arXiv Detail & Related papers (2020-05-28T17:19:24Z)
- Single-Agent Optimization Through Policy Iteration Using Monte-Carlo Tree Search [8.22379888383833]
The combination of Monte-Carlo Tree Search (MCTS) and deep reinforcement learning is state-of-the-art in two-player perfect-information games.
We describe a search algorithm that uses a variant of MCTS enhanced by 1) a novel action-value normalization mechanism for games with potentially unbounded rewards, 2) a virtual loss function that enables effective search parallelization, and 3) a policy network, trained by generations of self-play, that guides the search.
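Of the three ingredients, the action-value normalization (item 1) is the most self-contained. A common realization, familiar from MuZero-style implementations, keeps running min/max statistics over all values seen in the tree and rescales Q-values into [0, 1] before the UCB comparison, so unbounded rewards cannot drown out the exploration term; whether this paper uses exactly this scheme is our assumption.

```python
class MinMaxStats:
    """Running min/max normalizer for action values (a sketch)."""

    def __init__(self):
        self.minimum = float("inf")
        self.maximum = float("-inf")

    def update(self, value):
        """Record a backed-up value during search."""
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def normalize(self, value):
        """Rescale into [0, 1] once a nontrivial range has been seen."""
        if self.maximum > self.minimum:
            return (value - self.minimum) / (self.maximum - self.minimum)
        return value

# Usage inside selection: compare normalize(q) + exploration bonus,
# so rewards of any magnitude stay on the bonus's scale.
stats = MinMaxStats()
for v in (0.0, 42.0, -7.0):
    stats.update(v)
print(stats.normalize(21.0))  # 0.571...
```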
arXiv Detail & Related papers (2020-05-22T18:02:36Z)