Limited depth bandit-based strategy for Monte Carlo planning in
continuous action spaces
- URL: http://arxiv.org/abs/2106.15594v1
- Date: Tue, 29 Jun 2021 17:30:01 GMT
- Title: Limited depth bandit-based strategy for Monte Carlo planning in
continuous action spaces
- Authors: Ricardo Quinteiro, Francisco S. Melo, Pedro A. Santos
- Abstract summary: We propose LD-HOO, a limited depth variant of the hierarchical optimistic optimization (HOO) algorithm.
We show that our algorithm exhibits the same cumulative regret as the original HOO while being faster and more memory efficient.
We then propose a Monte Carlo tree search algorithm based on LD-HOO and illustrate its application in several optimal control problems.
- Score: 4.1208902102156015
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the problem of optimal control using search trees. We
start by considering multi-armed bandit problems with continuous action spaces
and propose LD-HOO, a limited depth variant of the hierarchical optimistic
optimization (HOO) algorithm. We provide a regret analysis for LD-HOO and show
that, asymptotically, our algorithm exhibits the same cumulative regret as the
original HOO while being faster and more memory efficient. We then propose a
Monte Carlo tree search algorithm based on LD-HOO for optimal control and
illustrate the resulting approach's application in several such problems.
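
To make the mechanism concrete, here is a minimal sketch of one round of a depth-capped, HOO-style optimistic search over a one-dimensional action space. It is not the authors' implementation: the smoothness constants nu and rho, the binary splitting rule, the noisy reward function, and the depth cap of 8 are illustrative assumptions, and full HOO's recursive B-value backup is omitted for brevity.

```python
import math
import random

class Node:
    """A node covering the interval [lo, hi] at a given depth of the cover tree."""
    def __init__(self, lo, hi, depth):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.count = 0        # times a point in this region was played
        self.mean = 0.0       # empirical mean reward of those plays
        self.children = []    # filled on expansion (binary split)

def score(node, t, nu=1.0, rho=0.5):
    """Optimistic value: empirical mean + exploration bonus + region-diameter term."""
    if node.count == 0:
        return float("inf")
    return node.mean + math.sqrt(2 * math.log(t) / node.count) + nu * rho ** node.depth

def ld_hoo_round(root, t, reward_fn, max_depth, nu=1.0, rho=0.5):
    """One round: descend by optimistic score, split the leaf if depth allows,
    play a point in the chosen region, and back the reward up the path."""
    path, node = [root], root
    while node.children:
        node = max(node.children, key=lambda c: score(c, t, nu, rho))
        path.append(node)
    if node.depth < max_depth:        # the depth cap is the LD-HOO modification
        mid = (node.lo + node.hi) / 2
        node.children = [Node(node.lo, mid, node.depth + 1),
                         Node(mid, node.hi, node.depth + 1)]
    x = random.uniform(node.lo, node.hi)
    r = reward_fn(x)
    for v in path:                    # incremental mean update along the path
        v.count += 1
        v.mean += (r - v.mean) / v.count
    return x, r

# Example: maximize a noisy 1-D reward on [0, 1].
root = Node(0.0, 1.0, 0)
for t in range(1, 301):
    ld_hoo_round(root, t,
                 lambda x: 1 - (x - 0.3) ** 2 + random.gauss(0, 0.05),
                 max_depth=8)
```

The depth cap bounds the tree's size, and with it per-round time and memory, which is the intuition behind the speed and memory claims above.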
Related papers
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and a node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
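
The two ingredients named above, dynamic node selection and a node-level exploration budget, can be read as a budgeted best-first search. The sketch below is one speculative interpretation only; `value_fn` and `expand_fn` are assumed callbacks, not LiteSearch's actual interfaces, and the budget rule is invented for illustration.

```python
import heapq
import itertools

def guided_tree_search(root, value_fn, expand_fn, total_budget, k=4):
    """Budgeted best-first search: nodes are selected by estimated value, and
    each selected node receives an expansion budget scaled by that value."""
    tie = itertools.count()          # tie-breaker so the heap never compares nodes
    frontier = [(-value_fn(root), next(tie), root)]
    best, best_v = root, value_fn(root)
    while frontier and total_budget > 0:
        neg_v, _, node = heapq.heappop(frontier)
        budget = max(1, round(-neg_v * k))     # more children for better nodes
        for child in expand_fn(node, min(budget, total_budget)):
            total_budget -= 1
            v = value_fn(child)
            if v > best_v:
                best, best_v = child, v
            heapq.heappush(frontier, (-v, next(tie), child))
    return best
```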
arXiv Detail & Related papers (2024-06-29T05:14:04Z)
- Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits [55.938644481736446]
Indexed Minimum Empirical Divergence (IMED) is a highly effective approach to the multi-armed bandit problem.
It has been observed to empirically outperform UCB-based algorithms and Thompson Sampling.
We present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms.
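
The IMED index itself is compact: pull the arm minimizing N_a * KL(mu_hat_a, mu_hat_max) + log N_a. A minimal sketch for Bernoulli rewards follows; the LinIMED family adds feature vectors and regression estimates, which are omitted here.

```python
import math

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for safety."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def imed_choose(counts, means):
    """IMED rule: minimize N_a * KL(mu_a, mu_max) + log N_a over arms;
    unpulled arms get index -inf and are therefore tried first."""
    mu_max = max(means)
    index = [n * kl_bernoulli(m, mu_max) + math.log(n) if n else float("-inf")
             for n, m in zip(counts, means)]
    return min(range(len(counts)), key=index.__getitem__)
```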
arXiv Detail & Related papers (2024-05-24T04:11:58Z)
- An Optimal Algorithm for the Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit [65.268245109828]
We study the real-valued combinatorial pure exploration problem in the multi-armed bandit setting (R-CPE-MAB).
Existing methods for the R-CPE-MAB can be seen as special cases of the so-called transductive linear bandits.
We propose the gap-based exploration (CombGapE) algorithm, whose sample complexity matches the lower bound.
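
The blurb does not spell out CombGapE itself; as a generic illustration of the gap-based pure-exploration family it belongs to, here is a LUCB-style sampling step. The confidence radius and the best-versus-challenger rule are assumptions of this sketch, not the paper's algorithm.

```python
import math

def gap_based_step(counts, means, t, delta=0.05):
    """Sample whichever of the empirical best arm and its strongest challenger
    (highest UCB among the rest) is currently the more uncertain."""
    K = len(counts)
    def radius(n):
        return math.sqrt(math.log(4 * K * t * t / delta) / (2 * n)) if n else float("inf")
    best = max(range(K), key=means.__getitem__)
    ucb = [m + radius(n) for m, n in zip(means, counts)]
    challenger = max((a for a in range(K) if a != best), key=ucb.__getitem__)
    return best if radius(counts[best]) >= radius(counts[challenger]) else challenger
```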
arXiv Detail & Related papers (2023-06-15T15:37:31Z)
- Decision Diagram-Based Branch-and-Bound with Caching for Dominance and Suboptimality Detection [9.175779296469194]
This paper presents new ingredients to speed up the search by exploiting the structure of dynamic programming models.
The key idea is to prevent the repeated expansion of nodes corresponding to the same dynamic programming states by querying expansion thresholds cached throughout the search.
Experiments show that the pruning brought by this caching mechanism allows significantly reducing the number of nodes expanded by the algorithm.
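
In its simplest form the mechanism reduces to a map from dynamic-programming states to expansion thresholds; a minimal sketch, with `expand` as an assumed callback and a plain upper bound standing in for the paper's more precise dominance and suboptimality thresholds:

```python
def expand_with_cache(node, state_key, upper_bound, cache, expand):
    """Prune a node whose bound cannot beat the threshold already cached for
    the same dynamic-programming state; otherwise expand and update the cache."""
    threshold = cache.get(state_key)
    if threshold is not None and upper_bound <= threshold:
        return []                       # pruned: a better expansion was already seen
    cache[state_key] = upper_bound      # remember the best bound seen for this state
    return expand(node)
```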
arXiv Detail & Related papers (2022-11-22T10:18:33Z)
- Monte Carlo Tree Descent for Black-Box Optimization [10.698553177585973]
We study how to further integrate sample-based descent for faster optimization.
We design novel ways of expanding Monte Carlo search trees, with new descent methods at vertices.
We show empirically that the proposed algorithms can outperform state-of-the-art methods on many challenging benchmark problems.
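
As one concrete illustration of sample-based descent at a vertex, the sketch below takes an evolution-strategies style step: it estimates a gradient from random probes of the black-box objective and moves against it. This is an assumed stand-in for the paper's descent methods, not their exact rule; `objective` is a black-box callback to be minimized.

```python
import random

def sample_descent_step(x, objective, sigma=0.1, lr=0.5, n_probes=8):
    """One sample-based descent step: weight random perturbations by the change
    in objective they cause, average into a gradient estimate, and descend."""
    f0 = objective(x)
    g = [0.0] * len(x)
    for _ in range(n_probes):
        eps = [random.gauss(0.0, sigma) for _ in x]
        df = objective([xi + ei for xi, ei in zip(x, eps)]) - f0
        for i, ei in enumerate(eps):
            g[i] += df * ei / (n_probes * sigma ** 2)   # ES-style gradient estimate
    return [xi - lr * gi for xi, gi in zip(x, g)]       # move downhill
```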
arXiv Detail & Related papers (2022-11-01T22:45:10Z)
- CITS: Coherent Ising Tree Search Algorithm Towards Solving Combinatorial Optimization Problems [0.0]
This paper proposes a search algorithm that expands the search space from a Markov chain to a depth-limited tree based on SA.
At each iteration, the algorithm selects the best near-optimal solution within the feasible search space by exploring along the tree with a look-ahead strategy.
Our results show that, built on top of the primal heuristics SA and CIM, our high-level tree search strategy provides solutions within fewer epochs for Ising-formulated NP optimization problems.
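
A simplified reading of the look-ahead step: rather than accepting a single SA move, evaluate a depth-limited tree of candidate spin flips and commit to the first flip on the path to the best leaf. The `energy` callback, the branching width, and the depth are assumptions of this sketch; the actual CITS couples the search with SA and CIM dynamics.

```python
def lookahead_flip(spins, energy, depth=2, width=3):
    """Return the spin configuration after the single flip whose depth-limited
    lookahead tree contains the lowest-energy leaf. Spins are a list of +/-1."""
    def flip(s, i):
        return s[:i] + [-s[i]] + s[i + 1:]
    def best_leaf(s, d):
        if d == 0:
            return energy(s)
        # Branch only on the `width` locally best single-spin flips.
        order = sorted(range(len(s)), key=lambda i: energy(flip(s, i)))
        return min(best_leaf(flip(s, i), d - 1) for i in order[:width])
    i = min(range(len(spins)), key=lambda i: best_leaf(flip(spins, i), depth - 1))
    return flip(spins, i)
```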
arXiv Detail & Related papers (2022-03-09T10:07:26Z)
- Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We establish lower bounds for this setting and design two optimal algorithms that attain them.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
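
Stripped to its essentials, the idea is best-first search with a learned heuristic. The sketch below uses plain A* for clarity, whereas the paper builds on simplified memory-bounded A*; `nn_heuristic` stands in for its deep network, and the callbacks are assumptions of this sketch.

```python
import heapq
import itertools

def heuristic_guided_search(start, is_goal, successors, cost, nn_heuristic):
    """A*-style best-first search ordered by g(n) + h(n), where the heuristic
    h is supplied by a learned model instead of a hand-crafted bound."""
    tie = itertools.count()   # tie-breaker so the heap never compares states
    frontier = [(nn_heuristic(start), next(tie), 0.0, start, [start])]
    seen = set()
    while frontier:
        _, _, g, node, path = heapq.heappop(frontier)
        if is_goal(node):
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in successors(node):
            g2 = g + cost(node, nxt)
            heapq.heappush(frontier,
                           (g2 + nn_heuristic(nxt), next(tie), g2, nxt, path + [nxt]))
    return None   # search space exhausted without reaching a goal
```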
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
- Bayesian Optimized Monte Carlo Planning [34.8909579244631]
Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search tree.
We present a general method for efficient action sampling based on Bayesian optimization.
We implement the proposed approach in a new online tree search algorithm called Bayesian Optimized Monte Carlo Planning.
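
Progressive widening itself is a one-line test: a node may add a new child action only while its child count stays below k * N^alpha. A minimal sketch follows, where the plain `sample_action` draw marks the spot that, per the blurb, BOMCP fills with a Bayesian-optimization proposal; the constants k, alpha, and c are conventional defaults, not the paper's settings.

```python
import math

class PWNode:
    def __init__(self):
        self.visits = 0
        self.children = {}   # action -> [visit count, mean value]

def select_action(node, sample_action, k=2.0, alpha=0.5, c=1.0):
    """MCTS action selection with progressive widening: the branching factor
    grows only sublinearly with the node's visit count."""
    node.visits += 1
    if len(node.children) < k * node.visits ** alpha:
        a = sample_action()              # uniform here; BO-guided in BOMCP
        node.children[a] = [0, 0.0]
        return a
    def ucb(item):                       # otherwise revisit an existing child
        _a, (n, mean) = item
        return mean + c * math.sqrt(math.log(node.visits) / n) if n else float("inf")
    return max(node.children.items(), key=ucb)[0]
```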
arXiv Detail & Related papers (2020-10-07T18:29:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.