Adaptive Discretization using Voronoi Trees for Continuous-Action POMDPs
- URL: http://arxiv.org/abs/2209.05733v1
- Date: Tue, 13 Sep 2022 05:04:49 GMT
- Title: Adaptive Discretization using Voronoi Trees for Continuous-Action POMDPs
- Authors: Marcus Hoerger, Hanna Kurniawati, Dirk Kroese, Nan Ye
- Abstract summary: We propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT)
ADVT uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization.
Experiments on simulations of four types of benchmark problems indicate that ADVT outperforms and scales substantially better to high-dimensional continuous action spaces.
- Score: 7.713622698801596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving Partially Observable Markov Decision Processes (POMDPs) with
continuous actions is challenging, particularly for high-dimensional action
spaces. To alleviate this difficulty, we propose a new sampling-based online
POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT). It
uses Monte Carlo Tree Search in combination with an adaptive discretization of
the action space as well as optimistic optimization to efficiently sample
high-dimensional continuous action spaces and compute the best action to
perform. Specifically, we adaptively discretize the action space for each
sampled belief using a hierarchical partition which we call a Voronoi tree. A
Voronoi tree is a Binary Space Partitioning (BSP) that implicitly maintains the
partition of a cell as the Voronoi diagram of two points sampled from the cell.
This partitioning strategy keeps the cost of partitioning and estimating the
size of each cell low, even in high-dimensional spaces where many sampled
points are required to cover the space well. ADVT uses the estimated sizes of
the cells to form an upper-confidence bound of the action values of the cell,
and in turn uses the upper-confidence bound to guide the Monte Carlo Tree
Search expansion and further discretization of the action space. This strategy
enables ADVT to better exploit local information in the action space, leading
to an action space discretization that is more adaptive, and hence more
efficient in computing good POMDP solutions, compared to existing solvers.
Experiments on simulations of four types of benchmark problems indicate that
ADVT outperforms and scales substantially better to high-dimensional continuous
action spaces, compared to state-of-the-art continuous action POMDP solvers.
Related papers
- Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior.
We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture randomness and unpredictability of user behavior.
Inspired by fuzzy information processing theory, this paper introduces the fuzzy sets of interaction sequences to overcome the limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z) - Sample-and-Bound for Non-Convex Optimization [18.30858789210194]
We propose new sampling methods for non-dimensional objective optimization that adapts Monte Carlo benchmarks to improve efficiency.
We evaluate the proposed high-order baseline and competitive benchmarks algorithms aggressively.
arXiv Detail & Related papers (2024-01-09T20:45:47Z) - Tree-Planner: Efficient Close-loop Task Planning with Large Language Models [63.06270302774049]
Tree-Planner reframes task planning with Large Language Models into three distinct phases.
Tree-Planner achieves state-of-the-art performance while maintaining high efficiency.
arXiv Detail & Related papers (2023-10-12T17:59:50Z) - Adaptive Discretization using Voronoi Trees for Continuous POMDPs [7.713622698801596]
We propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT)
It uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization to efficiently sample high-dimensional continuous action spaces.
ADVT scales substantially better to high-dimensional continuous action spaces, compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-02-21T04:47:34Z) - Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z) - Tree ensemble kernels for Bayesian optimization with known constraints
over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
arXiv Detail & Related papers (2022-07-02T16:59:37Z) - Learning Space Partitions for Path Planning [54.475949279050596]
PlaLaM outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima.
These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 245% and in molecular design by up to 0.4 on properties on a 0-1 scale.
arXiv Detail & Related papers (2021-06-19T18:06:11Z) - Improved POMDP Tree Search Planning with Prioritized Action Branching [33.94599291823342]
This paper proposes a method called PA-POMCPOW to sample a subset of the action space that provides varying mixtures of exploitation and exploration for inclusion in a search tree.
Experiments show that PA-POMCPOW is able to outperform existing state-of-the-art solvers on problems with large discrete action spaces.
arXiv Detail & Related papers (2020-10-07T18:33:57Z) - Bayesian Optimized Monte Carlo Planning [34.8909579244631]
Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search tree.
We present a general method for efficient action sampling based on Bayesian optimization.
We implement the proposed approach in a new online tree search algorithm called Bayesian Optimized Monte Carlo Planning.
arXiv Detail & Related papers (2020-10-07T18:29:27Z) - Supervised Hyperalignment for multi-subject fMRI data alignment [81.8694682249097]
This paper proposes a Supervised Hyperalignment (SHA) method to ensure better functional alignment for MVP analysis.
Experiments on multi-subject datasets demonstrate that SHA method achieves up to 19% better performance for multi-class problems.
arXiv Detail & Related papers (2020-01-09T09:17:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.