Adaptive Discretization using Voronoi Trees for Continuous POMDPs
- URL: http://arxiv.org/abs/2302.10439v1
- Date: Tue, 21 Feb 2023 04:47:34 GMT
- Title: Adaptive Discretization using Voronoi Trees for Continuous POMDPs
- Authors: Marcus Hoerger, Hanna Kurniawati, Dirk Kroese, Nan Ye
- Abstract summary: We propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT)
It uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization to efficiently sample high-dimensional continuous action spaces.
ADVT scales substantially better to high-dimensional continuous action spaces, compared to state-of-the-art methods.
- Score: 7.713622698801596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving continuous Partially Observable Markov Decision Processes (POMDPs) is
challenging, particularly for high-dimensional continuous action spaces. To
alleviate this difficulty, we propose a new sampling-based online POMDP solver,
called Adaptive Discretization using Voronoi Trees (ADVT). It uses Monte Carlo
Tree Search in combination with an adaptive discretization of the action space
as well as optimistic optimization to efficiently sample high-dimensional
continuous action spaces and compute the best action to perform. Specifically,
we adaptively discretize the action space for each sampled belief using a
hierarchical partition called Voronoi tree, which is a Binary Space
Partitioning that implicitly maintains the partition of a cell as the Voronoi
diagram of two points sampled from the cell. ADVT uses the estimated diameters
of the cells to form an upper-confidence bound on the action value function
within the cell, guiding the Monte Carlo Tree Search expansion and further
discretization of the action space. This enables ADVT to better exploit local
information with respect to the action value function, allowing faster
identification of the most promising regions in the action space, compared to
existing solvers. Voronoi trees keep the cost of partitioning and estimating
the diameter of each cell low, even in high-dimensional spaces where many
sampled points are required to cover the space well. ADVT additionally handles
continuous observation spaces, by adopting an observation progressive widening
strategy, along with a weighted particle representation of beliefs.
Experimental results indicate that ADVT scales substantially better to
high-dimensional continuous action spaces, compared to state-of-the-art
methods.
Related papers
- Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior.
We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture randomness and unpredictability of user behavior.
Inspired by fuzzy information processing theory, this paper introduces the fuzzy sets of interaction sequences to overcome the limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z) - Sample-and-Bound for Non-Convex Optimization [18.30858789210194]
We propose new sampling methods for non-dimensional objective optimization that adapts Monte Carlo benchmarks to improve efficiency.
We evaluate the proposed high-order baseline and competitive benchmarks algorithms aggressively.
arXiv Detail & Related papers (2024-01-09T20:45:47Z) - Optimal Scaling for Locally Balanced Proposals in Discrete Spaces [65.14092237705476]
We show that efficiency of Metropolis-Hastings (M-H) algorithms in discrete spaces can be characterized by an acceptance rate that is independent of the target distribution.
Knowledge of the optimal acceptance rate allows one to automatically tune the neighborhood size of a proposal distribution in a discrete space, directly analogous to step-size control in continuous spaces.
arXiv Detail & Related papers (2022-09-16T22:09:53Z) - Adaptive Discretization using Voronoi Trees for Continuous-Action POMDPs [7.713622698801596]
We propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT)
ADVT uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization.
Experiments on simulations of four types of benchmark problems indicate that ADVT outperforms and scales substantially better to high-dimensional continuous action spaces.
arXiv Detail & Related papers (2022-09-13T05:04:49Z) - Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z) - Computationally Efficient PAC RL in POMDPs with Latent Determinism and
Conditional Embeddings [97.12538243736705]
We study reinforcement learning with function approximation for large-scale Partially Observable Decision Processes (POMDPs)
Our algorithm provably scales to large-scale POMDPs.
arXiv Detail & Related papers (2022-06-24T05:13:35Z) - Learning Space Partitions for Path Planning [54.475949279050596]
PlaLaM outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima.
These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 245% and in molecular design by up to 0.4 on properties on a 0-1 scale.
arXiv Detail & Related papers (2021-06-19T18:06:11Z) - Dream and Search to Control: Latent Space Planning for Continuous
Control [24.991127785736364]
We show that it is possible to demonstrate the types of bootstrapping benefits as previously shown for discrete spaces.
In particular, the approach achieves improved sample efficiency and performance on a majority of challenging continuous-control benchmarks.
arXiv Detail & Related papers (2020-10-19T20:10:51Z) - Improved POMDP Tree Search Planning with Prioritized Action Branching [33.94599291823342]
This paper proposes a method called PA-POMCPOW to sample a subset of the action space that provides varying mixtures of exploitation and exploration for inclusion in a search tree.
Experiments show that PA-POMCPOW is able to outperform existing state-of-the-art solvers on problems with large discrete action spaces.
arXiv Detail & Related papers (2020-10-07T18:33:57Z) - Bayesian Optimized Monte Carlo Planning [34.8909579244631]
Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search tree.
We present a general method for efficient action sampling based on Bayesian optimization.
We implement the proposed approach in a new online tree search algorithm called Bayesian Optimized Monte Carlo Planning.
arXiv Detail & Related papers (2020-10-07T18:29:27Z) - Joint and Progressive Subspace Analysis (JPSA) with Spatial-Spectral
Manifold Alignment for Semi-Supervised Hyperspectral Dimensionality Reduction [48.73525876467408]
We propose a novel technique for hyperspectral subspace analysis.
The technique is called joint and progressive subspace analysis (JPSA)
Experiments are conducted to demonstrate the superiority and effectiveness of the proposed JPSA on two widely-used hyperspectral datasets.
arXiv Detail & Related papers (2020-09-21T16:29:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.