Adaptive Belief Discretization for POMDP Planning
- URL: http://arxiv.org/abs/2104.07276v1
- Date: Thu, 15 Apr 2021 07:04:32 GMT
- Title: Adaptive Belief Discretization for POMDP Planning
- Authors: Divya Grover, Christos Dimitrakakis
- Abstract summary: Many POMDP solvers uniformly discretize the belief space and give the planning error in terms of the (typically unknown) covering number.
We propose an adaptive belief discretization scheme, and give its associated planning error.
We demonstrate that our algorithm is highly competitive with the state of the art in a variety of scenarios.
- Score: 7.508023795800546
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A Partially Observable Markov Decision Process (POMDP) is a widely
used model of the interaction between an agent and an environment under state
uncertainty. Since the agent does not observe the environment state, its
uncertainty is typically represented through a probabilistic belief. While the
set of possible beliefs is infinite, making exact planning intractable, the
belief space's complexity (and hence planning complexity) is characterized by
its covering number. Many POMDP solvers uniformly discretize the belief space
and give the planning error in terms of the (typically unknown) covering
number. We instead propose an adaptive belief discretization scheme, and give
its associated planning error. We furthermore characterize the covering number
with respect to the POMDP parameters. This allows us to specify the exact
memory requirements on the planner, needed to bound the value function error.
We then propose a novel, computationally efficient solver using this scheme. We
demonstrate that our algorithm is highly competitive with the state of the art
in a variety of scenarios.
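To make the abstract concrete, here is a minimal sketch of the two basic operations involved: an exact Bayes-filter belief update, and a fixed-resolution discretization that maps a belief to a nearby grid point. The array layout (T, O) and the rounding scheme are illustrative assumptions, not the paper's implementation; the paper's contribution is an adaptive choice of the resolution, which this sketch deliberately does not reproduce.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Exact Bayes filter for a discrete POMDP.
    b: belief over n states; T[a][s, s'] = P(s' | s, a);
    O[a][s', o] = P(o | s', a). (Layout is an assumption.)"""
    b_pred = T[a].T @ b            # predict: P(s' | b, a)
    b_new = O[a][:, o] * b_pred    # correct: weight by observation likelihood
    norm = b_new.sum()
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return b_new / norm

def discretize(b, resolution):
    """Round each coordinate to a multiple of 1/resolution and renormalize.
    A finer resolution shrinks the discretization error but inflates the
    number of grid points -- the covering-number trade-off in the abstract."""
    g = np.round(b * resolution) / resolution
    return g / g.sum()

# Toy two-state example: one action, two observations.
T = np.array([[[0.9, 0.1], [0.2, 0.8]]])    # T[0][s, s']
O = np.array([[[0.85, 0.15], [0.3, 0.7]]])  # O[0][s', o]
b = belief_update(np.array([0.5, 0.5]), a=0, o=1, T=T, O=O)
print(b, discretize(b, resolution=10))      # ~[0.208 0.792] -> [0.2 0.8]
```

A uniform grid like this one has memory cost scaling with the covering number of the whole belief simplex; the abstract's point is that an adaptive resolution lets one state exact memory requirements sufficient to bound the value-function error.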
Related papers
- Learning Logic Specifications for Policy Guidance in POMDPs: an
Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics, within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z) - Prospective Side Information for Latent MDPs [80.00842638151558]
We study the class of LMDPs with prospective side information, where an agent receives additional, weakly revealing information at the beginning of each episode.
We show that, surprisingly, this problem is not captured by contemporary settings and algorithms designed for partially observed environments.
We then establish that any sample-efficient algorithm must suffer at least $\Omega(K^{2/3})$ regret, as opposed to the standard $\Omega(\sqrt{K})$ lower bound, and design an algorithm with a matching upper bound.
arXiv Detail & Related papers (2023-10-11T15:37:31Z) - Robots That Ask For Help: Uncertainty Alignment for Large Language Model
Planners [85.03486419424647]
KnowNo is a framework for measuring and aligning the uncertainty of large language models.
KnowNo builds on the theory of conformal prediction to provide statistical guarantees on task completion (a generic split-conformal sketch appears after this list).
arXiv Detail & Related papers (2023-07-04T21:25:12Z) - Simplified Continuous High Dimensional Belief Space Planning with
Adaptive Probabilistic Belief-dependent Constraints [9.061408029414453]
Online decision making under uncertainty in partially observable domains, also known as Belief Space Planning, is a fundamental problem.
We present a technique to adaptively accept or discard a candidate action sequence with respect to a probabilistic belief-dependent constraint.
We apply our method to active SLAM, a highly challenging problem of high dimensional Belief Space Planning.
arXiv Detail & Related papers (2023-02-13T21:22:47Z) - Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z) - Adaptive Information Belief Space Planning [9.365993173260316]
We focus on making informed decisions efficiently, using reward functions that explicitly deal with uncertainty.
We derive bounds on the expected information-theoretic reward function and, as a consequence, on the value function.
We then propose a method to refine aggregation to achieve identical action selection with a fraction of the computational time.
arXiv Detail & Related papers (2022-01-14T21:12:00Z) - Planning in Observable POMDPs in Quasipolynomial Time [21.03037504572896]
We develop a quasipolynomial-time algorithm for planning in observable POMDPs.
We assume that well-separated distributions on states lead to well-separated distributions on observations.
We prove matching hardness for planning in observable POMDPs under the Exponential Time Hypothesis.
arXiv Detail & Related papers (2022-01-12T23:16:37Z) - Efficient Belief Space Planning in High-Dimensional State Spaces using
PIVOT: Predictive Incremental Variable Ordering Tactic [11.878820609988693]
We examine the problem of online decision making under uncertainty, which we formulate as planning in the belief space.
We call this approach PIVOT: Predictive Incremental Variable Ordering Tactic.
Applying this tactic can also improve the state inference efficiency.
arXiv Detail & Related papers (2021-12-29T07:30:47Z) - Minimax Regret Optimisation for Robust Planning in Uncertain Markov
Decision Processes [3.5289688061934963]
Minimax regret has been proposed as an objective for planning in Uncertain MDPs to find robust policies.
We introduce a Bellman equation to compute the regret for a policy.
We show that it optimises minimax regret exactly for UMDPs with independent uncertainties.
arXiv Detail & Related papers (2020-12-08T18:48:14Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, nonconvex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - Exploiting Submodular Value Functions For Scaling Up Active Perception [60.81276437097671]
In active perception tasks, an agent aims to select sensory actions that reduce uncertainty about one or more hidden variables.
Partially observable Markov decision processes (POMDPs) provide a natural model for such problems.
As the number of sensors available to the agent grows, the computational cost of POMDP planning grows exponentially.
arXiv Detail & Related papers (2020-09-21T09:11:36Z)