Robust Finite-State Controllers for Uncertain POMDPs
- URL: http://arxiv.org/abs/2009.11459v2
- Date: Thu, 4 Mar 2021 21:00:38 GMT
- Title: Robust Finite-State Controllers for Uncertain POMDPs
- Authors: Murat Cubuktepe, Nils Jansen, Sebastian Junges, Ahmadreza Marandi,
Marnix Suilen, Ufuk Topcu
- Abstract summary: Uncertain partially observable Markov decision processes (uPOMDPs) allow the probabilistic transition and observation functions of standard POMDPs to belong to an uncertainty set.
We develop an algorithm to compute finite-memory policies for uPOMDPs.
- Score: 25.377873201375515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Uncertain partially observable Markov decision processes (uPOMDPs) allow the
probabilistic transition and observation functions of standard POMDPs to belong
to a so-called uncertainty set. Such uncertainty, referred to as epistemic
uncertainty, captures uncountable sets of probability distributions caused by,
for instance, a lack of available data. We develop an algorithm to compute
finite-memory policies for uPOMDPs that robustly satisfy specifications against
any admissible distribution. In general, computing such policies is
theoretically and practically intractable. We provide an efficient solution to
this problem in four steps. (1) We state the underlying problem as a nonconvex
optimization problem with infinitely many constraints. (2) A dedicated
dualization scheme yields a dual problem that is still nonconvex but has
finitely many constraints. (3) We linearize this dual problem and (4) solve the
resulting finite linear program to obtain locally optimal solutions to the
original problem. The resulting problem formulation is exponentially smaller
than those resulting from existing methods. We demonstrate the applicability of
our algorithm using large instances of an aircraft collision-avoidance scenario
and a novel spacecraft motion planning case study.
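A note on the pipeline: step (2) is in the spirit of the standard robust-optimization duality trick, where, for a polytopic uncertainty set $\mathcal{P} = \{p : Ap \le b\}$, LP duality replaces the semi-infinite constraint $\sup_{p \in \mathcal{P}} p^\top x \le c$ with the finitely many constraints $A^\top \lambda = x$ and $b^\top \lambda \le c$ for some $\lambda \ge 0$. The sketch below illustrates only the generic pattern behind steps (3) and (4), linearizing a nonconvex bilinear constraint around the current iterate and solving the resulting finite LP, on a hypothetical toy problem; it is not the paper's uPOMDP algorithm.

```python
# Minimal sketch of the "linearize, then solve a finite LP" pattern of
# steps (3)-(4). This is NOT the paper's uPOMDP algorithm; the toy
# problem and every name below are hypothetical illustrations.
from scipy.optimize import linprog

# Toy nonconvex problem: maximize x + y subject to the bilinear
# constraint x*y <= 0.5 and box bounds 0 <= x, y <= 1. Products of
# decision variables (e.g., policy parameters times value variables)
# are what make the dualized problem nonconvex.
x0, y0 = 0.5, 0.5  # current iterate
for _ in range(20):
    # First-order linearization of x*y around (x0, y0):
    #   y0*x + x0*y - x0*y0 <= 0.5
    # This is only an approximation; practical schemes add trust
    # regions or penalty terms to guard against infeasible steps.
    A_ub = [[y0, x0]]
    b_ub = [0.5 + x0 * y0]
    # linprog minimizes, so negate the objective to maximize x + y.
    res = linprog(c=[-1.0, -1.0], A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0.0, 1.0), (0.0, 1.0)], method="highs")
    x1, y1 = res.x
    if max(abs(x1 - x0), abs(y1 - y0)) < 1e-9:
        break  # fixed point: a candidate locally optimal solution
    x0, y0 = x1, y1

print(f"locally optimal: x={x0:.3f}, y={y0:.3f}, x*y={x0 * y0:.3f}")
```

On this toy instance the loop stabilizes after a couple of LP solves; the paper's contribution is making this kind of iteration work for dualized uPOMDPs, where the resulting formulation is exponentially smaller than those of existing methods.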
Related papers
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU problems are naturally modeled as Multistage Problems (MSPs), but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach, Two-Stage General Decision Rules (TS-GDR), to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using Deep Recurrent Neural Networks, named Two-Stage Deep Decision Rules (TS-DDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z) - Uncertainty relation and the constrained quadratic programming [3.397254718930225]
We find that the tight state-independent lower bound of the variance sum can be characterized as a quadratic programming problem with nonlinear constraints in optimization theory.
We introduce a numerical algorithm tailored for solving these quadratic programming instances, highlighting its efficiency and accuracy.
arXiv Detail & Related papers (2024-04-29T13:11:20Z) - Learning Adversarial MDPs with Stochastic Hard Constraints [37.24692425018]
We study online learning problems in constrained Markov decision processes with adversarial losses and stochastic hard constraints.
We design an algorithm that achieves sublinear regret while ensuring that the constraints are satisfied at every episode with high probability.
arXiv Detail & Related papers (2024-03-06T12:49:08Z) - Double Duality: Variational Primal-Dual Policy Optimization for
Constrained Reinforcement Learning [132.7040981721302]
We study the constrained convex Markov decision process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - Online POMDP Planning with Anytime Deterministic Guarantees [11.157761902108692]
Planning under uncertainty can be mathematically formalized using partially observable Markov decision processes (POMDPs).
Finding an optimal plan for POMDPs can be computationally expensive and is feasible only for small tasks.
We derive a deterministic relationship between a simplified solution that is easier to obtain and the theoretically optimal one.
arXiv Detail & Related papers (2023-10-03T04:40:38Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast the inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses the apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the $\mathcal{O}(1/\epsilon)$ complexity lower bound for optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Distributionally Robust Optimization with Markovian Data [8.126833795693699]
We study a stochastic program where the probability distribution of the uncertain problem parameters is unknown.
We propose a data-driven distributionally robust optimization model to estimate the problem's objective function and optimal solution.
arXiv Detail & Related papers (2021-06-12T10:59:02Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, nonconvex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - The limits of min-max optimization algorithms: convergence to spurious
non-critical sets [82.74514886461257]
Min-max optimization algorithms encounter far greater problems because of the existence of periodic cycles and similar phenomena.
We show that these algorithms may converge to spurious attractors that contain no critical points of the problem.
We illustrate such challenges in simple almost bilinear zero-sum games.
arXiv Detail & Related papers (2020-06-16T10:49:27Z) - Consistent Second-Order Conic Integer Programming for Learning Bayesian
Networks [2.7473982588529653]
We study the problem of learning the sparse DAG structure of a BN from continuous observational data.
The optimal solution to this mathematical program is known to have desirable statistical properties under certain conditions.
We propose a concrete early stopping criterion to terminate the branch-and-bound process in order to obtain a near-optimal solution.
arXiv Detail & Related papers (2020-05-29T00:13:15Z)