Risk Aware Belief-dependent Constrained POMDP Planning
- URL: http://arxiv.org/abs/2209.02679v1
- Date: Tue, 6 Sep 2022 17:48:13 GMT
- Title: Risk Aware Belief-dependent Constrained POMDP Planning
- Authors: Andrey Zhitnikov, Vadim Indelman
- Abstract summary: Risk awareness is fundamental to an online operating agent.
Existing constrained POMDP algorithms are typically designed for discrete state and observation spaces.
This paper presents a novel formulation for risk-averse belief-dependent constrained POMDP.
- Score: 9.061408029414453
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Risk awareness is fundamental to an online operating agent. However, it has received less attention in the challenging continuous domain under partial observability. Existing constrained POMDP algorithms are typically designed for discrete state and observation spaces. In addition, current solvers for constrained formulations do not support general belief-dependent constraints. Crucially, in the POMDP setting, risk awareness in the context of a constraint has been addressed only in a limited way. This paper presents a novel formulation for risk-averse belief-dependent constrained POMDPs. Our probabilistic constraint is general and belief-dependent, as is the reward function. The proposed universal framework applies to a continuous domain with nonparametric beliefs represented by particles or to parametric beliefs. We show that our formulation accounts for risk better than previous approaches.
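To make the idea of a probabilistic belief-dependent constraint more concrete, the following is a minimal sketch, not the authors' formulation: the belief-dependent operator `prob_in_safe_region`, the inner threshold `delta`, the confidence level `epsilon`, and the axis-aligned safe region are all illustrative assumptions. The sketch estimates, over beliefs sampled for a candidate action sequence, how often the operator exceeds the threshold, and compares that frequency against the required confidence level.

```python
import numpy as np

def prob_in_safe_region(particles, weights, safe_lower, safe_upper):
    """Belief-dependent operator: probability mass of a weighted particle
    belief that lies inside an axis-aligned safe region (illustrative)."""
    inside = np.all((particles >= safe_lower) & (particles <= safe_upper), axis=1)
    return float(np.sum(weights[inside]))

def chance_constraint_satisfied(sampled_beliefs, delta=0.9, epsilon=0.1,
                                safe_lower=(-1.0, -1.0), safe_upper=(1.0, 1.0)):
    """Check an outer constraint of the form P(phi(b) >= delta) >= 1 - epsilon,
    where the outer probability is estimated over beliefs sampled from
    possible future observation sequences (one particle set per belief)."""
    lo, hi = np.asarray(safe_lower), np.asarray(safe_upper)
    phi = np.array([prob_in_safe_region(p, w, lo, hi) for p, w in sampled_beliefs])
    return np.mean(phi >= delta) >= 1.0 - epsilon

# Toy usage with three hypothetical posterior beliefs over a 2D state.
rng = np.random.default_rng(0)
beliefs = [(rng.normal(0.0, 0.5, size=(100, 2)), np.full(100, 0.01)) for _ in range(3)]
print(chance_constraint_satisfied(beliefs))
```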
Related papers
- Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications [73.58451824894568]
The widely adopted CMDP model constrains the risks in expectation, which makes room for dangerous behaviors in long-tail states.
In safety-critical domains, such behaviors could lead to disastrous outcomes.
We propose Objective Suppression, a novel method that adaptively suppresses the task reward maximizing objectives according to a safety critic.
arXiv Detail & Related papers (2024-02-23T23:22:06Z)
- Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks [142.67349734180445]
Existing algorithms that provide risk-awareness to deep neural networks are complex and ad-hoc.
Here we present capsa, a framework for extending models with risk-awareness.
arXiv Detail & Related papers (2023-08-01T02:07:47Z)
- Simplified Continuous High Dimensional Belief Space Planning with Adaptive Probabilistic Belief-dependent Constraints [9.061408029414453]
Under uncertainty in partially observable domains, also known as Belief Space Planning, online decision making is a fundamental problem.
We present a technique to adaptively accept or discard a candidate action sequence with respect to a probabilistic belief-dependent constraint.
We apply our method to active SLAM, a highly challenging problem of high dimensional Belief Space Planning.
arXiv Detail & Related papers (2023-02-13T21:22:47Z)
- Information-Theoretic Safe Exploration with Gaussian Processes [89.31922008981735]
We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an unknown (safety) constraint.
Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case.
We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
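As a rough illustration of how a GP posterior can screen for safe yet informative parameters (not the criterion actually proposed in that paper), the sketch below uses scikit-learn; the zero safety threshold, the 0.95 confidence level, and the toy data are assumptions, and predictive standard deviation is used only as a crude proxy for informativeness.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical observations of a safety function f; f(x) >= 0 means "safe".
X_obs = np.array([[0.1], [0.4], [0.7]])
y_obs = np.array([0.8, 0.3, -0.2])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-3)
gp.fit(X_obs, y_obs)

# Candidate parameters that could be evaluated next.
X_cand = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
mean, std = gp.predict(X_cand, return_std=True)
std = np.maximum(std, 1e-9)  # guard against zero predictive std

# Posterior probability that each candidate satisfies the safety threshold.
p_safe = 1.0 - norm.cdf(0.0, loc=mean, scale=std)

# Among candidates that are safe with high probability, pick the one with
# the largest predictive std as a crude proxy for informativeness.
safe = p_safe >= 0.95
if safe.any():
    idx = int(np.argmax(np.where(safe, std, -np.inf)))
    print("next parameter:", X_cand[idx, 0], "p_safe:", round(float(p_safe[idx]), 3))
else:
    print("no candidate is safe at the required confidence level")
```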
arXiv Detail & Related papers (2022-12-09T15:23:58Z)
- Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch gradient descent.
We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z)
- Non-Linear Spectral Dimensionality Reduction Under Uncertainty [107.01839211235583]
We propose a new dimensionality reduction framework, called NGEU, which leverages uncertainty information and directly extends several traditional approaches.
We show that the proposed NGEU formulation exhibits a global closed-form solution, and we analyze, based on the Rademacher complexity, how the underlying uncertainties theoretically affect the generalization ability of the framework.
arXiv Detail & Related papers (2022-02-09T19:01:33Z)
- Risk-Averse Stochastic Shortest Path Planning [25.987787625028204]
We show that optimal, stationary, Markovian policies exist and can be found via a special Bellman's equation.
A rover navigation MDP is used to illustrate the proposed methodology with conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.
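For reference, conditional value-at-risk can be estimated from cost samples as the mean of the worst (1 - alpha) tail of the cost distribution. The sketch below is a generic sample-based estimate with placeholder data, not the rover experiment from that paper.

```python
import numpy as np

def cvar(cost_samples, alpha=0.95):
    """Sample-based conditional value-at-risk: the mean cost over the
    worst (1 - alpha) tail of the empirical cost distribution."""
    costs = np.sort(np.asarray(cost_samples, dtype=float))
    var = np.quantile(costs, alpha)      # value-at-risk at level alpha
    return costs[costs >= var].mean()    # average of the tail beyond VaR

# Placeholder cost samples, e.g. accumulated costs of rolled-out plans.
rng = np.random.default_rng(1)
samples = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)
print("mean cost :", round(float(samples.mean()), 3))
print("CVaR(0.95):", round(float(cvar(samples)), 3))
```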
arXiv Detail & Related papers (2021-03-26T20:49:14Z)
- Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem [17.909696462645023]
We propose to quantify the worst-case risk under distributional ambiguity using the kernel mean embedding.
We numerically test the proposed method in characterizing the worst-case constraint violation probability in the context of a constrained control system.
arXiv Detail & Related papers (2020-03-31T23:51:27Z)
- Cautious Reinforcement Learning via Distributional Risk in the Dual Domain [45.17200683056563]
We study the estimation of risk-sensitive policies in reinforcement learning problems defined by a Markov decision process (MDP) whose state and action spaces are countably finite.
We propose a new definition of risk, which we call caution, as a penalty function added to the dual objective of the linear programming (LP) formulation of reinforcement learning.
arXiv Detail & Related papers (2020-02-27T23:18:04Z)
- Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes [5.081241420920605]
Markov decision processes (MDPs) are the de facto framework for sequential decision making in the presence of stochastic uncertainty.
We consider MDPs with discounted-sum payoff with failure states which represent catastrophic outcomes.
Our main contribution is an efficient risk-constrained planning algorithm that combines UCT-like search with a predictor learned through interaction with the MDP.
arXiv Detail & Related papers (2020-02-27T13:36:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.