Compactly Restrictable Metric Policy Optimization Problems
- URL: http://arxiv.org/abs/2207.05850v1
- Date: Tue, 12 Jul 2022 21:27:59 GMT
- Title: Compactly Restrictable Metric Policy Optimization Problems
- Authors: Victor D. Dorobantu, Kamyar Azizzadenesheli, and Yisong Yue
- Abstract summary: We study policy optimization problems for deterministic Markov decision processes with metric state and action spaces.
Our goal is to establish theoretical results on the well-posedness of MPOPs that can characterize practically relevant continuous control systems.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study policy optimization problems for deterministic Markov decision
processes (MDPs) with metric state and action spaces, which we refer to as
Metric Policy Optimization Problems (MPOPs). Our goal is to establish
theoretical results on the well-posedness of MPOPs that can characterize
practically relevant continuous control systems. To do so, we define a special
class of MPOPs called Compactly Restrictable MPOPs (CR-MPOPs), which are
flexible enough to capture the complex behavior of robotic systems but specific
enough to admit solutions using dynamic programming methods such as value
iteration. We show how to arrive at CR-MPOPs using forward-invariance. We
further show that our theoretical results on CR-MPOPs can be used to
characterize feedback linearizable control affine systems.
Related papers
- Recursively-Constrained Partially Observable Markov Decision Processes [13.8724466775267]
We show that C-POMDPs violate the optimal substructure property over successive decision steps.
Online re-planning in C-POMDPs is often ineffective due to the inconsistency resulting from this violation.
We introduce the Recursively-Constrained POMDP, which imposes additional history-dependent cost constraints on the C-POMDP.
arXiv Detail & Related papers (2023-10-15T00:25:07Z)
- Formal Controller Synthesis for Markov Jump Linear Systems with Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z)
- Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new algorithm for a policy gradient in TMDPs by a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
- Dynamic Regret of Online Markov Decision Processes [84.20723936192945]
We investigate online Markov Decision Processes (MDPs) with adversarially changing loss functions and known transitions.
We choose dynamic regret as the performance measure, defined as the performance difference between the learner and any sequence of feasible changing policies.
We consider three foundational models of online MDPs, including episodic loop-free Stochastic Shortest Path (SSP), episodic SSP, and infinite-horizon MDPs.
arXiv Detail & Related papers (2022-08-26T07:42:53Z)
- Robust Entropy-regularized Markov Decision Processes [23.719568076996662]
We study a robust version of the ER-MDP model, where the optimal policies are required to be robust.
We show that essential properties that hold for the non-robust ER-MDP and robust unregularized MDP models also hold in our settings.
We show how our framework and results can be integrated into different algorithmic schemes including value or (modified) policy iteration.
arXiv Detail & Related papers (2021-12-31T09:50:46Z)
- Risk-Averse Decision Making Under Uncertainty [18.467950783426947]
A large class of decision making under uncertainty problems can be described via Markov decision processes (MDPs) or partially observable MDPs (POMDPs).
In this paper, we consider the problem of designing policies for MDPs and POMDPs with objectives and constraints in terms of dynamic coherent risk measures.
arXiv Detail & Related papers (2021-09-09T07:52:35Z)
- Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach [78.05638156687343]
We propose a methodology for analyzing POMCP policies by inspecting their traces.
The proposed method explores local properties of policy behavior to identify unexpected decisions.
We evaluate our approach on Tiger, a standard benchmark for POMDPs, and a real-world problem related to mobile robot navigation.
arXiv Detail & Related papers (2020-12-23T15:09:28Z)
- Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non-convex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
- Stochastic Finite State Control of POMDPs with LTL Specifications [14.163899014007647]
Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty.
This paper considers the quantitative problem of synthesizing sub-optimal stochastic finite state controllers (sFSCs) for POMDPs.
We propose a bounded policy iteration algorithm, leading to a controlled growth in sFSC size, and an anytime algorithm, where the performance of the controller improves with successive iterations.
arXiv Detail & Related papers (2020-01-21T18:10:47Z)