Constrained Hierarchical Monte Carlo Belief-State Planning
- URL: http://arxiv.org/abs/2310.20054v2
- Date: Mon, 26 Feb 2024 05:17:25 GMT
- Title: Constrained Hierarchical Monte Carlo Belief-State Planning
- Authors: Arec Jamgochian, Hugo Buurmeijer, Kyle H. Wray, Anthony Corso, Mykel J. Kochenderfer
- Abstract summary: We introduce Constrained Options Belief Tree Search (COBeTS) to scale online search-based CPOMDP planning to large robotic problems.
If primitive option controllers are defined to satisfy assigned constraint budgets, COBeTS will satisfy constraints anytime.
We demonstrate COBeTS in several safety-critical, constrained partially observable robotic domains.
- Score: 35.606121916832144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimal plans in Constrained Partially Observable Markov Decision Processes
(CPOMDPs) maximize reward objectives while satisfying hard cost constraints,
generalizing safe planning under state and transition uncertainty.
Unfortunately, online CPOMDP planning is extremely difficult in large or
continuous problem domains. In many large robotic domains, hierarchical
decomposition can simplify planning by using tools for low-level control given
high-level action primitives (options). We introduce Constrained Options Belief
Tree Search (COBeTS) to leverage this hierarchy and scale online search-based
CPOMDP planning to large robotic problems. We show that if primitive option
controllers are defined to satisfy assigned constraint budgets, then COBeTS
will satisfy constraints anytime. Otherwise, COBeTS will guide the search
towards a safe sequence of option primitives, and hierarchical monitoring can
be used to achieve runtime safety. We demonstrate COBeTS in several
safety-critical, constrained partially observable robotic domains, showing that
it can plan successfully in continuous CPOMDPs while non-hierarchical baselines
cannot.
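A minimal, one-level Python sketch of the option-level constrained search idea follows. It is illustrative only: the `simulate_option` interface, the UCB scoring, and the budget bookkeeping are assumptions made for this sketch, and the actual COBeTS algorithm searches a full belief tree over sequences of options rather than a single decision point.

```python
# Illustrative sketch of option-level constrained Monte Carlo planning.
# Not the authors' COBeTS implementation; options are assumed to be hashable
# identifiers with low-level controllers simulated by `simulate_option`.
import math
from dataclasses import dataclass

@dataclass
class OptionStats:
    visits: int = 0
    value: float = 0.0   # running mean of discounted reward over simulated executions
    cost: float = 0.0    # running mean of accumulated constraint cost

def plan_one_step(belief, options, simulate_option, budget, n_iters=1000, c_ucb=1.0):
    """Pick the option with the best reward estimate whose mean cost fits the budget.

    simulate_option(belief, option) is assumed to run the option's low-level
    controller to termination in simulation and return (reward, cost, next_belief).
    """
    stats = {o: OptionStats() for o in options}
    for _ in range(n_iters):
        total = sum(s.visits for s in stats.values()) + 1
        def score(o):
            s = stats[o]
            if s.visits == 0:
                return float("inf")
            return s.value + c_ucb * math.sqrt(math.log(total) / s.visits)
        chosen = max(options, key=score)
        reward, cost, _ = simulate_option(belief, chosen)
        s = stats[chosen]
        s.visits += 1
        s.value += (reward - s.value) / s.visits
        s.cost += (cost - s.cost) / s.visits
    # Anytime flavor: prefer options whose estimated cost respects the remaining budget.
    feasible = [o for o in options if stats[o].visits > 0 and stats[o].cost <= budget]
    pool = feasible or [o for o in options if stats[o].visits > 0] or options
    return max(pool, key=lambda o: stats[o].value)
```

The point of the sketch is the final selection step: options whose estimated constraint cost exceeds the remaining budget are excluded, which mirrors how assigning per-option budgets keeps the executed plan within the hard cost constraints.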
Related papers
- Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn from high-quality traces of POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics, while requiring lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z)
- Safe POMDP Online Planning via Shielding [6.234405592444883]
Partially observable Markov decision processes (POMDPs) have been widely used in many robotic applications for sequential decision-making under uncertainty.
POMDP online planning algorithms such as Partially Observable Monte-Carlo Planning (POMCP) can solve very large POMDPs with the goal of maximizing the expected return.
But the resulting policies cannot provide safety guarantees, which are imperative for real-world safety-critical tasks.
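A hedged sketch of the shielding idea, in Python: before the online planner selects an action, a precomputed shield removes actions that could violate safety from any state in the current belief support. The `is_allowed` predicate and the `planner.search` hook are hypothetical stand-ins, not the paper's API.

```python
# Illustrative sketch of action shielding around an online POMDP planner.
def shielded_actions(belief_support, actions, is_allowed):
    """Keep only the actions the shield approves for every state in the belief support."""
    return [a for a in actions if all(is_allowed(s, a) for s in belief_support)]

def select_action(planner, belief_support, actions, is_allowed):
    safe = shielded_actions(belief_support, actions, is_allowed)
    if not safe:
        raise RuntimeError("shield blocks every action; fall back to a designated safe default")
    return planner.search(safe)  # hypothetical planner hook restricted to the safe action set
```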
arXiv Detail & Related papers (2023-09-19T00:02:05Z)
- Lifted Sequential Planning with Lazy Constraint Generation Solvers [28.405198103927955]
This paper studies the possibilities opened up by the use of Lazy Clause Generation (LCG) based approaches to Constraint Programming (CP).
We propose a novel CP model based on seminal ideas on so-called lifted causal encodings for planning as satisfiability.
We report that for planning problem instances requiring fewer plan steps our methods compare very well with the state-of-the-art in optimal sequential planning.
arXiv Detail & Related papers (2023-07-17T04:54:58Z)
- Learning Logic Specifications for Soft Policy Guidance in POMCP [71.69251176275638]
Partially Observable Monte Carlo Planning (POMCP) is an efficient solver for Partially Observable Markov Decision Processes (POMDPs).
However, POMCP suffers from sparse reward functions, i.e., rewards that are achieved only when the final goal is reached.
In this paper, we use inductive logic programming to learn logic specifications from traces of POMCP executions.
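A small, illustrative Python sketch of what "soft" policy guidance can look like: a learned logic specification contributes a prior bonus to action selection instead of hard-pruning actions. The `rule_fires` predicate and the node fields below are assumptions standing in for an ILP/ASP-learned specification, not the paper's implementation.

```python
import math

def guided_ucb(node, actions, rule_fires, c=1.0, bonus=0.5):
    """Select an action by UCB, with a prior bonus when a learned rule recommends it.

    `node` is assumed to carry per-action visit counts and value estimates
    (node.visits[a], node.value[a]) plus the current belief (node.belief).
    """
    total = sum(node.visits[a] for a in actions) + 1
    def score(a):
        n = node.visits[a]
        if n == 0:
            return float("inf")
        prior = bonus if rule_fires(node.belief, a) else 0.0
        return node.value[a] + prior + c * math.sqrt(math.log(total) / n)
    return max(actions, key=score)
```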
arXiv Detail & Related papers (2023-03-16T09:37:10Z)
- Continuous Monte Carlo Graph Search [61.11769232283621]
Continuous Monte Carlo Graph Search (CMCGS) is an extension of Monte Carlo Tree Search (MCTS) to online planning in environments with continuous state and action spaces.
CMCGS takes advantage of the insight that, during planning, sharing the same action policy between several states can yield high performance.
It can be scaled up through parallelization, and it outperforms the Cross-Entropy Method (CEM) in continuous control with learned dynamics models.
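A simplified Python sketch of that insight: a graph node holds one Gaussian action policy that all states assigned to the node sample from, refit toward the highest-return samples. The clustering of states into nodes and the elite-refit update are assumptions for illustration, not the CMCGS algorithm itself.

```python
# Illustrative sketch of a shared action policy for a cluster of states.
import random
import statistics

class SharedPolicyNode:
    def __init__(self, mean=0.0, std=1.0):
        self.mean, self.std = mean, std
        self.samples = []                 # (action, return) pairs from any member state

    def sample_action(self):
        # Every state assigned to this node draws actions from the same distribution.
        return random.gauss(self.mean, self.std)

    def update(self, action, ret, keep=20):
        # Refit the shared policy to the highest-return samples seen so far.
        self.samples.append((action, ret))
        elite = sorted(self.samples, key=lambda p: p[1], reverse=True)[:keep]
        actions = [a for a, _ in elite]
        self.mean = statistics.mean(actions)
        self.std = max(statistics.pstdev(actions), 1e-3)
```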
arXiv Detail & Related papers (2022-10-04T07:34:06Z)
- Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDPs).
The novelty is to design an embedded product MDP (EP-MDP) between the limit-deterministic generalized Büchi automaton (LDGBA) and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for model-free reinforcement learning (RL) depend only on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z) - Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning [78.65083326918351]
We consider alternatives to an implicit sequential planning assumption.
We propose Divide-and-Conquer Monte Carlo Tree Search (DC-MCTS) for approximating the optimal plan.
We show that this algorithmic flexibility over planning order leads to improved results in navigation tasks in grid-worlds.
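A minimal Python sketch of the divide-and-conquer decomposition: rather than extending a plan left-to-right, pick an intermediate subgoal and recursively plan the two halves. The `propose_subgoals` and `reachable` helpers are hypothetical, and the exhaustive recursion below replaces the paper's Monte Carlo tree search over split choices.

```python
# Illustrative sketch of divide-and-conquer goal-directed planning.
def dc_plan(start, goal, propose_subgoals, reachable, depth=0, max_depth=6):
    if reachable(start, goal) or depth >= max_depth:
        return [start, goal]                      # connect directly (or give up splitting)
    best = None
    for mid in propose_subgoals(start, goal):
        left = dc_plan(start, mid, propose_subgoals, reachable, depth + 1, max_depth)
        right = dc_plan(mid, goal, propose_subgoals, reachable, depth + 1, max_depth)
        candidate = left + right[1:]              # stitch halves, dropping duplicate midpoint
        if best is None or len(candidate) < len(best):
            best = candidate
    return best if best is not None else [start, goal]
```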
arXiv Detail & Related papers (2020-04-23T18:08:58Z)