Expressing and Exploiting the Common Subgoal Structure of Classical
Planning Domains Using Sketches: Extended Version
- URL: http://arxiv.org/abs/2105.04250v1
- Date: Mon, 10 May 2021 10:36:18 GMT
- Authors: Dominik Drexler and Jendrik Seipp and Hector Geffner
- Abstract summary: We use a simple but powerful language for expressing problem decompositions introduced recently by Bonet and Geffner, called policy sketches.
A policy sketch R consists of a set of Boolean and numerical features and a set of sketch rules that express how the values of these features are supposed to change.
We show that many planning domains that cannot be solved by SIW are provably solvable in low polynomial time with the SIW_R algorithm.
- Score: 17.63517562327928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Width-based planning methods exploit the use of conjunctive goals for
decomposing problems into subproblems of low width. However, algorithms like
SIW fail when the goal is not serializable. In this work, we address this
limitation of SIW by using a simple but powerful language for expressing
problem decompositions introduced recently by Bonet and Geffner, called policy
sketches. A policy sketch R consists of a set of Boolean and numerical features
and a set of sketch rules that express how the values of these features are
supposed to change. Like general policies, policy sketches are domain general,
but unlike policies, the changes captured by sketch rules do not need to be
achieved in a single step. We show that many planning domains that cannot be
solved by SIW are provably solvable in low polynomial time with the SIW_R
algorithm, the version of SIW that employs user-provided policy sketches.
Policy sketches are thus shown to be a powerful language for expressing
domain-specific knowledge in a simple and compact way and a convenient
alternative to languages such as HTNs or temporal logics. Furthermore, policy
sketches make it easy to express general problem decompositions and prove key
properties like their complexity and width.
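The abstract describes a policy sketch as a set of Boolean and numerical features together with rules stating how feature values should change, and SIW_R as a search that chains the subgoals those rules induce. The following is only an illustrative sketch of that idea, not the authors' implementation: it uses plain breadth-first search where SIW_R uses IW's novelty-based search, and the "clear items from a table" domain, feature names, and helper functions are hypothetical.

```python
# Illustrative sketch (not the authors' implementation): a policy sketch as
# features plus rules "condition -> effect", and a simplified SIW_R-style loop.
# Plain BFS stands in for the IW novelty-pruned search used by the real SIW_R.
from collections import deque

class SketchRule:
    def __init__(self, condition, effect):
        self.condition = condition  # predicate over current feature values
        self.effect = effect        # predicate over (old_values, new_values)

def siw_r(initial, goal, successors, features, rules, max_steps=1000):
    """Solve by chaining subgoals induced by sketch rules."""
    state, plan = initial, []
    while not goal(state):
        vals = {name: f(state) for name, f in features.items()}
        active = [r for r in rules if r.condition(vals)]
        # Search for the closest state achieving any active rule's effect.
        frontier, seen, segment = deque([(state, [])]), {state}, None
        while frontier:
            s, path = frontier.popleft()
            new_vals = {name: f(s) for name, f in features.items()}
            if path and (goal(s) or any(r.effect(vals, new_vals) for r in active)):
                segment = (s, path)
                break
            for action, t in successors(s):
                if t not in seen:
                    seen.add(t)
                    frontier.append((t, path + [action]))
        if segment is None:
            return None  # no subgoal state reachable
        state, path = segment
        plan += path
        if len(plan) > max_steps:
            return None
    return plan

# Toy domain (hypothetical): remove n items from a table, one per action.
features = {"n": lambda s: s}                       # numerical feature: items left
rules = [SketchRule(lambda v: v["n"] > 0,           # if items remain ...
                    lambda v, w: w["n"] < v["n"])]  # ... decreasing n is progress
succ = lambda s: [("remove", s - 1)] if s > 0 else []
print(siw_r(5, lambda s: s == 0, succ, features, rules))
# prints ['remove', 'remove', 'remove', 'remove', 'remove']
```

Each pass of the outer loop solves one low-width subproblem (reach any state where an applicable rule's effect holds), which is exactly the decomposition the sketch rules are meant to express.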
Related papers
- General Policies, Subgoal Structure, and Planning Width [19.373790756767278]
It has been observed that planning domains with atomic goals can be solved by means of a simple exploration procedure, called IW, that runs in time exponential in the problem width.
Yet, there is no good explanation for why so many benchmark domains have bounded width when atomic goals are considered.
arXiv Detail & Related papers (2023-11-09T16:30:22Z) - Dimensionless Policies based on the Buckingham $\pi$ Theorem: Is This a
Good Way to Generalize Numerical Results? [66.52698983694613]
This article explores the use of the Buckingham $\pi$ theorem as a tool to encode the control policies of physical systems into a generic form of knowledge.
We show, by restating the solution to a motion control problem using dimensionless variables, that (1) the policy mapping involves a reduced number of parameters and (2) control policies generated numerically for a specific system can be transferred exactly to a subset of dimensionally similar systems by scaling the input and output variables appropriately.
It remains to be seen how practical this approach can be for generalizing policies to more complex high-dimensional problems, but the early results are promising.
arXiv Detail & Related papers (2023-07-29T00:51:26Z) - Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new policy-gradient algorithm for TMDPs, obtained by a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z) - Efficient Policy Iteration for Robust Markov Decision Processes via
Regularization [49.05403412954533]
Robust Markov decision processes (MDPs) provide a framework to model decision problems where the system dynamics are changing or only partially known.
Recent work established the equivalence between $s$-rectangular $L_p$ robust MDPs and regularized MDPs, and derived a regularized policy iteration scheme that enjoys the same level of efficiency as standard MDPs.
In this work, we focus on the policy improvement step and derive concrete forms for the greedy policy and the optimal robust Bellman operators.
arXiv Detail & Related papers (2022-05-28T04:05:20Z) - Preliminary Results on Using Abstract AND-OR Graphs for Generalized
Solving of Stochastic Shortest Path Problems [25.152899734616298]
Stochastic Shortest Path Problems (SSPs) are goal-oriented problems that arise in the real world.
A key difficulty for SSPs is that computing solutions to even moderately sized problems is intractable.
We show that our approach can be embedded in any SSP solver to compute hierarchically optimal policies.
arXiv Detail & Related papers (2022-04-08T21:30:47Z) - Learning Sketches for Decomposing Planning Problems into Subproblems of
Bounded Width: Extended Version [18.95007906887466]
Sketches have been introduced as a general language for representing the subgoal structure of instances drawn from the same domain.
We present a problem of learning sketches automatically given a planning domain, some instances of the target class of problems, and the desired bound on the sketch width.
The sketch learner and SIW_R planner yield a domain-independent planner that learns and exploits domain structure in a crisp and explicit form.
arXiv Detail & Related papers (2022-03-28T15:49:08Z) - Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval [66.37346493506737]
Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a novel cross-modal retrieval task.
We propose a novel Domain-Smoothing Network (DSN) for ZS-SBIR.
Our approach notably outperforms the state-of-the-art methods on both the Sketchy and TU-Berlin datasets.
arXiv Detail & Related papers (2021-06-22T14:58:08Z) - Better than the Best: Gradient-based Improper Reinforcement Learning for
Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z) - Optimization Issues in KL-Constrained Approximate Policy Iteration [48.24321346619156]
Many reinforcement learning algorithms can be seen as versions of approximate policy iteration (API).
While standard API often performs poorly, it has been shown that learning can be stabilized by regularizing each policy update by the KL-divergence to the previous policy.
Popular practical algorithms such as TRPO, MPO, and VMPO replace regularization by a constraint on KL-divergence of consecutive policies.
arXiv Detail & Related papers (2021-02-11T19:35:33Z) - Learning General Policies from Small Examples Without Supervision [18.718037284357834]
Generalized planning is concerned with the computation of general policies that solve multiple instances of a planning domain all at once.
It has been recently shown that these policies can be computed in two steps: first, a suitable abstraction in the form of a qualitative numerical planning problem (QNP) is learned from sample plans.
In this work, we introduce an alternative approach for computing more expressive general policies which does not require sample plans or a QNP planner.
arXiv Detail & Related papers (2021-01-03T19:44:13Z) - General Policies, Serializations, and Planning Width [22.112881443209726]
We show that bounded width is a property of planning domains that admit optimal general policies in terms of features that are explicitly or implicitly represented in the domain encoding.
The study also leads to a new simple, meaningful, and expressive language for specifying domain serializations in the form of policy sketches.
arXiv Detail & Related papers (2020-12-15T01:33:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.