Safe Learning of Lifted Action Models
- URL: http://arxiv.org/abs/2107.04169v1
- Date: Fri, 9 Jul 2021 01:24:01 GMT
- Title: Safe Learning of Lifted Action Models
- Authors: Brendan Juba, Hai S. Le, Roni Stern
- Abstract summary: We propose the first safe algorithm for solving the model-free planning problem in lifted classical planning domains.
The number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model.
- Score: 46.65973550325976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Creating a domain model, even for classical, domain-independent planning, is
a notoriously hard knowledge-engineering task. A natural approach to solve this
problem is to learn a domain model from observations. However, model learning
approaches frequently do not provide safety guarantees: the learned model may
assume actions are applicable when they are not, and may incorrectly capture
actions' effects. This may result in generating plans that will fail when
executed. In some domains such failures are not acceptable, due to the cost of
failure or inability to replan online after failure. In such settings, all
learning must be done offline, based on observations collected, e.g., by
other agents or a human. The task is then to use what is learned to generate a
plan that is guaranteed to be successful. This is called the model-free
planning problem. Prior work proposed an algorithm for solving the model-free
planning problem in classical planning; however, it was limited to learning
grounded domains and thus could not scale. We generalize this prior work
and propose the first safe model-free planning algorithm for lifted domains. We
prove the correctness of our approach, and provide a statistical analysis
showing that the number of trajectories needed to solve future problems with
high probability is linear in the potential size of the domain model. We also
present experiments on twelve IPC domains showing that our approach is able to
learn the real action model in all cases with at most two trajectories.
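To make the safety guarantee concrete, below is a minimal Python sketch of the conservative learning rule in the grounded setting that the paper generalizes. The function name, `ActionModel` class, and data layout (states as frozensets of ground literals) are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ActionModel:
    # Safe preconditions: every literal that held in ALL observed pre-states.
    preconditions: set = field(default_factory=set)
    # Effects: literals whose truth value changed across a transition.
    add_effects: set = field(default_factory=set)
    del_effects: set = field(default_factory=set)

def learn_safe_models(trajectories):
    """Learn conservative grounded action models.

    Each trajectory is a list of (pre_state, action, post_state) triples,
    where states are frozensets of ground literals.
    """
    models = {}
    for trajectory in trajectories:
        for pre, action, post in trajectory:
            if action not in models:
                # First observation: assume every literal of the pre-state
                # is required. The learned preconditions thus over-approximate
                # the true ones, so whenever the learned model deems an action
                # applicable, the real action is applicable too.
                models[action] = ActionModel(preconditions=set(pre))
            model = models[action]
            # Keep only preconditions consistent with every observation.
            model.preconditions &= pre
            # In deterministic classical planning, effects are exactly the
            # observed state changes.
            model.add_effects |= post - pre
            model.del_effects |= pre - post
    return models
```

The paper's lifted algorithm applies the analogous conservative update to parameterized (lifted) literals rather than ground ones, which is what allows the number of required trajectories to be linear in the size of the lifted domain model rather than the much larger grounded one.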
Related papers
- Safe Learning of PDDL Domains with Conditional Effects -- Extended Version [27.05167679870857]
Our results show that the action models learned by Conditional-SAM can be used to solve most of the test-set problems perfectly in most of the experimented domains.
arXiv Detail & Related papers (2024-03-22T14:49:49Z)
- Position Paper: Online Modeling for Offline Planning [2.8326418377665346]
A key part of AI planning research is the representation of action models.
Despite the maturity of the field, AI planning technology is still rarely used outside the research community.
We argue that this is because the modeling process is assumed to have been completed before the planning process begins.
arXiv Detail & Related papers (2022-06-07T14:48:08Z)
- Goal-Space Planning with Subgoal Models [18.43265820052893]
This paper investigates a new approach to model-based reinforcement learning using background planning.
We show that our Goal-Space Planning (GSP) algorithm can propagate value from an abstract space in a manner that helps a variety of base learners learn significantly faster in different domains.
arXiv Detail & Related papers (2022-06-06T20:59:07Z)
- SAGE: Generating Symbolic Goals for Myopic Models in Deep Reinforcement Learning [18.37286885057802]
We propose an algorithm combining learning and planning to exploit a previously unusable class of incomplete models.
This combines the strengths of symbolic planning and neural learning approaches in a novel way that outperforms competing methods on variations of taxi world and Minecraft.
arXiv Detail & Related papers (2022-03-09T22:55:53Z)
- Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning delivers high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
arXiv Detail & Related papers (2022-02-22T02:33:54Z)
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
- Sufficiently Accurate Model Learning for Planning [119.80502738709937]
This paper introduces the constrained Sufficiently Accurate model learning approach.
It provides examples of such problems and presents a theorem on how close approximate solutions can come to exact ones.
The approximate solution quality will depend on the function parameterization, loss and constraint function smoothness, and the number of samples in model learning.
arXiv Detail & Related papers (2021-02-11T16:27:31Z)
- Domain Concretization from Examples: Addressing Missing Domain Knowledge via Robust Planning [5.051046322526032]
In this work, we formulate the problem of missing domain knowledge as Domain Concretization, an inverse problem to domain abstraction.
Based on an incomplete domain model provided by the designer and teacher traces from human users, our algorithm searches for a candidate model set under a minimalistic model assumption.
It then generates a robust plan with the maximum probability of success under the set of candidate models.
arXiv Detail & Related papers (2020-11-18T01:56:15Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- STRIPS Action Discovery [67.73368413278631]
Recent approaches have shown the success of classical planning at synthesizing action models even when all intermediate states are missing.
We propose a new algorithm that uses a classical planner to synthesize STRIPS action models in an unsupervised manner when action signatures are unknown.
arXiv Detail & Related papers (2020-01-30T17:08:39Z)