SPOTTER: Extending Symbolic Planning Operators through Targeted
Reinforcement Learning
- URL: http://arxiv.org/abs/2012.13037v1
- Date: Thu, 24 Dec 2020 00:31:02 GMT
- Title: SPOTTER: Extending Symbolic Planning Operators through Targeted
Reinforcement Learning
- Authors: Vasanth Sarathy, Daniel Kasenberg, Shivam Goel, Jivko Sinapov,
Matthias Scheutz
- Abstract summary: Symbolic planning models allow decision-making agents to sequence actions in arbitrary ways to achieve a variety of goals in dynamic domains.
Reinforcement learning approaches do not require such models, and instead learn domain dynamics by exploring the environment and collecting rewards.
We propose an integrated framework named SPOTTER that uses RL to augment and support ("spot") a planning agent by discovering new operators needed to accomplish goals that are initially unreachable for the agent.
- Score: 24.663586662594703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Symbolic planning models allow decision-making agents to sequence actions in
arbitrary ways to achieve a variety of goals in dynamic domains. However, they
are typically handcrafted and tend to require precise formulations that are not
robust to human error. Reinforcement learning (RL) approaches do not require
such models, and instead learn domain dynamics by exploring the environment and
collecting rewards. However, RL approaches tend to require millions of episodes
of experience and often learn policies that are not easily transferable to
other tasks. In this paper, we address one aspect of the open problem of
integrating these approaches: how can decision-making agents resolve
discrepancies in their symbolic planning models while attempting to accomplish
goals? We propose an integrated framework named SPOTTER that uses RL to augment
and support ("spot") a planning agent by discovering new operators needed by
the agent to accomplish goals that are initially unreachable for the agent.
SPOTTER outperforms pure-RL approaches while also discovering transferable
symbolic knowledge and does not require supervision, successful plan traces or
any a priori knowledge about the missing planning operator.
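The division of labor the abstract describes, plan where operators exist and invoke targeted RL where they do not, can be illustrated with a minimal toy sketch. Everything below (the chain environment, the coverage of the planner's operators, the tabular learner) is an assumption for illustration, not SPOTTER's actual implementation:

```python
# Minimal, self-contained sketch of the idea above: a symbolic planner
# covers only part of the state space, and targeted RL learns to reach
# "plannable" states, from which a new operator could then be extracted.
# The chain environment, operator coverage, and tabular learner are all
# illustrative assumptions, not SPOTTER's actual implementation.
import random

GOAL, PLANNABLE = 4, 2  # the planner only has operators for states >= 2

def can_plan(state: int) -> bool:
    """Stand-in for symbolic planning: succeeds iff operators cover state."""
    return state >= PLANNABLE

# Tabular Q-values over the chain {0..4} with actions {-1, +1}.
Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in (-1, 1)}

def greedy(s: int) -> int:
    return max((-1, 1), key=lambda u: Q[(s, u)])

def learn_bridge(episodes=200, alpha=0.5, gamma=0.9, eps=0.2):
    """RL 'spots' the planner: reward 1 for reaching any plannable state."""
    for _ in range(episodes):
        s = 0
        while not can_plan(s):
            a = random.choice((-1, 1)) if random.random() < eps else greedy(s)
            s2 = min(max(s + a, 0), GOAL)
            r = 1.0 if can_plan(s2) else 0.0
            Q[(s, a)] += alpha * (r + gamma * Q[(s2, greedy(s2))] - Q[(s, a)])
            s = s2

learn_bridge()
# The greedy policy now bridges 0 -> plannable region; a new operator could
# be extracted with precondition "state < 2" and effect "state >= 2".
print([greedy(s) for s in range(PLANNABLE)])  # expected: [1, 1]
```

The real framework operates over symbolic planning operators and extracts preconditions and effects from the learned policy; the sketch only shows the control flow of letting RL "spot" the planner by rewarding arrival in plannable states.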
Related papers
- Diffusion-Reinforcement Learning Hierarchical Motion Planning in Adversarial Multi-agent Games [6.532258098619471]
We focus on a motion planning task for an evasive target in a partially observable multi-agent adversarial pursuit-evasion game (PEG).
These pursuit-evasion problems are relevant to various applications, such as search and rescue operations and surveillance robots.
We propose a hierarchical architecture that integrates a high-level diffusion model to plan global paths responsive to environment data.
arXiv Detail & Related papers (2024-03-16T03:53:55Z)
- KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents [54.09074527006576]
Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges.
This inadequacy primarily stems from the lack of built-in action knowledge in language agents.
We introduce KnowAgent, a novel approach designed to enhance the planning capabilities of LLMs by incorporating explicit action knowledge.
arXiv Detail & Related papers (2024-03-05T16:39:12Z)
- Foundation Policies with Hilbert Representations [54.44869979017766]
We propose an unsupervised framework to pre-train generalist policies from unlabeled offline data.
Our key insight is to learn a structured representation that preserves the temporal structure of the underlying environment.
Our experiments show that our unsupervised policies can solve goal-conditioned and general RL tasks in a zero-shot fashion.
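One simple way to make the "preserve temporal structure" insight concrete is to regress latent distances onto step counts between states sampled from the offline data. This is an illustrative stand-in, not the paper's actual training objective, and the network sizes and dimensions are made up:

```python
# Hedged sketch: learn an embedding phi whose latent distances track how
# many environment steps separate two states. The supervised regression
# loss, network sizes, and obs_dim are illustrative assumptions; the
# paper's actual objective differs.
import torch
import torch.nn as nn

obs_dim = 10  # toy observation dimensionality
phi = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, 32))

def temporal_distance_loss(s: torch.Tensor, s_future: torch.Tensor,
                           k: torch.Tensor) -> torch.Tensor:
    """s and s_future are batches of states k steps apart in the data."""
    latent_dist = torch.norm(phi(s) - phi(s_future), dim=-1)
    return ((latent_dist - k) ** 2).mean()  # match latent to step distance
```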
arXiv Detail & Related papers (2024-02-23T19:09:10Z)
- Goal-Conditioned Reinforcement Learning with Disentanglement-based Reachability Planning [14.370384505230597]
We propose a goal-conditioned RL algorithm combined with Disentanglement-based Reachability Planning (REPlan) to solve temporally extended tasks.
REPlan significantly outperforms prior state-of-the-art methods on temporally extended tasks.
arXiv Detail & Related papers (2023-07-20T13:08:14Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
Experiments show improved expected return on out-of-distribution goals, while still allowing goals with expressive structure to be specified.
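As a rough illustration of a discretizing bottleneck (the codebook size, dimensions, and fixed random codebook below are assumptions, not the paper's code), continuous goal embeddings can be snapped to their nearest entry in a codebook before reaching the policy:

```python
# Sketch of a discretizing bottleneck for goal representations: continuous
# goal embeddings are replaced by their nearest codebook entry, so the
# policy only ever conditions on a discrete set of goal codes.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))  # 16 discrete codes, 8-dim embeddings

def discretize_goal(goal_embedding: np.ndarray) -> np.ndarray:
    """Snap a continuous goal embedding to its nearest codebook entry."""
    dists = np.linalg.norm(codebook - goal_embedding, axis=1)
    return codebook[int(np.argmin(dists))]

g = rng.normal(size=8)             # e.g. the output of a goal encoder
policy_input = discretize_goal(g)  # the policy only ever sees codes
```

A factorial variant would use several small codebooks, one per factor, and concatenate the chosen codes.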
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Leveraging Approximate Symbolic Models for Reinforcement Learning via Skill Diversity [32.35693772984721]
We introduce Symbolic-Model Guided Reinforcement Learning, which formalizes the relationship between the symbolic model and the underlying MDP.
We use these models to extract high-level landmarks that decompose the task.
At the low level, we learn a set of diverse policies, one for each possible task sub-goal identified by the landmarks, as sketched below.
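A compact sketch of that decomposition follows; the `symbolic_model` interface and all names here are assumptions for illustration, not the paper's API:

```python
# Illustrative sketch of landmark-based decomposition: landmarks extracted
# from an approximate symbolic model define sub-goals, and one low-level
# policy is trained per landmark.
def train_with_landmarks(symbolic_model, env, make_policy):
    landmarks = symbolic_model.extract_landmarks()  # high-level waypoints
    policies = {}
    for lm in landmarks:
        # Reward each sub-policy for reaching states where its landmark
        # fact holds; diversity across policies comes from the learner.
        policies[lm] = make_policy(reward=lambda s, lm=lm: float(lm.holds(s)))
        policies[lm].train(env)
    return policies  # a high-level controller sequences these by landmark order
```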
arXiv Detail & Related papers (2022-02-06T23:20:30Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can greatly affect sample efficiency.
We propose automatically constructing a curriculum over the goals the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
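The selection rule can be sketched in a few lines (an illustrative rendering, not the authors' code): sample training goals in proportion to how much an ensemble of value functions disagrees about them.

```python
# Sketch of curriculum goal selection by value disagreement: goals where
# an ensemble of value estimates disagrees most lie at the frontier of
# the agent's competence, so they are sampled more often for training.
# The ensemble interface is an assumption for illustration.
import numpy as np

def sample_training_goal(candidate_goals, value_ensemble, rng):
    """value_ensemble: list of functions mapping a goal to a value estimate."""
    preds = np.array([[v(g) for v in value_ensemble] for g in candidate_goals])
    disagreement = preds.std(axis=1)  # high std = frontier goal
    probs = disagreement / (disagreement.sum() + 1e-8)  # avoid divide-by-zero
    return candidate_goals[rng.choice(len(candidate_goals), p=probs)]
```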
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals [14.315501760755609]
PlanGAN is a model-based algorithm for solving multi-goal tasks in environments with sparse rewards.
Our studies indicate that PlanGAN achieves performance comparable to model-free baselines while being around 4-8 times more sample-efficient.
arXiv Detail & Related papers (2020-06-01T12:53:09Z)
- Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous [66.6895109554163]
Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
arXiv Detail & Related papers (2020-03-15T19:49:20Z)