Learning Temporally Extended Skills in Continuous Domains as Symbolic
Actions for Planning
- URL: http://arxiv.org/abs/2207.05018v3
- Date: Mon, 24 Jul 2023 13:46:46 GMT
- Title: Learning Temporally Extended Skills in Continuous Domains as Symbolic
Actions for Planning
- Authors: Jan Achterhold, Markus Krimmel, Joerg Stueckler
- Abstract summary: Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents.
We introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic abstraction of the environment's state for planning.
- Score: 2.642698101441705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Problems which require both long-horizon planning and continuous control
capabilities pose significant challenges to existing reinforcement learning
agents. In this paper we introduce a novel hierarchical reinforcement learning
agent which links temporally extended skills for continuous control with a
forward model in a symbolic discrete abstraction of the environment's state for
planning. We term our agent SEADS for Symbolic Effect-Aware Diverse Skills. We
formulate an objective and corresponding algorithm which leads to unsupervised
learning of a diverse set of skills through intrinsic motivation given a known
state abstraction. The skills are jointly learned with the symbolic forward
model which captures the effect of skill execution in the state abstraction.
After training, we can leverage the skills as symbolic actions using the
forward model for long-horizon planning and subsequently execute the plan using
the learned continuous-action control skills. The proposed algorithm learns
skills and forward models that can be used to solve complex tasks which require
both continuous control and long-horizon planning capabilities with high
success rates. It compares favorably with other flat and hierarchical
reinforcement learning baseline agents and is successfully demonstrated with a
real robot.
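The plan-then-execute structure described in the abstract lends itself to a short illustration. The sketch below is a minimal rendering of the idea, not the authors' implementation: `forward_model`, `skill_policies`, `abstraction`, and the environment interface are hypothetical stand-ins. It searches the learned symbolic forward model breadth-first for a skill sequence that reaches the goal abstraction, then executes each skill's continuous-control policy in turn.

```python
from collections import deque

def plan_skills(z_start, z_goal, forward_model, num_skills):
    """Breadth-first search in the symbolic abstraction.
    forward_model(z, k) predicts the (hashable) symbolic state reached
    by executing skill k in symbolic state z."""
    frontier = deque([(z_start, [])])
    visited = {z_start}
    while frontier:
        z, plan = frontier.popleft()
        if z == z_goal:
            return plan                      # list of skill indices
        for k in range(num_skills):
            z_next = forward_model(z, k)
            if z_next not in visited:
                visited.add(z_next)
                frontier.append((z_next, plan + [k]))
    return None                              # goal unreachable under the model

def execute_plan(env, plan, skill_policies, abstraction, max_steps=200):
    """Run each skill's continuous-control policy until it signals
    termination or exhausts its step budget."""
    obs = env.reset()
    for k in plan:
        for _ in range(max_steps):
            action, done = skill_policies[k](obs)  # continuous action
            obs = env.step(action)
            if done:
                break
    return abstraction(obs)  # symbolic state actually reached
```

A complete agent would plausibly also compare the reached abstraction against the model's prediction after each skill and replan on mismatch; the sketch omits such checks for brevity.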
Related papers
- SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution [75.2573501625811]
Diffusion models have demonstrated strong potential for robotic trajectory planning.
However, generating coherent trajectories from high-level instructions remains challenging.
We propose SkillDiffuser, an end-to-end hierarchical planning framework.
arXiv Detail & Related papers (2023-12-18T18:16:52Z)
- Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories.
We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning.
In experiments, the algorithm excels at solving complex, long-horizon decision-making problems, outperforming state-of-the-art methods.
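As a rough sketch of the vector-quantization step, the snippet below snaps continuous subgoal embeddings onto their nearest codebook entries, so that subgoal-level planning can operate over a small discrete vocabulary. The codebook size, embedding dimension, and random embeddings are arbitrary illustrations, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 16))   # 32 discrete subgoal codes, dim 16

def quantize(embedding):
    """Standard VQ assignment: index of the nearest codebook vector."""
    return int(np.argmin(np.linalg.norm(codebook - embedding, axis=1)))

# Subgoals identified in expert trajectories become discrete tokens,
# over which a generative model can plan at the subgoal level.
subgoal_embeddings = rng.normal(size=(5, 16))
tokens = [quantize(g) for g in subgoal_embeddings]
```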
arXiv Detail & Related papers (2023-01-30T15:04:39Z)
- Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping [94.89128390954572]
We propose a novel self-supervised learning phase on the pre-collected dataset to understand the structure and dynamics of the model.
We evaluate our method on three continuous control tasks, and show that our model significantly outperforms existing approaches.
arXiv Detail & Related papers (2023-01-05T15:07:10Z)
- LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation [16.05029027561921]
Task and Motion Planning approaches excel at solving and generalizing across long-horizon tasks.
They assume predefined skill sets, which limits their real-world applications.
We propose an integrated task planning and skill learning framework named LEAGUE.
We show that the learned skills can be reused to accelerate learning in new task domains and transfer to a physical robot platform.
arXiv Detail & Related papers (2022-10-23T06:57:05Z)
- STAP: Sequencing Task-Agnostic Policies [22.25415946972336]
We present Sequencing Task-Agnostic Policies (STAP) for training manipulation skills and coordinating their geometric dependencies at planning time to solve long-horizon tasks.
Our experiments indicate that this objective function approximates ground truth plan feasibility.
We demonstrate how STAP can be used for task and motion planning by estimating the geometric feasibility of skill sequences provided by a task planner.
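A hedged sketch of the sequencing idea: score a candidate skill sequence by chaining per-skill feasibility estimates (e.g., learned Q-functions) over sampled continuous parameters, keeping the best-scoring parameterization. Everything below (`sample_params`, `q_value`, `dynamics`) is a toy stand-in, not STAP's actual interface.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for learned components.
def sample_params(skill, state):
    return rng.uniform(-1.0, 1.0, size=2)        # candidate skill parameters

def q_value(skill, state, params):
    return float(np.clip(1.0 - 0.2 * np.linalg.norm(params), 0.0, 1.0))

def dynamics(skill, state, params):
    return state + 0.1 * params                  # toy transition model

def sequence_feasibility(state, skills, n_samples=128):
    """Estimate plan feasibility as the product of per-skill feasibility
    scores, optimized over sampled continuous parameters."""
    best_score, best_params = -np.inf, None
    for _ in range(n_samples):
        score, s, params = 1.0, state, []
        for skill in skills:
            p = sample_params(skill, s)
            score *= q_value(skill, s, p)        # chained feasibility
            s = dynamics(skill, s, p)
            params.append(p)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

score, params = sequence_feasibility(np.zeros(2), ["pick", "place"])
```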
arXiv Detail & Related papers (2022-10-21T21:09:37Z)
- Latent Plans for Task-Agnostic Offline Reinforcement Learning [32.938030244921755]
We propose a novel hierarchical approach to learn task-agnostic long-horizon policies from high-dimensional camera observations.
We show that our formulation enables producing previously unseen combinations of skills to reach temporally extended goals by "stitching" together latent skills.
We even learn one multi-task visuomotor policy for 25 distinct manipulation tasks in the real world which outperforms both imitation learning and offline reinforcement learning techniques.
arXiv Detail & Related papers (2022-09-19T12:27:15Z)
- Creativity of AI: Hierarchical Planning Model Learning for Facilitating Deep Reinforcement Learning [19.470693909025798]
We introduce a novel deep reinforcement learning framework with symbolic options.
Our framework features a loop training procedure that guides policy improvement.
We conduct experiments on two domains, Montezuma's Revenge and Office World.
arXiv Detail & Related papers (2021-12-18T03:45:28Z)
- Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning [120.38381203153159]
Reinforcement learning can train policies that effectively perform complex tasks.
For long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and composing lower-level skills.
We propose Value Function Spaces: a simple approach that produces such a representation by using the value functions corresponding to each lower-level skill.
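The construction admits a very compact sketch: the abstract state is simply the vector of every lower-level skill's value estimate at the current observation. The value functions below are toy stand-ins for learned skill critics.

```python
import numpy as np

# Toy per-skill value functions; in practice these are learned critics.
value_fns = [
    lambda obs: float(np.tanh(obs.sum())),
    lambda obs: float(np.exp(-np.linalg.norm(obs))),
    lambda obs: float(obs.max()),
]

def value_function_space(obs):
    """Skill-centric abstraction: stack each skill's value estimate into
    one vector summarizing which skills are currently promising."""
    return np.array([v(obs) for v in value_fns])

embedding = value_function_space(np.array([0.2, -0.5, 1.0]))  # shape (3,)
```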
arXiv Detail & Related papers (2021-11-04T22:46:16Z)
- Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks [85.56153200251713]
We introduce EMBR, a model-based RL method for learning primitive skills that are suitable for completing long-horizon visuomotor tasks.
On a Franka Emika robot arm, we find that EMBR enables the robot to complete three long-horizon visuomotor tasks at an 85% success rate.
arXiv Detail & Related papers (2021-09-21T16:48:07Z)
- Hierarchical Few-Shot Imitation with Skill Transition Models [66.81252581083199]
Few-shot Imitation with Skill Transition Models (FIST) is an algorithm that extracts skills from offline data and utilizes them to generalize to unseen tasks.
We show that FIST is capable of generalizing to new tasks and substantially outperforms prior baselines in navigation experiments.
arXiv Detail & Related papers (2021-07-19T15:56:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.