Skill-based Model-based Reinforcement Learning
- URL: http://arxiv.org/abs/2207.07560v1
- Date: Fri, 15 Jul 2022 16:06:33 GMT
- Title: Skill-based Model-based Reinforcement Learning
- Authors: Lucy Xiaoyang Shi and Joseph J. Lim and Youngwoon Lee
- Abstract summary: Model-based reinforcement learning (RL) is a sample-efficient way of learning complex behaviors.
We propose a Skill-based Model-based RL framework (SkiMo) that enables planning in the skill space.
We harness the learned skill dynamics model to accurately simulate and plan over long horizons in the skill space.
- Score: 18.758245582997656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model-based reinforcement learning (RL) is a sample-efficient way of learning
complex behaviors by leveraging a learned single-step dynamics model to plan
actions in imagination. However, planning every action for long-horizon tasks
is not practical, akin to a human planning out every muscle movement. Instead,
humans efficiently plan with high-level skills to solve complex tasks. From
this intuition, we propose a Skill-based Model-based RL framework (SkiMo) that
enables planning in the skill space using a skill dynamics model, which
directly predicts the skill outcomes, rather than predicting all small details
in the intermediate states, step by step. For accurate and efficient long-term
planning, we jointly learn the skill dynamics model and a skill repertoire from
prior experience. We then harness the learned skill dynamics model to
accurately simulate and plan over long horizons in the skill space, which
enables efficient downstream learning of long-horizon, sparse reward tasks.
Experimental results in navigation and manipulation domains show that SkiMo
extends the temporal horizon of model-based approaches and improves the sample
efficiency for both model-based RL and skill-based RL. Code and videos are
available at https://clvrai.com/skimo.
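To make the planning-in-skill-space idea concrete, here is a minimal, hypothetical sketch rather than the authors' implementation: it assumes a learned skill dynamics model that jumps directly from the current state and a skill embedding to the state after the skill finishes, plus a learned reward estimate, and runs a simple cross-entropy-method search over skill sequences. All names (`skill_dynamics`, `reward_model`, `plan_skills`) and the toy linear models are placeholders; in the actual method, the chosen skill would additionally be decoded into low-level actions by a learned skill policy.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, SKILL_DIM, HORIZON = 16, 8, 5  # plan 5 skills, not every low-level action

# Stand-ins for learned components (assumptions, not the paper's actual models):
# the skill dynamics model jumps directly to the state after a whole skill executes.
W_dyn = rng.normal(scale=0.1, size=(STATE_DIM + SKILL_DIM, STATE_DIM))
w_rew = rng.normal(size=STATE_DIM)

def skill_dynamics(state, skill):
    """Predict the state after executing `skill` from `state` (one jump per skill)."""
    return np.tanh(np.concatenate([state, skill]) @ W_dyn)

def reward_model(state):
    """Toy stand-in for a learned reward / value estimate."""
    return float(state @ w_rew)

def evaluate(state, skill_seq):
    """Roll the skill dynamics model over a candidate sequence of skills."""
    total = 0.0
    for skill in skill_seq:
        state = skill_dynamics(state, skill)
        total += reward_model(state)
    return total

def plan_skills(state, iters=5, pop=64, elites=8):
    """Cross-entropy-method search over skill sequences in the skill latent space."""
    mean = np.zeros((HORIZON, SKILL_DIM))
    std = np.ones((HORIZON, SKILL_DIM))
    for _ in range(iters):
        cands = mean + std * rng.normal(size=(pop, HORIZON, SKILL_DIM))
        scores = np.array([evaluate(state, c) for c in cands])
        best = cands[np.argsort(scores)[-elites:]]
        mean, std = best.mean(axis=0), best.std(axis=0) + 1e-3
    return mean[0]  # execute the first skill, then replan (MPC-style)

first_skill = plan_skills(rng.normal(size=STATE_DIM))
print("chosen skill embedding:", np.round(first_skill, 3))
```

Because each model call covers an entire skill, a five-step search here spans a much longer environment horizon than five single-step predictions would, which is the point of planning in skill space.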
Related papers
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - Simplified Temporal Consistency Reinforcement Learning [19.814047499837084]
We show that a simple representation learning approach relying on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL (a minimal sketch of such a consistency objective appears after this list).
Our approach outperforms model-free methods by a large margin and matches model-based methods' sample efficiency while training 2.4 times faster.
arXiv Detail & Related papers (2023-06-15T19:37:43Z) - Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z) - Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains [25.245208731491346]
We study the problem of learning multi-step dynamics prediction models (jumpy models) from unlabeled experience.
We propose to learn a jumpy model alongside a skill embedding space offline, from previously collected experience.
We conduct a set of experiments in the RGB-stacking environment, showing that planning with the learned skills and the associated model can enable zero-shot generalization to new tasks.
arXiv Detail & Related papers (2023-02-24T13:26:03Z) - ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters [123.88692739360457]
General-purpose motor skills enable humans to perform complex tasks.
These skills also provide powerful priors for guiding their behaviors when learning new tasks.
We present a framework for learning versatile and reusable skill embeddings for physically simulated characters.
arXiv Detail & Related papers (2022-05-04T06:13:28Z) - Temporal Difference Learning for Model Predictive Control [29.217382374051347]
Data-driven model predictive control has two key advantages over model-free methods: improved sample efficiency through model learning, and better performance as the computational budget for planning increases.
TD-MPC learns a task-oriented latent dynamics model and a terminal value function jointly via temporal difference learning, and achieves superior sample efficiency and performance over prior work on both state- and image-based continuous control tasks.
arXiv Detail & Related papers (2022-03-09T18:58:28Z) - Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics [20.148408520475655]
We introduce Learning to Execute (L2E), which leverages information contained in approximate plans to learn universal policies that are conditioned on plans.
In our robotic manipulation experiments, L2E exhibits increased performance when compared to pure RL, pure planning, or baseline methods combining learning and planning.
arXiv Detail & Related papers (2021-11-15T16:58:50Z) - Predictive Control Using Learned State Space Models via Rolling Horizon Evolution [2.1016374925364616]
In this paper, we combine evolutionary algorithmic planning techniques with state space models learned via deep learning and variational inference.
We demonstrate the approach with an agent that reliably performs online planning in a set of visual navigation tasks.
arXiv Detail & Related papers (2021-06-25T23:23:42Z) - Model-Based Reinforcement Learning via Latent-Space Collocation [110.04005442935828]
We argue that it is easier to solve long-horizon tasks by planning sequences of states rather than just actions.
We adapt the idea of collocation, which has shown good results on long-horizon tasks in the optimal control literature, to the image-based setting by utilizing learned latent state space models.
arXiv Detail & Related papers (2021-06-24T17:59:18Z) - Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning [72.18725551199842]
We propose a novel model-based reinforcement learning algorithm, called BrIdging Reality and Dream (BIRD).
It maximizes the mutual information between imaginary and real trajectories so that the policy improvement learned from imaginary trajectories can be easily generalized to real trajectories.
We demonstrate that our approach improves sample efficiency of model-based planning, and achieves state-of-the-art performance on challenging visual control benchmarks.
arXiv Detail & Related papers (2020-10-23T03:22:01Z) - Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task-relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
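Several entries above (Simplified Temporal Consistency Reinforcement Learning, Temporal Difference Learning for Model Predictive Control) train a latent dynamics model with a latent-space consistency objective rather than by reconstructing observations. The snippet below is a minimal sketch of that kind of objective under simplifying assumptions, not the code of either paper: a single shared encoder serves as its own stop-gradient target (real implementations typically use a momentum/EMA target encoder), and the networks, dimensions, and batch are toy placeholders.

```python
import torch
import torch.nn as nn

# Encoder maps observations to latents; the latent dynamics model predicts the
# next latent from the current latent and action; the loss asks the prediction
# to match the encoding of the actually observed next state.
OBS_DIM, ACT_DIM, LATENT_DIM = 24, 6, 32

encoder = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ELU(), nn.Linear(64, LATENT_DIM))
dynamics = nn.Sequential(nn.Linear(LATENT_DIM + ACT_DIM, 64), nn.ELU(), nn.Linear(64, LATENT_DIM))
optim = torch.optim.Adam(list(encoder.parameters()) + list(dynamics.parameters()), lr=3e-4)

def consistency_loss(obs, actions, next_obs):
    """Multi-step latent consistency: unroll the dynamics model in latent space
    and regress each predicted latent onto the (detached) encoding of the true
    next observation."""
    z = encoder(obs[:, 0])
    loss = 0.0
    horizon = actions.shape[1]
    for t in range(horizon):
        z = dynamics(torch.cat([z, actions[:, t]], dim=-1))
        with torch.no_grad():
            target = encoder(next_obs[:, t])  # stop-gradient target latent
        loss = loss + ((z - target) ** 2).mean()
    return loss / horizon

# Dummy batch: 8 trajectory snippets of length 4.
obs = torch.randn(8, 5, OBS_DIM)
actions = torch.randn(8, 4, ACT_DIM)
loss = consistency_loss(obs, actions, obs[:, 1:])
optim.zero_grad()
loss.backward()
optim.step()
print("consistency loss:", loss.item())
```

The unrolled, observation-free prediction is what makes such latent models cheap enough to plan with, which is also why they pair naturally with the skill-space planning sketched earlier.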