Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains
- URL: http://arxiv.org/abs/2302.12617v1
- Date: Fri, 24 Feb 2023 13:26:03 GMT
- Title: Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains
- Authors: Jingwei Zhang, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Abbas Abdolmaleki, Dushyant Rao, Nicolas Heess, Martin Riedmiller
- Abstract summary: We study the problem of learning multi-step dynamics prediction models (jumpy models) from unlabeled experience.
We propose to learn a jumpy model alongside a skill embedding space offline, from previously collected experience.
We conduct a set of experiments in the RGB-stacking environment, showing that planning with the learned skills and the associated model can enable zero-shot generalization to new tasks.
- Score: 25.245208731491346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we study the problem of learning multi-step dynamics prediction
models (jumpy models) from unlabeled experience and their utility for fast
inference of (high-level) plans in downstream tasks. In particular we propose
to learn a jumpy model alongside a skill embedding space offline, from
previously collected experience for which no labels or reward annotations are
required. We then investigate several options for harnessing those learned
components in combination with model-based planning or model-free reinforcement
learning (RL) to speed up learning on downstream tasks. We conduct a set of
experiments in the RGB-stacking environment, showing that planning with the
learned skills and the associated model can enable zero-shot generalization to
new tasks, and can further speed up training of policies via reinforcement
learning. These experiments demonstrate that jumpy models which incorporate
temporal abstraction can facilitate planning in long-horizon tasks in which
standard dynamics models fail.
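To make the setup concrete, the following is a minimal PyTorch sketch of the two learned components the abstract describes: a skill encoder that compresses a K-step action segment into a latent skill z, and a jumpy model that predicts the state K environment steps ahead in a single forward pass. All names, dimensions, and the random batch standing in for an offline dataset are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

K = 8                           # temporal abstraction: one model step = K env steps
STATE, ACT, SKILL = 32, 4, 16   # illustrative dimensions

class SkillEncoder(nn.Module):
    """Compresses a K-step action segment (plus start state) into a skill latent z."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE + K * ACT, 128), nn.ReLU(), nn.Linear(128, SKILL))
    def forward(self, s0, actions):             # actions: (B, K, ACT)
        return self.net(torch.cat([s0, actions.flatten(1)], dim=-1))

class JumpyModel(nn.Module):
    """Predicts the state K steps ahead in one forward pass, conditioned on z."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE + SKILL, 128), nn.ReLU(), nn.Linear(128, STATE))
    def forward(self, s0, z):
        return self.net(torch.cat([s0, z], dim=-1))

encoder, model = SkillEncoder(), JumpyModel()
opt = torch.optim.Adam([*encoder.parameters(), *model.parameters()], lr=3e-4)

# One gradient step on a batch of unlabeled (s_t, a_{t:t+K}, s_{t+K}) segments;
# no rewards or task labels are needed, matching the offline setting.
s0, actions, sK = torch.randn(64, STATE), torch.randn(64, K, ACT), torch.randn(64, STATE)
z = encoder(s0, actions)
loss = nn.functional.mse_loss(model(s0, z), sK)
opt.zero_grad(); loss.backward(); opt.step()
```

Because one model step covers K environment steps, a planner that searches over sequences of z reasons over horizons K times longer than a standard one-step model could.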
Related papers
- Transfer Learning with Foundational Models for Time Series Forecasting using Low-Rank Adaptations [0.0]
This study proposes LLIAM, the Llama LoRA-Integrated Autoregressive Model.
Low-Rank Adaptations are used to enhance the model's knowledge with diverse time series datasets during the fine-tuning phase.
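For reference, the Low-Rank Adaptation mechanism this summary leans on can be sketched as a frozen linear layer plus a trainable low-rank update; the rank, scaling, and layer sizes below are illustrative, and this is not the LLIAM code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus a trainable low-rank update: W x + (B A) x * scale."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # only the adapter is trained
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(512, 512))
out = layer(torch.randn(2, 512))                # (2, 512)
```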
arXiv Detail & Related papers (2024-10-15T12:14:01Z)
- Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning [113.89327264634984]
Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples.
Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially.
We propose a dual selective SSM projector that dynamically adjusts the projection parameters based on the intermediate features for dynamic adaptation.
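The contrast with static adaptation can be illustrated schematically: a projector whose effective mapping is modulated by the incoming features rather than fixed. This gated sketch is a generic stand-in, not Mamba-FSCIL's dual selective SSM projector.

```python
import torch
import torch.nn as nn

class DynamicProjector(nn.Module):
    """Projection whose effective parameters depend on the input features."""
    def __init__(self, dim: int, out_dim: int):
        super().__init__()
        self.static = nn.Linear(dim, out_dim)               # shared, fixed mapping
        self.gate = nn.Sequential(nn.Linear(dim, out_dim), nn.Sigmoid())

    def forward(self, h):
        # The gate rescales the projection per sample, so new classes can
        # reshape the mapping without overwriting the shared parameters.
        return self.static(h) * self.gate(h)

proj = DynamicProjector(256, 128)
feats = proj(torch.randn(4, 256))               # (4, 128)
```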
arXiv Detail & Related papers (2024-07-08T17:09:39Z)
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with a PTM, tuning a target model with a PTM, and PTM-based inference.
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
- Self-Supervised Reinforcement Learning that Transfers using Random Features [41.00256493388967]
We propose a self-supervised reinforcement learning method that enables the transfer of behaviors across tasks with different rewards.
Our method is self-supervised in that it can be trained on offline datasets without reward labels, but can then be quickly deployed on new tasks.
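A rough sketch of that transfer recipe, under the assumption of a successor-feature-style decomposition: pretrain value estimates for random functions of the state on reward-free data, then fit a new task by regressing its observed rewards onto those random features. All shapes are illustrative, and the offline TD training of psi is omitted.

```python
import torch
import torch.nn as nn

STATE, ACT, N_FEAT = 16, 4, 64

# Fixed random network: each output acts as a pseudo-reward ("random cumulant").
phi = nn.Sequential(nn.Linear(STATE, 64), nn.ReLU(), nn.Linear(64, N_FEAT))
for p in phi.parameters():
    p.requires_grad = False

# psi predicts, per (s, a), the discounted sum of each random feature; it is
# trained offline with TD learning on reward-free data (training loop omitted).
psi = nn.Sequential(nn.Linear(STATE + ACT, 128), nn.ReLU(), nn.Linear(128, N_FEAT))

# At test time: regress observed task rewards onto the random features...
s_batch, r_batch = torch.randn(256, STATE), torch.randn(256)
F = phi(s_batch)                                      # (256, N_FEAT)
w = torch.linalg.lstsq(F, r_batch.unsqueeze(1)).solution.squeeze(1)

# ...then Q-values for the new task are a linear readout of psi.
def q_value(s, a):
    return psi(torch.cat([s, a], dim=-1)) @ w

q = q_value(torch.randn(5, STATE), torch.randn(5, ACT))   # (5,)
```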
arXiv Detail & Related papers (2023-05-26T20:37:06Z)
- Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories.
We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning.
In experiments, the algorithm excels at solving complex, long-horizon decision-making problems, outperforming the state of the art.
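As a reference for the quantization step, a minimal vector quantizer with a straight-through gradient estimator looks roughly as follows; codebook size and embedding width are placeholders, and the subgoal identification itself is not shown.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Maps continuous subgoal embeddings to the nearest codebook entry."""
    def __init__(self, n_codes: int = 64, dim: int = 32):
        super().__init__()
        self.codebook = nn.Embedding(n_codes, dim)

    def forward(self, z):                               # z: (B, dim)
        dists = torch.cdist(z, self.codebook.weight)    # (B, n_codes)
        idx = dists.argmin(dim=-1)
        z_q = self.codebook(idx)
        # Straight-through estimator: quantized values on the forward pass,
        # identity gradient on the backward pass.
        return z + (z_q - z).detach(), idx

vq = VectorQuantizer()
z_q, codes = vq(torch.randn(8, 32))
```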
arXiv Detail & Related papers (2023-01-30T15:04:39Z)
- Skill-based Model-based Reinforcement Learning [18.758245582997656]
Model-based reinforcement learning (RL) is a sample-efficient way of learning complex behaviors.
We propose a Skill-based Model-based RL framework (SkiMo) that enables planning in the skill space.
We harness the learned skill dynamics model to accurately simulate and plan over long horizons in the skill space.
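Planning in skill space, which both SkiMo and the paper above rely on, can be sketched as a cross-entropy-method search over short skill sequences rolled out through the learned skill dynamics; the linear stand-in model, cost, and hyperparameters here are placeholders.

```python
import torch

STATE, SKILL, H = 32, 16, 5     # plan H skills ahead; each skill spans many env steps

W = 0.1 * torch.randn(SKILL, STATE)         # stand-in weights for a learned model
def skill_dynamics(s, z):                   # placeholder for a jumpy skill model
    return s + z @ W

def cost(s, goal):
    return ((s - goal) ** 2).sum(-1)

def cem_plan(s0, goal, pop=256, elites=32, iters=5):
    mu, std = torch.zeros(H, SKILL), torch.ones(H, SKILL)
    for _ in range(iters):
        cand = mu + std * torch.randn(pop, H, SKILL)    # sample skill sequences
        s, total = s0.expand(pop, STATE), torch.zeros(pop)
        for t in range(H):                  # H model steps cover H*K env steps
            s = skill_dynamics(s, cand[:, t])
            total = total + cost(s, goal)
        elite = cand[total.topk(elites, largest=False).indices]
        mu, std = elite.mean(0), elite.std(0)           # refit proposal to elites
    return mu[0]                            # execute the first skill, then replan

first_skill = cem_plan(torch.randn(STATE), torch.zeros(STATE))
```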
arXiv Detail & Related papers (2022-07-15T16:06:33Z)
- Few-shot Prompting Towards Controllable Response Generation [49.479958672988566]
We first explored the combination of prompting and reinforcement learning (RL) to steer models' generation without accessing any of the models' parameters.
We apply multi-task learning to make the model learn to generalize to new tasks better.
Experiment results show that our proposed method can successfully control several state-of-the-art (SOTA) dialogue models without accessing their parameters.
arXiv Detail & Related papers (2022-06-08T14:48:06Z)
- DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method that encourages the substitute model to learn better and faster from the target model.
We introduce a task-driven, graph-based structure information learning constraint to improve the quality of generated training data.
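The generic scheme behind data-free substitute training can be sketched as a two-player loop: a generator synthesizes queries, the substitute imitates the black-box target on them, and the generator is pushed toward inputs where the two still disagree. This omits DST's dynamic schedule and graph-based constraint; the victim here is a stand-in function.

```python
import torch
import torch.nn as nn

W_T = torch.randn(32, 10)                   # hidden weights of the stand-in victim
def target_model(x):                        # black box: only outputs are observable
    return (x @ W_T).softmax(-1)

generator = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 32))
substitute = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
s_opt = torch.optim.Adam(substitute.parameters(), lr=1e-3)

for step in range(100):
    x = generator(torch.randn(64, 16))      # synthesize training data from noise
    t = target_model(x).detach()
    # Substitute step: imitate the target on the generated queries.
    s_loss = nn.functional.kl_div(
        substitute(x.detach()).log_softmax(-1), t, reduction="batchmean")
    s_opt.zero_grad(); s_loss.backward(); s_opt.step()
    # Generator step: seek inputs where substitute and target still disagree
    # (the target branch is detached, since a black box admits no gradients).
    g_loss = -nn.functional.kl_div(
        substitute(x).log_softmax(-1), t, reduction="batchmean")
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```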
arXiv Detail & Related papers (2022-04-03T02:29:11Z)
- Learning Dynamics Models for Model Predictive Agents [28.063080817465934]
Model-Based Reinforcement Learning involves learning a dynamics model from data, and then using this model to optimise behaviour.
This paper sets out to disambiguate the role of different design choices for learning dynamics models, by comparing their performance to planning with a ground-truth model.
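In that spirit, a minimal experiment one can run: fit a one-step model to transitions from a known simulator, then measure how rollout error accumulates against the ground truth. The toy dynamics and sizes below are invented for illustration.

```python
import torch
import torch.nn as nn

STATE, ACT = 8, 2
W_A = torch.ones(ACT, STATE)

def true_dynamics(s, a):                     # stand-in "ground truth" simulator
    return s + 0.1 * torch.tanh(a @ W_A)

model = nn.Sequential(nn.Linear(STATE + ACT, 64), nn.ReLU(), nn.Linear(64, STATE))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(500):                         # fit one-step transitions
    s, a = torch.randn(128, STATE), torch.randn(128, ACT)
    loss = nn.functional.mse_loss(model(torch.cat([s, a], -1)), true_dynamics(s, a))
    opt.zero_grad(); loss.backward(); opt.step()

# Compare a multi-step rollout of the learned model against the simulator;
# compounding one-step errors are what make long-horizon planning hard.
s_model = s_true = torch.randn(1, STATE)
with torch.no_grad():
    for _ in range(20):
        a = torch.randn(1, ACT)
        s_model = model(torch.cat([s_model, a], -1))
        s_true = true_dynamics(s_true, a)
print("20-step rollout error:", (s_model - s_true).norm().item())
```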
arXiv Detail & Related papers (2021-09-29T09:50:25Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
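One simple way to realize "predict only what matters for the task", assuming goals live in the state space: condition the predictor on the goal and train it against a goal-relative target, so dimensions irrelevant to the goal contribute little signal. This is an illustrative variant, not GAP's exact latent objective.

```python
import torch
import torch.nn as nn

STATE, ACT = 32, 4        # here goals are desired states, so goal dim == STATE

net = nn.Sequential(nn.Linear(STATE + ACT + STATE, 128), nn.ReLU(),
                    nn.Linear(128, STATE))
opt = torch.optim.Adam(net.parameters(), lr=3e-4)

# Random batch standing in for (s, a, s_next, goal) tuples from real experience.
s, a = torch.randn(64, STATE), torch.randn(64, ACT)
s_next, goal = torch.randn(64, STATE), torch.randn(64, STATE)

pred = net(torch.cat([s, a, goal], dim=-1))
# Goal-relative target: the model learns how the gap to the goal evolves,
# rather than reconstructing every detail of the next state.
loss = nn.functional.mse_loss(pred, s_next - goal)
opt.zero_grad(); loss.backward(); opt.step()
```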
arXiv Detail & Related papers (2020-07-14T16:42:59Z)