Sample Efficient Robot Learning with Structured World Models
- URL: http://arxiv.org/abs/2210.12278v1
- Date: Fri, 21 Oct 2022 22:08:55 GMT
- Title: Sample Efficient Robot Learning with Structured World Models
- Authors: Tuluhan Akbulut, Max Merlin, Shane Parr, Benedict Quartey, Skye
Thompson
- Abstract summary: In game environments, the use of world models has been shown to improve sample efficiency while still achieving good performance.
We compare RGB image observations with a feature space leveraging built-in structure, a common approach in robot skill learning, and evaluate the impact on task performance and learning efficiency with and without the world model.
- Score: 3.1761323820497656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning has been demonstrated as a flexible and effective
approach for learning a range of continuous control tasks, such as those used
by robots to manipulate objects in their environment. But in robotics
particularly, real-world rollouts are costly, and sample efficiency can be a
major limiting factor when learning a new skill. In game environments, the use
of world models has been shown to improve sample efficiency while still
achieving good performance, especially when images or other rich observations
are provided. In this project, we explore the use of a world model in a
deformable robotic manipulation task, evaluating its effect on sample
efficiency when learning to fold a cloth in simulation. We compare the use of
RGB image observations with a feature space that leverages built-in structure
(keypoints representing the cloth configuration), a common approach in robot
skill learning, and evaluate the impact on task performance and learning
efficiency with and without the world model. Our experiments showed that the
use of keypoints increased the performance of the best model on the task by
50% and that, in general, the use of a learned or constructed reduced feature
space improved task performance and sample efficiency. The use of a state
transition predictor (MDN-RNN) in our world models did not have a notable
effect on task performance.
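For concreteness, the sketch below shows the kind of MDN-RNN state-transition predictor the abstract refers to: a recurrent network that consumes the current latent observation (e.g., an encoded keypoint or image feature vector) and the action, and outputs a Gaussian mixture over the next latent state, in the spirit of Ha & Schmidhuber's world models. This is a minimal illustrative sketch, not the paper's implementation; the class name `MDNRNN`, the helper `mdn_nll`, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class MDNRNN(nn.Module):
    """Mixture Density Network RNN: given the current latent state z_t
    (e.g. encoded cloth keypoints or image features) and action a_t,
    predict a Gaussian mixture over the next latent state z_{t+1}.
    All sizes are illustrative assumptions, not the paper's settings."""

    def __init__(self, z_dim=32, action_dim=4, hidden_dim=256, n_mix=5):
        super().__init__()
        self.z_dim, self.n_mix = z_dim, n_mix
        self.rnn = nn.LSTM(z_dim + action_dim, hidden_dim, batch_first=True)
        # Per mixture component: one mixing logit, plus a mean and a
        # log-std for each latent dimension.
        self.head = nn.Linear(hidden_dim, n_mix * (1 + 2 * z_dim))

    def forward(self, z, a, hidden=None):
        # z: (B, T, z_dim), a: (B, T, action_dim)
        out, hidden = self.rnn(torch.cat([z, a], dim=-1), hidden)
        pi, mu, log_sigma = torch.split(
            self.head(out),
            [self.n_mix, self.n_mix * self.z_dim, self.n_mix * self.z_dim],
            dim=-1,
        )
        shape = (*out.shape[:2], self.n_mix, self.z_dim)
        return pi.log_softmax(-1), mu.reshape(shape), log_sigma.reshape(shape), hidden

def mdn_nll(log_pi, mu, log_sigma, z_next):
    """Negative log-likelihood of the observed next latent under the mixture."""
    z = z_next.unsqueeze(-2)  # broadcast against the mixture dimension
    comp_ll = torch.distributions.Normal(mu, log_sigma.exp()).log_prob(z).sum(-1)
    return -torch.logsumexp(log_pi + comp_ll, dim=-1).mean()
```

Training would minimize `mdn_nll` over transition sequences collected in simulation; at evaluation time the predicted mixture can be sampled to roll out imagined future cloth configurations without further environment interaction.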
Related papers
- ConditionNET: Learning Preconditions and Effects for Execution Monitoring [9.64001633229156]
ConditionNET is an approach for learning the preconditions and effects of actions in a fully data-driven manner.
We show in experiments that ConditionNET outperforms all baselines on both anomaly detection and phase prediction tasks.
Our results highlight the potential of ConditionNET for enhancing the reliability and adaptability of robots in real-world environments.
arXiv Detail & Related papers (2025-02-03T09:00:45Z)
- Sample Efficient Robot Learning in Supervised Effect Prediction Tasks [0.0]
In this work, we develop a novel AL framework geared towards robotics regression tasks, such as action-effect prediction and, more generally, for world model learning, which we call MUSEL.
MUSEL aims to extract model uncertainty from the total uncertainty estimate given by a suitable learning engine, using learning progress and input diversity, and to use it to improve sample efficiency beyond state-of-the-art action-effect prediction methods.
The efficacy of MUSEL is demonstrated by comparing its performance to standard methods used in robot action-effect learning.
arXiv Detail & Related papers (2024-12-03T09:48:28Z)
- Keypoint Abstraction using Large Models for Object-Relative Imitation Learning [78.92043196054071]
Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics.
Keypoint-based representations have proven effective as a succinct representation for capturing essential object features.
We propose KALM, a framework that leverages large pre-trained vision-language models to automatically generate task-relevant and cross-instance consistent keypoints.
arXiv Detail & Related papers (2024-10-30T17:37:31Z)
- DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning [75.68193159293425]
In-context learning (ICL) allows transformer-based language models to learn a specific task with a few "task demonstrations" without updating their parameters.
We propose an influence function-based attribution technique, DETAIL, that addresses the specific characteristics of ICL.
We experimentally demonstrate the wide applicability of DETAIL by showing that attribution scores obtained on white-box models transfer to black-box models, improving model performance.
arXiv Detail & Related papers (2024-05-22T15:52:52Z)
- Unsupervised Learning of Effective Actions in Robotics [0.9374652839580183]
Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions.
We propose an unsupervised algorithm to discretize a continuous motion space and generate "action prototypes".
We evaluate our method on a simulated stair-climbing reinforcement learning task.
arXiv Detail & Related papers (2024-04-03T13:28:52Z)
- Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation [8.940998315746684]
We propose a model-based reinforcement learning (RL) approach for robotic arm end-tasks.
We employ Bayesian neural network models to represent, in a probabilistic way, both the belief and information encoded in the dynamic model during exploration.
Our experiments show the advantages of our Bayesian model-based RL approach, with results of similar quality to relevant alternatives.
arXiv Detail & Related papers (2024-04-02T11:44:37Z)
- TWIST: Teacher-Student World Model Distillation for Efficient Sim-to-Real Transfer [23.12048336150798]
This paper proposes TWIST (Teacher-Student World Model Distillation for Sim-to-Real Transfer) to achieve efficient sim-to-real transfer of vision-based model-based RL.
Specifically, TWIST leverages state observations as readily accessible, privileged information commonly garnered from a simulator to significantly accelerate sim-to-real transfer.
arXiv Detail & Related papers (2023-11-07T00:18:07Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- Dynamic-Resolution Model Learning for Object Pile Manipulation [33.05246884209322]
We investigate how to learn dynamic and adaptive representations at different levels of abstraction to achieve the optimal trade-off between efficiency and effectiveness.
Specifically, we construct dynamic-resolution particle representations of the environment and learn a unified dynamics model using graph neural networks (GNNs).
We show that our method achieves significantly better performance than state-of-the-art fixed-resolution baselines at the gathering, sorting, and redistribution of granular object piles.
arXiv Detail & Related papers (2023-06-29T05:51:44Z)
- Model-Based Visual Planning with Self-Supervised Functional Distances [104.83979811803466]
We present a self-supervised method for model-based visual goal reaching.
Our approach learns entirely using offline, unlabeled data.
We find that this approach substantially outperforms both model-free and model-based prior methods.
arXiv Detail & Related papers (2020-12-30T23:59:09Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.