Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for
Autonomous Real-World Reinforcement Learning
- URL: http://arxiv.org/abs/2310.15145v1
- Date: Mon, 23 Oct 2023 17:50:08 GMT
- Title: Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for
Autonomous Real-World Reinforcement Learning
- Authors: Jingyun Yang, Max Sobol Mark, Brandon Vu, Archit Sharma, Jeannette
Bohg, Chelsea Finn
- Abstract summary: We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insights are to utilize calibrated offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy, and to leverage pre-trained vision-language models to provide reward signals autonomously.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
- Score: 58.3994826169858
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The pre-train and fine-tune paradigm in machine learning has had dramatic
success in a wide range of domains because the use of existing data or
pre-trained models on the internet enables quick and easy learning of new
tasks. We aim to enable this paradigm in robotic reinforcement learning,
allowing a robot to learn a new task with little human effort by leveraging
data and models from the Internet. However, reinforcement learning often
requires significant human effort in the form of manual reward specification or
environment resets, even if the policy is pre-trained. We introduce RoboFuME, a
reset-free fine-tuning system that pre-trains a multi-task manipulation policy
from diverse datasets of prior experiences and self-improves online to learn a
target task with minimal human intervention. Our insights are to utilize
calibrated offline reinforcement learning techniques to ensure efficient online
fine-tuning of a pre-trained policy in the presence of distribution shifts, and
to leverage pre-trained vision-language models (VLMs) to build a robust reward
classifier that autonomously provides reward signals during the online
fine-tuning process. In a diverse set of five real robot manipulation tasks, we
show that our method can incorporate data from an existing robot dataset
collected at a different institution and improve on a target task within as
little as 3 hours of autonomous real-world experience. We also demonstrate in
simulation experiments that our method outperforms prior works that use
different RL algorithms or different approaches for predicting rewards. Project
website: https://robofume.github.io
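To make the reward mechanism concrete, below is a minimal sketch of a VLM-based success classifier used as a sparse reward signal: a pre-trained vision-language model scores the current camera image against a success prompt and a failure prompt. The CLIP checkpoint, prompts, and threshold are illustrative assumptions, not the reward model actually used in the paper.

```python
# Minimal sketch of a VLM-based reward classifier (stand-in: zero-shot CLIP).
# The model choice, prompts, and threshold are illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

# One "success" and one "failure" description of the target task (hypothetical).
prompts = ["the cloth is folded on the table", "the cloth is lying flat, unfolded"]

def vlm_reward(image: Image.Image) -> float:
    """Return a sparse 0/1 reward from the current camera image."""
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image    # shape (1, 2)
    p_success = logits.softmax(dim=-1)[0, 0].item()  # P(success prompt | image)
    return 1.0 if p_success > 0.5 else 0.0
```

During online fine-tuning, a classifier of this kind replaces manual reward specification: the latest camera frame is scored and the resulting sparse reward is written into the replay buffer.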
Related papers
- Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation [8.940998315746684]
We propose a model-based reinforcement learning (RL) approach for robotic arm end-tasks.
We employ Bayesian neural network models to represent, in a probabilistic way, both the belief and information encoded in the dynamic model during exploration.
Our experiments show the advantages of our Bayesian model-based RL approach, achieving results of quality similar to relevant alternatives.
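A minimal sketch of the exploration idea above, assuming (as a stand-in for the Bayesian neural network dynamics model) a small bootstrapped ensemble whose disagreement approximates epistemic uncertainty; the architecture and dimensions are illustrative.

```python
# Sketch: ensemble dynamics model with a disagreement-based exploration bonus.
# The ensemble stands in for the Bayesian neural network model; sizes are assumptions.
import torch
import torch.nn as nn

class DynamicsNet(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))  # predicted next state

ensemble = [DynamicsNet(obs_dim=8, act_dim=2) for _ in range(5)]

def exploration_bonus(obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
    # Variance across ensemble predictions approximates how much information
    # about the dynamics a candidate action is expected to reveal.
    preds = torch.stack([m(obs, act) for m in ensemble])  # (ensemble, batch, obs_dim)
    return preds.var(dim=0).mean(dim=-1)                  # (batch,)
```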
arXiv Detail & Related papers (2024-04-02T11:44:37Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
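A minimal sketch of that do/undo practice loop is shown below; the environment, policies, reward classifier, buffer, and phase length are hypothetical placeholders, not the MEDAL++ implementation.

```python
# Sketch of reset-free practice: a forward policy attempts the task, a backward
# policy undoes it, and rewards come from a classifier inferred from demonstrations.
# All objects passed in are hypothetical placeholders.
def reset_free_practice(env, forward_policy, backward_policy, reward_fn,
                        buffer, update_fn, steps_per_phase: int = 200):
    obs = env.reset()  # one initial reset; afterwards the robot alternates on its own
    while True:
        for policy, goal in ((forward_policy, "forward"), (backward_policy, "backward")):
            for _ in range(steps_per_phase):
                action = policy(obs, goal)
                obs, _, _, _ = env.step(action)        # gym-style step, hypothetical env
                reward = reward_fn(obs, goal)          # inferred from demos, not hand-coded
                buffer.add(obs, action, reward, goal)
            update_fn(policy, buffer)                  # off-policy update after each phase
```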
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
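A minimal sketch of one way to reuse diverse prior data during online RL: sample training batches from a mixture of prior-task and target-task transitions. The 50/50 mixing ratio and buffer interface below are assumptions for illustration.

```python
# Sketch: replay buffer that mixes offline data from previously seen tasks with
# newly collected target-task data. The mixing ratio is an assumed value.
import numpy as np

class MixedReplayBuffer:
    def __init__(self, prior_transitions, prior_fraction: float = 0.5):
        self.prior = list(prior_transitions)   # offline data from prior tasks
        self.online = []                       # data collected on the target task
        self.prior_fraction = prior_fraction

    def add(self, transition):
        self.online.append(transition)

    def sample(self, batch_size: int, rng=np.random):
        n_prior = int(batch_size * self.prior_fraction)
        n_online = batch_size - n_prior
        batch = []
        if self.prior:
            batch += [self.prior[i] for i in rng.randint(len(self.prior), size=n_prior)]
        if self.online:
            batch += [self.online[i] for i in rng.randint(len(self.online), size=n_online)]
        return batch
```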
arXiv Detail & Related papers (2022-07-11T08:31:22Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning [109.77163932886413]
We show how to adapt vision-based robotic manipulation policies to new variations by fine-tuning via off-policy reinforcement learning.
This adaptation uses less than 0.2% of the data necessary to learn the task from scratch.
We find that our approach of adapting pre-trained policies leads to substantial performance gains over the course of fine-tuning.
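A minimal sketch of that adaptation recipe: initialize from pre-trained weights and continue off-policy updates on data from the new task variation. The network shape, checkpoint path, and loss below are illustrative assumptions, not the paper's implementation.

```python
# Sketch: warm-start from a pre-trained policy and fine-tune with an off-policy
# actor update. Network, checkpoint path, and hyperparameters are assumptions.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 7))
policy.load_state_dict(torch.load("pretrained_policy.pt"))  # hypothetical checkpoint
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

def finetune_step(obs: torch.Tensor, q_function) -> float:
    # Off-policy actor update: move the policy toward actions that the critic,
    # trained on replayed data from the new task variation, scores highly.
    actions = torch.tanh(policy(obs))
    loss = -q_function(obs, actions).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```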
arXiv Detail & Related papers (2020-04-21T17:57:04Z)
- Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
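A minimal sketch of that relabeling idea: a trial attempted for one task becomes a demonstration for whichever task it actually accomplished. The `classify_outcome` function is a hypothetical outcome classifier, not part of the paper's system.

```python
# Sketch: reuse the robot's own trials as demonstrations for other tasks.
# `classify_outcome` is a hypothetical function mapping a trajectory to the
# task it actually completed (or None).
def relabel_trials(trials, classify_outcome):
    demos_per_task = {}
    for trajectory, attempted_task in trials:
        achieved_task = classify_outcome(trajectory)
        if achieved_task is not None:
            # Even a failed attempt at `attempted_task` can supervise another task.
            demos_per_task.setdefault(achieved_task, []).append(trajectory)
    return demos_per_task
```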
arXiv Detail & Related papers (2020-02-25T18:56:42Z)