Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic
Reinforcement Learning
- URL: http://arxiv.org/abs/2004.10190v2
- Date: Fri, 31 Jul 2020 13:43:53 GMT
- Title: Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic
Reinforcement Learning
- Authors: Ryan Julian, Benjamin Swanson, Gaurav S. Sukhatme, Sergey Levine,
Chelsea Finn, and Karol Hausman
- Abstract summary: We show how to adapt vision-based robotic manipulation policies to new variations by fine-tuning via off-policy reinforcement learning.
This adaptation uses less than 0.2% of the data necessary to learn the task from scratch.
We find that our approach of adapting pre-trained policies leads to substantial performance gains over the course of fine-tuning.
- Score: 109.77163932886413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the great promises of robot learning systems is that they will be able
to learn from their mistakes and continuously adapt to ever-changing
environments. Despite this potential, most robot learning systems today are
deployed as a fixed policy and are not adapted after deployment. Can we
efficiently adapt previously learned behaviors to new
environments, objects and percepts in the real world? In this paper, we present
a method and empirical evidence towards a robot learning framework that
facilitates continuous adaptation. In particular, we demonstrate how to adapt
vision-based robotic manipulation policies to new variations by fine-tuning via
off-policy reinforcement learning, including changes in background, object
shape and appearance, lighting conditions, and robot morphology. Further, this
adaptation uses less than 0.2% of the data necessary to learn the task from
scratch. We find that our approach of adapting pre-trained policies leads to
substantial performance gains over the course of fine-tuning, and that
pre-training via RL is essential: training from scratch or adapting from
supervised ImageNet features are both unsuccessful with such small amounts of
data. We also find that these positive results hold in a limited continual
learning setting, in which we repeatedly fine-tune a single lineage of policies
using data from a succession of new tasks. Our empirical conclusions are
consistently supported by experiments on simulated manipulation tasks, and by
52 unique fine-tuning experiments on a real robotic grasping system pre-trained
on 580,000 grasps.
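As a rough illustration of that data budget: if the from-scratch requirement is taken to be the 580,000 pre-training grasps, a 0.2% adaptation budget works out to roughly 1,160 grasps. Below is a minimal sketch of the kind of off-policy fine-tuning the abstract describes: a Q-function is initialized from pre-trained weights and temporal-difference learning simply continues on a small buffer of transitions collected under the new variation. The network shape, hyperparameters, and checkpoint path are illustrative assumptions, not the authors' vision-based grasping system.

```python
# Minimal sketch of off-policy fine-tuning from a pre-trained critic on a small
# batch of new-variation data. All sizes and names here are illustrative.
import copy
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Toy stand-in for the vision-based state-action value function."""
    def __init__(self, obs_dim=32, act_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def fine_tune(q, batches, gamma=0.9, lr=1e-4, target_sync=100):
    """Continue off-policy TD learning from pre-trained weights on a small
    buffer of transitions (obs, act, rew, next_obs, next_act, done)."""
    target_q = copy.deepcopy(q)                        # frozen bootstrap target
    opt = torch.optim.Adam(q.parameters(), lr=lr)
    for step, (obs, act, rew, next_obs, next_act, done) in enumerate(batches):
        with torch.no_grad():
            # next_act would come from the policy / action optimizer at next_obs.
            target = rew + gamma * (1.0 - done) * target_q(next_obs, next_act)
        loss = nn.functional.mse_loss(q(obs, act), target)
        opt.zero_grad(); loss.backward(); opt.step()
        if (step + 1) % target_sync == 0:
            target_q.load_state_dict(q.state_dict())   # periodic target update
    return q

if __name__ == "__main__":
    q = QNetwork()
    # In practice the weights would be restored from the pre-trained checkpoint,
    # e.g. q.load_state_dict(torch.load("pretrained_grasping_q.pt"))  # hypothetical path
    fake_batch = (torch.randn(64, 32), torch.rand(64, 4), torch.rand(64),
                  torch.randn(64, 32), torch.rand(64, 4), torch.zeros(64))
    q = fine_tune(q, [fake_batch] * 500)               # small adaptation dataset
```

In the continual-learning setting described above, the returned network would simply be passed back into fine_tune with the next task's data, so a single lineage of weights is carried through a succession of variations.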
Related papers
- VITAL: Visual Teleoperation to Enhance Robot Learning through Human-in-the-Loop Corrections [10.49712834719005]
We propose a low-cost visual teleoperation system for bimanual manipulation tasks, called VITAL.
Our approach leverages affordable hardware and visual processing techniques to collect demonstrations.
We enhance the generalizability and robustness of the learned policies by utilizing both real and simulated environments.
arXiv Detail & Related papers (2024-07-30T23:29:47Z)
- EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning [36.0274770291531]
We propose EquiBot, a robust, data-efficient, and generalizable approach for robot manipulation task learning.
Our approach combines SIM(3)-equivariant neural network architectures with diffusion models.
We show that our method can easily generalize to novel objects and scenes after learning from just 5 minutes of human demonstrations in each task.
arXiv Detail & Related papers (2024-07-01T17:09:43Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
arXiv Detail & Related papers (2022-07-11T08:31:22Z)
- Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms [60.59764170868101]
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform.
We formulate adaptation to a new platform as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching task and a real-robot picking task using 400 simulated robots.
arXiv Detail & Related papers (2021-03-05T14:16:20Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling [126.69933134648541]
We present a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time.
Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data.
arXiv Detail & Related papers (2020-06-12T13:34:46Z)