FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
- URL: http://arxiv.org/abs/2409.16578v2
- Date: Mon, 30 Sep 2024 21:40:17 GMT
- Title: FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
- Authors: Jiaheng Hu, Rose Hendrix, Ali Farhadi, Aniruddha Kembhavi, Roberto Martin-Martin, Peter Stone, Kuo-Hao Zeng, Kiana Ehsani
- Abstract summary: FLaRe is a large-scale Reinforcement Learning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques.
Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance on previously demonstrated and on entirely novel tasks and embodiments.
- Score: 74.25049012472502
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, the robotics field has initiated several efforts toward building generalist robot policies through large-scale multi-task Behavior Cloning. However, direct deployment of these policies has led to unsatisfactory performance, where the policy struggles with unseen states and tasks. How can we break through the performance plateau of these models and elevate their capabilities to new heights? In this paper, we propose FLaRe, a large-scale Reinforcement Learning fine-tuning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques. Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance both on previously demonstrated and on entirely novel tasks and embodiments. Specifically, on a set of long-horizon mobile manipulation tasks, FLaRe achieves an average success rate of 79.5% in unseen environments, with absolute improvements of +23.6% in simulation and +30.7% on real robots over prior SoTA methods. By utilizing only sparse rewards, our approach enables generalization to new capabilities beyond the pretraining data with minimal human effort. Moreover, we demonstrate rapid adaptation to new embodiments and behaviors with less than a day of fine-tuning. Videos can be found on the project website at https://robot-flare.github.io/
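The recipe the abstract describes (a BC-pretrained policy, fine-tuned at scale with RL on sparse task-completion rewards, with stabilized gradients) can be sketched in a few lines of PyTorch. Everything below is a placeholder: the toy environment, network, and hyperparameters are illustrative assumptions, and the sketch shows only the shape of the approach, not FLaRe's actual implementation or scale.

```python
import torch
import torch.nn as nn

class ToyEnv:
    """Stand-in for a long-horizon task that only reports success at the end."""
    def reset(self):
        self.t = 0
        return torch.zeros(8)                     # placeholder observation
    def step(self, action):
        self.t += 1
        done = self.t >= 50
        reward = 1.0 if done and action.norm() < 1.0 else 0.0  # sparse reward
        return torch.zeros(8), reward, done

# In FLaRe the policy comes from large-scale multi-task BC pretraining;
# here a fresh network stands in for those pre-trained weights.
policy = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

for episode in range(100):
    env = ToyEnv()
    obs, done, log_probs, rewards = env.reset(), False, [], []
    while not done:
        dist = torch.distributions.Normal(policy(obs), 0.1)
        action = dist.sample()
        log_probs.append(dist.log_prob(action).sum())
        obs, reward, done = env.step(action)
        rewards.append(reward)
    # REINFORCE-style update on the sparse episodic return.
    loss = -torch.stack(log_probs).sum() * sum(rewards)
    opt.zero_grad()
    loss.backward()
    # Gradient clipping as one concrete gradient-stabilization technique;
    # FLaRe's exact stabilization recipe may differ.
    nn.utils.clip_grad_norm_(policy.parameters(), max_norm=1.0)
    opt.step()
```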
Related papers
- Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks [48.54757719504994]
This paper focuses on improving task success rates while reducing the amount of training data needed.
Our approach segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals.
We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms.
arXiv Detail & Related papers (2024-10-01T19:49:56Z)
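One common way to realize the waypoint idea in the entry above is to split a demonstration wherever the end-effector nearly stops, treating each split point as a subgoal. The pause heuristic below is an assumption for illustration; the paper's actual segmentation criterion is not specified in this summary.

```python
import numpy as np

def segment_demo(positions, speed_eps=1e-3, min_len=5):
    """positions: (T, D) array of end-effector positions over time."""
    speeds = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    segments, start = [], 0
    for t, s in enumerate(speeds, start=1):
        if s < speed_eps and t - start >= min_len:
            segments.append((start, t))   # one discrete step, ending at a waypoint
            start = t
    segments.append((start, len(positions)))
    return segments  # each segment's final position serves as a subgoal

demo = np.concatenate([np.linspace(0, 1, 30)[:, None],
                       np.full((10, 1), 1.0),           # pause -> waypoint
                       np.linspace(1, 2, 30)[:, None]])
print(segment_demo(demo))
```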
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to enable efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
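A minimal sketch of the offline-to-online pattern in the RoboFuME entry above: pretrain a value estimate on an existing dataset, then keep training the same network as new experience arrives, so learning never restarts. The network, the crude TD target, and all names are placeholder assumptions, not RoboFuME's actual components.

```python
import random
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(8 + 4, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(q_net.parameters(), lr=3e-4)
buffer = [(torch.randn(8), torch.randn(4), random.random(), torch.randn(8))
          for _ in range(1000)]            # stands in for the offline dataset

def td_update(batch, gamma=0.99):
    obs, act, rew, nxt = zip(*batch)
    obs, act, nxt = map(torch.stack, (obs, act, nxt))
    rew = torch.tensor(rew, dtype=torch.float32)
    with torch.no_grad():                  # crude target: reuse the same action
        target = rew + gamma * q_net(torch.cat([nxt, act], -1)).squeeze(-1)
    q = q_net(torch.cat([obs, act], -1)).squeeze(-1)
    loss = nn.functional.mse_loss(q, target)
    opt.zero_grad(); loss.backward(); opt.step()

# Phase 1: offline pretraining on the prior dataset.
for _ in range(100):
    td_update(random.sample(buffer, 64))

# Phase 2: online fine-tuning. New experience joins the same buffer and the
# same update keeps running, so no learning restart is needed.
for _ in range(100):
    buffer.append((torch.randn(8), torch.randn(4),
                   random.random(), torch.randn(8)))   # stand-in for new data
    td_update(random.sample(buffer, 64))
```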
- Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own [59.11934130045106]
We propose Reinforcement Learning with Foundation Priors (RLFP) to utilize guidance and feedback from policy, value, and success-reward foundation models.
Within this framework, we introduce the Foundation-guided Actor-Critic (FAC) algorithm, which enables embodied agents to explore more efficiently with automatic reward functions.
Our method achieves remarkable performance in various manipulation tasks, both on real robots and in simulation.
arXiv Detail & Related papers (2023-10-04T07:56:42Z)
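The core mechanism in the RLFP entry above, an automatic reward supplied by a foundation model instead of human labeling, can be sketched as follows. `success_model` is a hypothetical stand-in for a vision-language success detector; the paper's actual policy, value, and success-reward models are not reproduced here.

```python
import torch

def success_model(image: torch.Tensor, task: str) -> float:
    """Placeholder for e.g. a VLM that scores task completion from pixels."""
    return float(image.mean() > 0.5)       # dummy success test

def automatic_reward(image, task="put the mug on the shelf"):
    # The sparse reward comes from the foundation model rather than a human,
    # which is what lets the embodied agent explore on its own.
    return success_model(image, task)

print(automatic_reward(torch.rand(3, 64, 64)))
```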
- Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment [25.186525630548356]
Sirius is a principled framework for humans and robots to collaborate through a division of work.
Partially autonomous robots are tasked with handling the major portion of decision-making in situations where they work reliably.
We introduce a new learning algorithm to improve the policy's performance on the data collected from the task executions.
arXiv Detail & Related papers (2022-11-15T18:53:39Z)
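A minimal sketch of the division of work the Sirius entry describes, assuming a simple confidence gate: the robot acts where it is confident, hands control to a human otherwise, and logs everything for later policy updates. The gate and threshold are illustrative assumptions, not Sirius's actual mechanism.

```python
import torch

def act(policy, obs, human_fallback, threshold=0.8):
    probs = torch.softmax(policy(obs), dim=-1)
    conf, action = probs.max(dim=-1)
    if conf.item() < threshold:            # robot is unsure: ask the human
        action = human_fallback(obs)
    return int(action), conf.item()

deployment_log = []                        # (obs, action) pairs for retraining
policy = torch.nn.Linear(8, 5)
human = lambda obs: torch.tensor(0)        # placeholder for teleoperation
for _ in range(10):
    obs = torch.randn(8)
    a, c = act(policy, obs, human)
    deployment_log.append((obs, a))        # data for the learning algorithm
```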
- Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations [1.6050172226234585]
We propose a novel LfD framework, Fast Lifelong Adaptive Inverse Reinforcement Learning (FLAIR).
We empirically validate that FLAIR achieves adaptability (i.e., the robot adapts to heterogeneous, user-specific task preferences), efficiency (i.e., the robot achieves sample-efficient adaptation), and scalability.
FLAIR surpasses benchmarks across three control tasks, with an average 57% improvement in policy returns and on average 78% fewer episodes required for demonstration modeling.
arXiv Detail & Related papers (2022-09-24T02:48:02Z)
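One way to picture the adaptability claim in the FLAIR entry is fitting mixture weights over previously learned strategy policies so that a new user's demonstration becomes maximally likely. This is a hedged sketch of that single idea; FLAIR's full lifelong inverse-RL algorithm involves considerably more.

```python
import torch

def fit_mixture(strategy_logps, steps=200):
    """strategy_logps: (K, T) log-likelihood of a new demo under each of K
    previously learned strategy policies."""
    w = torch.zeros(strategy_logps.shape[0], requires_grad=True)
    opt = torch.optim.Adam([w], lr=0.1)
    for _ in range(steps):
        log_mix = torch.logsumexp(
            strategy_logps + torch.log_softmax(w, 0)[:, None], dim=0)
        loss = -log_mix.sum()              # maximize likelihood of the demo
        opt.zero_grad(); loss.backward(); opt.step()
    return torch.softmax(w, 0)             # inferred per-strategy preference

logps = torch.randn(3, 20)                 # 3 known strategies, 20-step demo
print(fit_mixture(logps))
```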
- Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms [60.59764170868101]
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform.
We formulate adaptation to a new platform as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching task and a real-robot picking task using 400 simulated robots.
arXiv Detail & Related papers (2021-03-05T14:16:20Z)
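A MAML-style inner loop is one standard way to implement the few-shot formulation in the entry above: a handful of (observation, action) pairs from a new platform produces platform-specific fast weights. The Bayesian treatment in the actual paper is not captured here; this is only a generic adaptation step under assumed shapes.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 4)                    # shared structure across platforms

def adapt(model, support_x, support_y, inner_lr=0.01, steps=5):
    """Few-shot adaptation from a handful of (obs, action) pairs."""
    fast = [p.clone().detach().requires_grad_(True) for p in model.parameters()]
    for _ in range(steps):
        pred = nn.functional.linear(support_x, fast[0], fast[1])
        loss = nn.functional.mse_loss(pred, support_y)
        grads = torch.autograd.grad(loss, fast)
        fast = [(p - inner_lr * g).detach().requires_grad_(True)
                for p, g in zip(fast, grads)]
    return fast                            # platform-specific fast weights

x, y = torch.randn(10, 8), torch.randn(10, 4)  # 10 shots from the new platform
fast_weights = adapt(model, x, y)
```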
- Meta Reinforcement Learning-Based Lane Change Strategy for Autonomous Vehicles [11.180588185127892]
Supervised learning algorithms can generalize to new environments when trained on large amounts of labeled data.
However, it is often impractical or cost-prohibitive to obtain sufficient data for each new environment.
We propose a meta reinforcement learning (MRL) method to improve the agent's generalization capabilities.
arXiv Detail & Related papers (2020-08-28T02:57:11Z)
- Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning [109.77163932886413]
We show how to adapt vision-based robotic manipulation policies to new variations by fine-tuning via off-policy reinforcement learning.
This adaptation uses less than 0.2% of the data necessary to learn the task from scratch.
We find that our approach of adapting pre-trained policies leads to substantial performance gains over the course of fine-tuning.
arXiv Detail & Related papers (2020-04-21T17:57:04Z)
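The adapt-by-fine-tuning recipe in the entry above can be sketched as: load the weights of a policy trained on the base task and continue updating it on a small buffer from the new variation. The checkpoint name is hypothetical, and the regression update below merely stands in for the paper's off-policy RL objective.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))
# policy.load_state_dict(torch.load("base_task_policy.pt"))  # pretrained weights
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

# A small buffer from the new variation: the paper's point is that this can be
# a tiny fraction (<0.2%) of what training from scratch would require.
new_task_data = [(torch.randn(8), torch.randn(4)) for _ in range(200)]

for epoch in range(10):
    for obs, target_action in new_task_data:
        # Placeholder update: regression toward logged actions stands in for
        # the off-policy RL update used in the paper.
        loss = nn.functional.mse_loss(policy(obs), target_action)
        opt.zero_grad(); loss.backward(); opt.step()
```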
This list is automatically generated from the titles and abstracts of the papers on this site.