Robot Learning of Mobile Manipulation with Reachability Behavior Priors
- URL: http://arxiv.org/abs/2203.04051v1
- Date: Tue, 8 Mar 2022 12:44:42 GMT
- Title: Robot Learning of Mobile Manipulation with Reachability Behavior Priors
- Authors: Snehal Jauhri, Jan Peters, Georgia Chalvatzaki
- Abstract summary: Mobile Manipulation (MM) systems are ideal candidates for taking up the role of a personal assistant in unstructured real-world environments.
Among other challenges, MM requires effective coordination of the robot's embodiments for executing tasks that require both mobility and manipulation.
We study the integration of robotic reachability priors in actor-critic RL methods for accelerating the learning of MM for reaching and fetching tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mobile Manipulation (MM) systems are ideal candidates for taking up the role
of a personal assistant in unstructured real-world environments. Among other
challenges, MM requires effective coordination of the robot's embodiments for
executing tasks that require both mobility and manipulation. Reinforcement
Learning (RL) holds the promise of endowing robots with adaptive behaviors, but
most methods require prohibitively large amounts of data for learning a useful
control policy. In this work, we study the integration of robotic reachability
priors in actor-critic RL methods for accelerating the learning of MM for
reaching and fetching tasks. Namely, we consider the problem of optimal base
placement and the subsequent decision of whether to activate the arm for
reaching a 6D target. For this, we devise a novel Hybrid RL method that handles
discrete and continuous actions jointly, resorting to the Gumbel-Softmax
reparameterization. Next, we train a reachability prior using data from the
operational robot workspace, inspired by classical methods. Subsequently, we
derive Boosted Hybrid RL (BHyRL), a novel algorithm for learning Q-functions by
modeling them as a sum of residual approximators. Every time a new task needs
to be learned, we can transfer our learned residuals and learn the component of
the Q-function that is task-specific, hence, maintaining the task structure
from prior behaviors. Moreover, we find that regularizing the target policy
with a prior policy yields more expressive behaviors. We evaluate our method in
simulation in reaching and fetching tasks of increasing difficulty, and we show
the superior performance of BHyRL against baseline methods. Finally, we
zero-transfer our learned 6D fetching policy with BHyRL to our MM robot
TIAGo++. For more details and code release, please refer to our project site:
irosalab.com/rlmmbp
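The abstract names two concrete mechanisms worth unpacking. First, the hybrid action space couples a discrete decision (e.g., whether to activate the arm) with continuous parameters (e.g., the base placement), sampled jointly via the Gumbel-Softmax reparameterization. Below is a minimal PyTorch sketch of this idea; the module names, sizes, and heads are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridPolicy(nn.Module):
    """Illustrative hybrid policy: a discrete head (e.g., arm on/off) and a
    continuous head (e.g., base placement), sampled jointly and differentiably."""

    def __init__(self, obs_dim=32, n_discrete=2, cont_dim=3):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU())
        self.logits_head = nn.Linear(256, n_discrete)       # discrete action logits
        self.mu_head = nn.Linear(256, cont_dim)             # mean of continuous action
        self.log_std = nn.Parameter(torch.zeros(cont_dim))  # state-independent std

    def forward(self, obs, tau=1.0):
        h = self.trunk(obs)
        # Straight-through Gumbel-Softmax: one-hot in the forward pass,
        # differentiable soft sample in the backward pass.
        disc = F.gumbel_softmax(self.logits_head(h), tau=tau, hard=True)
        # Reparameterized Gaussian sample for the continuous part.
        mu = self.mu_head(h)
        cont = mu + self.log_std.exp() * torch.randn_like(mu)
        return disc, cont

policy = HybridPolicy()
discrete_a, continuous_a = policy(torch.randn(1, 32))
```

Second, BHyRL models the Q-function as a sum of residual approximators, freezing the residuals learned on earlier tasks and training only a task-specific component. A hypothetical sketch of that composition, under the assumption that each residual scores a concatenated state-action vector:

```python
class BoostedQ(nn.Module):
    """Q(s, a) modeled as the sum of frozen residual approximators from
    earlier tasks plus one trainable residual for the current task."""

    def __init__(self, prior_residuals, in_dim=36):
        super().__init__()
        self.priors = nn.ModuleList(prior_residuals)
        for p in self.priors.parameters():
            p.requires_grad_(False)              # keep prior task structure fixed
        self.task_residual = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, s, a):
        x = torch.cat([s, a], dim=-1)
        q = sum(p(x) for p in self.priors)       # transferred prior behavior
        return q + self.task_residual(x)         # task-specific component
```

The abstract's remark about regularizing the target policy with a prior policy would, in this sketch, amount to an extra penalty (e.g., a KL term toward the prior policy) in the actor loss; the exact objective is given in the paper.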
Related papers
- Reinforcement Learning with Action Sequence for Data-Efficient Robot Learning [62.3886343725955]
We introduce a novel RL algorithm that learns a critic network that outputs Q-values over a sequence of actions.
By explicitly training the value functions to learn the consequence of executing a series of current and future actions, our algorithm allows for learning useful value functions from noisy trajectories.
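As a hedged illustration of this idea (not the paper's code), a critic can simply consume a whole sequence of H actions alongside the observation, so its value estimate reflects the consequence of executing all of them:

```python
import torch
import torch.nn as nn

class SequenceCritic(nn.Module):
    """Illustrative critic scoring a sequence of H actions at once; all
    shapes and layer sizes here are assumptions for the sketch."""

    def __init__(self, obs_dim=32, act_dim=7, horizon=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim * horizon, 256), nn.ReLU(),
            nn.Linear(256, 1))

    def forward(self, obs, action_seq):          # action_seq: (B, H, act_dim)
        flat = action_seq.flatten(start_dim=1)   # concatenate the H actions
        return self.net(torch.cat([obs, flat], dim=-1))
```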
arXiv Detail & Related papers (2024-11-19T01:23:52Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation achieves highly efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, remain extremely robust even under perturbations, and exhibit emergent recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills with reinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
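Replay-buffer bootstrapping in this spirit amounts to seeding a new task's buffer with transitions reused from prior tasks before online training begins; a minimal sketch with hypothetical names:

```python
import random
from collections import deque

def bootstrap_buffer(prior_transitions, capacity=100_000):
    """Seed a fresh replay buffer with transitions from prior tasks so
    that early updates already draw on relevant off-policy data."""
    buffer = deque(maxlen=capacity)
    buffer.extend(prior_transitions)
    return buffer

def sample_batch(buffer, batch_size=256):
    return random.sample(buffer, min(batch_size, len(buffer)))
```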
arXiv Detail & Related papers (2023-09-06T19:05:31Z)
- Few-Shot Preference Learning for Human-in-the-Loop RL [13.773589150740898]
Motivated by the success of meta-learning, we pre-train preference models on prior task data and quickly adapt them for new tasks using only a handful of queries.
We reduce the amount of online feedback needed to train manipulation policies in Meta-World by 20×, and demonstrate the effectiveness of our method on a real Franka Panda robot.
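Preference models of this kind are commonly fit to pairwise comparisons with a Bradley-Terry likelihood; the following generic sketch (not the paper's code) assumes a reward model mapping a (batch, time, features) segment to per-step rewards of shape (batch, time):

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, seg_a, seg_b, prefer_a):
    """Bradley-Terry loss: the probability that segment A is preferred is
    the softmax over the two segments' summed predicted rewards."""
    r_a = reward_model(seg_a).sum(dim=1)   # (B, T) per-step rewards -> (B,)
    r_b = reward_model(seg_b).sum(dim=1)
    logits = torch.stack([r_a, r_b], dim=-1)
    target = (~prefer_a).long()            # class 0 if A preferred, else 1
    return F.cross_entropy(logits, target)
```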
arXiv Detail & Related papers (2022-12-06T23:12:26Z)
- Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment [25.186525630548356]
Sirius is a principled framework for humans and robots to collaborate through a division of work.
Partially autonomous robots handle the major portion of decision-making in situations where they act reliably.
We introduce a new learning algorithm that improves the policy's performance using the data collected from task executions.
arXiv Detail & Related papers (2022-11-15T18:53:39Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
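Schematically, such an action interface has the policy pick a primitive and its continuous arguments, which a controller unrolls into low-level commands; the primitives and the env.step interface below are hypothetical:

```python
import numpy as np

# Hypothetical library of parameterized primitives: each maps learned
# arguments to a short sequence of low-level commands.
PRIMITIVES = {
    "reach": lambda args: [("move_ee", np.asarray(args[:3]))],
    "lift":  lambda args: [("move_ee", np.array([0.0, 0.0, args[0]]))],
    "grasp": lambda args: [("close_gripper", float(args[0]))],
}

def execute(env, primitive_name, args):
    """Unroll one primitive into low-level steps; the RL policy only
    chooses the primitive and its continuous arguments."""
    for command in PRIMITIVES[primitive_name](args):
        env.step(command)   # env is assumed to accept (name, params) tuples
```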
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- Learning of Parameters in Behavior Trees for Movement Skills [0.9562145896371784]
Behavior Trees (BTs) can provide a policy representation that supports modular and composable skills.
We present a novel algorithm that can learn the parameters of a BT policy in simulation and then generalize to the physical robot without any additional training.
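A BT policy with tunable parameters can be pictured as ordinary tree nodes whose numeric fields are optimized in simulation and then deployed unchanged; a toy sketch under those assumptions:

```python
# Hypothetical behavior-tree nodes; `target` and `speed` are the learnable
# parameters, tuned in simulation and reused on the physical robot.
class MoveTo:
    def __init__(self, target, speed):
        self.target, self.speed = target, speed

    def tick(self, robot):
        done = robot.move_toward(self.target, self.speed)  # assumed robot API
        return "SUCCESS" if done else "RUNNING"

class Sequence:
    def __init__(self, children):
        self.children = children

    def tick(self, robot):
        for child in self.children:
            status = child.tick(robot)
            if status != "SUCCESS":
                return status
        return "SUCCESS"
```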
arXiv Detail & Related papers (2021-09-27T13:46:39Z)