NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics via
Novel-View Synthesis
- URL: http://arxiv.org/abs/2301.08556v1
- Date: Wed, 18 Jan 2023 23:25:27 GMT
- Title: NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics via
Novel-View Synthesis
- Authors: Allan Zhou, Moo Jin Kim, Lirui Wang, Pete Florence, Chelsea Finn
- Abstract summary: SPARTN (Synthetic Perturbations for Augmenting Robot Trajectories via NeRF) is a fully-offline data augmentation scheme for improving robot policies.
Our approach leverages neural radiance fields (NeRFs) to synthetically inject corrective noise into visual demonstrations.
In a simulated 6-DoF visual grasping benchmark, SPARTN improves success rates by 2.8$\times$ over imitation learning without the corrective augmentations.
- Score: 50.93065653283523
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Expert demonstrations are a rich source of supervision for training visual
robotic manipulation policies, but imitation learning methods often require
either a large number of demonstrations or expensive online expert supervision
to learn reactive closed-loop behaviors. In this work, we introduce SPARTN
(Synthetic Perturbations for Augmenting Robot Trajectories via NeRF): a
fully-offline data augmentation scheme for improving robot policies that use
eye-in-hand cameras. Our approach leverages neural radiance fields (NeRFs) to
synthetically inject corrective noise into visual demonstrations, using NeRFs
to generate perturbed viewpoints while simultaneously calculating the
corrective actions. This requires no additional expert supervision or
environment interaction, and distills the geometric information in NeRFs into a
real-time reactive RGB-only policy. In a simulated 6-DoF visual grasping
benchmark, SPARTN improves success rates by 2.8$\times$ over imitation learning
without the corrective augmentations and even outperforms some methods that use
online supervision. It additionally closes the gap between RGB-only and RGB-D
success rates, eliminating the previous need for depth sensors. In real-world
6-DoF robotic grasping experiments from limited human demonstrations, our
method improves absolute success rates by $22.5\%$ on average, including
objects that are traditionally challenging for depth-based methods. See video
results at \url{https://bland.website/spartn}.
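To make the augmentation step concrete, the sketch below illustrates the corrective augmentation loop described in the abstract: perturb the demonstrated eye-in-hand camera pose, render the perturbed view with a NeRF fit to that demonstration's scene, and label it with an action that corrects back toward the demonstration. This is a minimal sketch, not the released implementation; `nerf.render`, `sample_se3_noise`, and `compose_actions` are hypothetical placeholder interfaces.

```python
import numpy as np

def spartn_style_augment(demo, nerf, num_perturbations=5, noise_scale=0.01):
    """Offline corrective augmentation in the spirit of SPARTN (interfaces hypothetical).

    demo: list of (camera_pose, action) pairs, with camera_pose a 4x4 SE(3) matrix
          for the eye-in-hand camera at that step.
    nerf: a radiance field fit to this demonstration's scene, assumed to expose
          render(pose) -> RGB image.
    """
    augmented = []
    for camera_pose, action in demo:
        for _ in range(num_perturbations):
            # Small random SE(3) offset; sample_se3_noise is a hypothetical helper.
            delta = sample_se3_noise(noise_scale)
            perturbed_pose = camera_pose @ delta          # viewpoint the policy might drift to
            image = nerf.render(perturbed_pose)           # NeRF-synthesized novel view
            corrective = np.linalg.inv(delta)             # motion that undoes the perturbation
            # Label the synthetic view with "correct back, then follow the demo";
            # compose_actions is a hypothetical helper for chaining relative motions.
            label = compose_actions(corrective, action)
            augmented.append((image, label))
    return augmented
```

Because both the perturbed views and their corrective labels come from the NeRF and the recorded poses, the procedure runs fully offline, consistent with the abstract's claim of requiring no extra expert supervision or environment interaction.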
Related papers
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning approach for robots to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- Render and Diffuse: Aligning Image and Action Spaces for Diffusion-based Behaviour Cloning [15.266994159289645]
We introduce Render and Diffuse (R&D), a method that unifies low-level robot actions and RGB observations within the image space using virtual renders of the robot's 3D model.
This space unification simplifies the learning problem and introduces inductive biases that are crucial for sample efficiency and spatial generalisation.
Our results show that R&D exhibits strong spatial generalisation capabilities and is more sample efficient than more common image-to-action methods.
arXiv Detail & Related papers (2024-05-28T14:06:10Z)
- What Matters to You? Towards Visual Representation Alignment for Robot Learning [81.30964736676103]
When operating in service of people, robots need to optimize rewards aligned with end-user preferences.
We propose Representation-Aligned Preference-based Learning (RAPL), a method for solving the visual representation alignment problem.
arXiv Detail & Related papers (2023-10-11T23:04:07Z)
- Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation.
In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
- Training Robots without Robots: Deep Imitation Learning for Master-to-Robot Policy Transfer [4.318590074766604]
Deep imitation learning is promising for robot manipulation because it only requires demonstration samples.
Existing demonstration methods have deficiencies; bilateral teleoperation requires a complex control scheme and is expensive.
This research proposes a new master-to-robot (M2R) policy transfer system that does not require robots for teaching force feedback-based manipulation tasks.
arXiv Detail & Related papers (2022-02-19T10:55:10Z)
- Towards a Sample Efficient Reinforcement Learning Pipeline for Vision Based Robotics [0.0]
We study how to limit the time needed to train a robotic arm from scratch to reach a ball, by assembling a pipeline that is as efficient as possible.
The pipeline is divided into two parts: the first captures the relevant information from the RGB video with a computer vision algorithm.
The second studies how to train a deep reinforcement learning algorithm faster so that the robotic arm reaches the target in front of it.
arXiv Detail & Related papers (2021-05-20T13:13:01Z)
- Where is my hand? Deep hand segmentation for visual self-recognition in humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask R-CNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
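As a rough, assumption-laden illustration of that fine-tuning step (the entry does not specify the framework or training details), the sketch below adapts a pretrained torchvision Mask R-CNN to a two-class problem, background vs. robot hand, by replacing its box and mask heads:

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

def build_hand_segmentation_model(num_classes=2):
    """Pretrained Mask R-CNN with new heads for background + robot hand."""
    # weights="DEFAULT" requires torchvision >= 0.13; older versions use pretrained=True.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    # Replace the box classification head for the new number of classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # Replace the mask prediction head accordingly.
    in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)
    return model
```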
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
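The last entry only states the outcome (10 demonstrations, sparse rewards, pixel observations). One common recipe consistent with that setup, shown here as a hedged sketch rather than the paper's actual pipeline, is to seed an off-policy replay buffer with the demonstration transitions before online training; the `env` and `agent` interfaces below are hypothetical placeholders.

```python
import random
from collections import deque

def seed_replay_buffer(buffer, demonstrations):
    """Load the few expert episodes into the buffer before any online interaction."""
    for episode in demonstrations:                      # e.g. 10 teleoperated episodes
        for transition in episode:                      # (obs, action, reward, next_obs, done)
            buffer.append(transition)

def train_from_pixels(env, agent, demonstrations, num_steps=100_000, batch_size=256):
    buffer = deque(maxlen=1_000_000)
    seed_replay_buffer(buffer, demonstrations)          # demos supply the rare sparse-reward successes
    obs = env.reset()
    for _ in range(num_steps):
        action = agent.act(obs)                         # policy consumes raw image observations
        next_obs, reward, done, _ = env.step(action)
        buffer.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs
        if len(buffer) >= batch_size:
            agent.update(random.sample(buffer, batch_size))  # off-policy update, e.g. SAC-style
```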
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.