Neural Descriptor Fields: SE(3)-Equivariant Object Representations for
Manipulation
- URL: http://arxiv.org/abs/2112.05124v1
- Date: Thu, 9 Dec 2021 18:57:15 GMT
- Title: Neural Descriptor Fields: SE(3)-Equivariant Object Representations for
Manipulation
- Authors: Anthony Simeonov, Yilun Du, Andrea Tagliasacchi, Joshua B. Tenenbaum,
Alberto Rodriguez, Pulkit Agrawal, Vincent Sitzmann
- Abstract summary: We present Neural Descriptor Fields (NDFs), an object representation that encodes both points and relative poses between an object and a target.
NDFs are trained in a self-supervised fashion via a 3D auto-encoding task that does not rely on expert-labeled keypoints.
Our performance generalizes across both object instances and 6-DoF object poses, and significantly outperforms a recent baseline that relies on 2D descriptors.
- Score: 75.83319382105894
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Neural Descriptor Fields (NDFs), an object representation that
encodes both points and relative poses between an object and a target (such as
a robot gripper or a rack used for hanging) via category-level descriptors. We
employ this representation for object manipulation, where given a task
demonstration, we want to repeat the same task on a new object instance from
the same category. We propose to achieve this objective by searching (via
optimization) for the pose whose descriptor matches that observed in the
demonstration. NDFs are conveniently trained in a self-supervised fashion via a
3D auto-encoding task that does not rely on expert-labeled keypoints. Further,
NDFs are SE(3)-equivariant, guaranteeing performance that generalizes across
all possible 3D object translations and rotations. We demonstrate learning of
manipulation tasks from few (5-10) demonstrations both in simulation and on a
real robot. Our performance generalizes across both object instances and 6-DoF
object poses, and significantly outperforms a recent baseline that relies on 2D
descriptors. Project website: https://yilundu.github.io/ndf/.
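To make the pose search described in the abstract concrete, the sketch below shows the kind of optimization loop it refers to: descriptors sampled at query points in the demonstration are treated as a target, and an SE(3) transform is optimized until the same query points, carried onto a new object observation, produce matching descriptors. This is a minimal illustration under assumed names, not the released implementation; in particular, `descriptor_field` is a hypothetical stand-in for a trained NDF network, and the toy point clouds exist only to make the snippet runnable.

```python
import torch

def skew(k):
    """3x3 skew-symmetric matrix of a 3-vector, built so gradients flow through k."""
    z = torch.zeros((), dtype=k.dtype)
    return torch.stack([
        torch.stack([z, -k[2], k[1]]),
        torch.stack([k[2], z, -k[0]]),
        torch.stack([-k[1], k[0], z]),
    ])

def axis_angle_to_matrix(w):
    """Rodrigues' formula: axis-angle vector w -> rotation matrix."""
    theta = torch.sqrt((w * w).sum() + 1e-12)
    K = skew(w / theta)
    return torch.eye(3, dtype=w.dtype) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

def descriptor_field(query_pts, object_cloud):
    """Hypothetical stand-in for a trained NDF: each query point is described by its
    sorted distances to the object point cloud, which (like an NDF descriptor) is
    unchanged when object and query points are moved by the same rigid transform."""
    d = torch.cdist(query_pts, object_cloud)        # (Q, N) pairwise distances
    return torch.sort(d, dim=-1).values[:, :16]     # (Q, 16) descriptor per query point

def match_pose(demo_desc, query_pts, new_object_cloud, steps=300, lr=5e-2):
    """Search over an SE(3) pose of the query points so that their descriptors on the
    new object observation match the descriptors recorded in the demonstration."""
    w = torch.zeros(3, requires_grad=True)   # axis-angle rotation parameters
    t = torch.zeros(3, requires_grad=True)   # translation
    opt = torch.optim.Adam([w, t], lr=lr)
    for _ in range(steps):
        R = axis_angle_to_matrix(w)
        moved = query_pts @ R.T + t
        loss = (descriptor_field(moved, new_object_cloud) - demo_desc).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return axis_angle_to_matrix(w).detach(), t.detach()

# Toy check: the "new" object is the demo object rigidly moved, so optimizing should
# drive the query points toward the corresponding spot on the moved object.
obj = torch.randn(500, 3)
query = torch.randn(8, 3) * 0.1 + torch.tensor([0.3, 0.0, 0.0])
demo_desc = descriptor_field(query, obj)
R_true = axis_angle_to_matrix(torch.tensor([0.0, 0.0, 0.7]))
obj_moved = obj @ R_true.T + torch.tensor([0.5, -0.2, 0.1])
R_hat, t_hat = match_pose(demo_desc, query, obj_moved)
```

In the paper, the demonstration records descriptors of relative poses between the object and a target (such as a gripper or rack), not only of points, and the SE(3)-equivariance of the learned field is what lets the same matching scheme transfer across arbitrary translations and rotations of the new object instance.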
Related papers
- Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers [65.51132104404051]
We introduce the use of object identifiers and object-centric representations to interact with scenes at the object level.
Our model significantly outperforms existing methods on benchmarks including ScanRefer, Multi3DRefer, Scan2Cap, ScanQA, and SQA3D.
arXiv Detail & Related papers (2023-12-13T14:27:45Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- ShapeShift: Superquadric-based Object Pose Estimation for Robotic Grasping [85.38689479346276]
Current techniques heavily rely on a reference 3D object, limiting their generalizability and making it expensive to expand to new object categories.
This paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.
arXiv Detail & Related papers (2023-04-10T20:55:41Z)
- USEEK: Unsupervised SE(3)-Equivariant 3D Keypoints for Generalizable Manipulation [19.423310410631085]
USEEK is an unsupervised SE(3)-equivariant keypoint method that enjoys alignment across instances within a category.
With USEEK in hand, the robot can infer the category-level task-relevant object frames in an efficient and explainable manner.
arXiv Detail & Related papers (2022-09-28T06:42:29Z)
- Object-Compositional Neural Implicit Surfaces [45.274466719163925]
The neural implicit representation has shown its effectiveness in novel view synthesis and high-quality 3D reconstruction from multi-view images.
This paper proposes a novel framework, ObjectSDF, to build an object-compositional neural implicit representation with high fidelity in 3D reconstruction and object representation.
arXiv Detail & Related papers (2022-07-20T06:38:04Z)
- Point2Seq: Detecting 3D Objects as Sequences [58.63662049729309]
We present a simple and effective framework, named Point2Seq, for 3D object detection from point clouds.
We view each 3D object as a sequence of words and reformulate the 3D object detection task as decoding words from 3D scenes in an auto-regressive manner.
arXiv Detail & Related papers (2022-03-25T00:20:31Z)
- Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications [57.87136703404356]
Dense Object Nets (DONs) by Florence, Manuelli and Tedrake introduced dense object descriptors as a novel visual object representation for the robotics community.
In this paper we show that given a 3D model of an object, we can generate its descriptor space image, which allows for supervised training of DONs.
We compare the training methods on generating 6D grasps for industrial objects and show that our novel supervised training approach improves the pick-and-place performance in industry-relevant tasks.
arXiv Detail & Related papers (2021-02-16T11:40:12Z)
- Rapid Pose Label Generation through Sparse Representation of Unknown Objects [7.32172860877574]
This work presents an approach for rapidly generating real-world, pose-annotated RGB-D data for unknown objects.
We first source minimalistic labelings of an ordered set of arbitrarily chosen keypoints over a set of RGB-D videos.
By solving an optimization problem, we combine these labels under a world frame to recover a sparse, keypoint-based representation of the object.
arXiv Detail & Related papers (2020-11-07T15:14:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.