HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose
Annotations, Affordances, and Reconstructions
- URL: http://arxiv.org/abs/2308.01477v1
- Date: Wed, 2 Aug 2023 23:59:59 GMT
- Title: HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose
Annotations, Affordances, and Reconstructions
- Authors: Andrew Guo, Bowen Wen, Jianhe Yuan, Jonathan Tremblay, Stephen Tyree,
Jeffrey Smith, Stan Birchfield
- Abstract summary: We present the HANDAL dataset for category-level object pose estimation and affordance prediction.
The dataset consists of 308k annotated image frames from 2.2k videos of 212 real-world objects in 17 categories.
We outline the usefulness of our dataset for 6-DoF category-level pose+scale estimation and related tasks.
- Score: 17.9178233068395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present the HANDAL dataset for category-level object pose estimation and
affordance prediction. Unlike previous datasets, ours is focused on
robotics-ready manipulable objects that are of the proper size and shape for
functional grasping by robot manipulators, such as pliers, utensils, and
screwdrivers. Our annotation process is streamlined, requiring only a single
off-the-shelf camera and semi-automated processing, allowing us to produce
high-quality 3D annotations without crowd-sourcing. The dataset consists of
308k annotated image frames from 2.2k videos of 212 real-world objects in 17
categories. We focus on hardware and kitchen tool objects to facilitate
research in practical scenarios in which a robot manipulator needs to interact
with the environment beyond simple pushing or indiscriminate grasping. We
outline the usefulness of our dataset for 6-DoF category-level pose+scale
estimation and related tasks. We also provide 3D reconstructed meshes of all
objects, and we outline some of the bottlenecks to be addressed for
democratizing the collection of datasets like this one.
Related papers
- ICGNet: A Unified Approach for Instance-Centric Grasping [42.92991092305974]
We introduce an end-to-end architecture for object-centric grasping.
We show the effectiveness of the proposed method by extensively evaluating it against state-of-the-art methods on synthetic datasets.
arXiv Detail & Related papers (2024-01-18T12:41:41Z) - LocaliseBot: Multi-view 3D object localisation with differentiable
rendering for robot grasping [9.690844449175948]
We focus on object pose estimation.
Our approach relies on three pieces of information: multiple views of the object, the camera's parameters at those viewpoints, and 3D CAD models of objects.
We show that the estimated object pose results in 99.65% grasp accuracy with the ground truth grasp candidates.
arXiv Detail & Related papers (2023-11-14T14:27:53Z) - ROAM: Robust and Object-Aware Motion Generation Using Neural Pose
Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z) - Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast
Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation.
The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects.
Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z) - You Only Look at One: Category-Level Object Representations for Pose
Estimation From a Single Example [26.866356430469757]
We present a method for achieving category-level pose estimation by inspection of just a single object from a desired category.
We demonstrate that our method runs in real-time, enabling a robot manipulator equipped with an RGBD sensor to perform online 6D pose estimation for novel objects.
arXiv Detail & Related papers (2023-05-22T01:32:24Z) - ShapeShift: Superquadric-based Object Pose Estimation for Robotic
Grasping [85.38689479346276]
Current techniques heavily rely on a reference 3D object, limiting their generalizability and making it expensive to expand to new object categories.
This paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.
arXiv Detail & Related papers (2023-04-10T20:55:41Z) - 6-DoF Pose Estimation of Household Objects for Robotic Manipulation: An
Accessible Dataset and Benchmark [17.493403705281008]
We present a new dataset for 6-DoF pose estimation of known objects, with a focus on robotic manipulation research.
We provide 3D scanned textured models of toy grocery objects, as well as RGBD images of the objects in challenging, cluttered scenes.
Using semi-automated RGBD-to-model texture correspondences, the images are annotated with ground truth poses that were verified empirically to be accurate to within a few millimeters.
We also propose a new pose evaluation metric called ADD-H based upon the Hungarian assignment algorithm that is robust to symmetries in object geometry without requiring their explicit enumeration.
arXiv Detail & Related papers (2022-03-11T01:19:04Z) - Semi-automatic 3D Object Keypoint Annotation and Detection for the
Masses [42.34064154798376]
We present a semi-automatic way of collecting and labeling datasets using a wrist mounted camera on a standard robotic arm.
We are able to obtain a working 3D object keypoint detector and go through the whole process of data collection, annotation and learning in just a couple hours of active time.
arXiv Detail & Related papers (2022-01-19T15:41:54Z) - Neural Descriptor Fields: SE(3)-Equivariant Object Representations for
Manipulation [75.83319382105894]
We present Neural Descriptor Fields (NDFs), an object representation that encodes both points and relative poses between an object and a target.
NDFs are trained in a self-supervised fashion via a 3D auto-encoding task that does not rely on expert-labeled keypoints.
Our performance generalizes across both object instances and 6-DoF object poses, and significantly outperforms a recent baseline that relies on 2D descriptors.
arXiv Detail & Related papers (2021-12-09T18:57:15Z) - REGRAD: A Large-Scale Relational Grasp Dataset for Safe and
Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset named regrad to sustain the modeling of relationships among objects and grasps.
Our dataset is collected in both forms of 2D images and 3D point clouds.
Users are free to import their own object models for the generation of as many data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z) - Where2Act: From Pixels to Actions for Articulated 3D Objects [54.19638599501286]
We extract highly localized actionable information related to elementary actions such as pushing or pulling for articulated objects with movable parts.
We propose a learning-from-interaction framework with an online data sampling strategy that allows us to train the network in simulation.
Our learned models even transfer to real-world data.
arXiv Detail & Related papers (2021-01-07T18:56:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.