Imitrob: Imitation Learning Dataset for Training and Evaluating 6D
Object Pose Estimators
- URL: http://arxiv.org/abs/2209.07976v3
- Date: Wed, 5 Apr 2023 17:30:35 GMT
- Authors: Jiri Sedlar, Karla Stepanova, Radoslav Skoviera, Jan K. Behrens, Matus
Tuna, Gabriela Sejnova, Josef Sivic, Robert Babuska
- Abstract summary: This paper introduces a dataset for training and evaluating methods for 6D pose estimation of hand-held tools in task demonstrations captured by a standard RGB camera.
The dataset contains image sequences of nine different tools and twelve manipulation tasks with two camera viewpoints, four human subjects, and left/right hand.
- Score: 20.611000416051546
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper introduces a dataset for training and evaluating methods for 6D
pose estimation of hand-held tools in task demonstrations captured by a
standard RGB camera. Despite the significant progress of 6D pose estimation
methods, their performance is usually limited for heavily occluded objects,
which is a common case in imitation learning, where the object is typically
partially occluded by the manipulating hand. Currently, there is a lack of
datasets that would enable the development of robust 6D pose estimation methods
for these conditions. To overcome this problem, we collect a new dataset
(Imitrob) aimed at 6D pose estimation in imitation learning and other
applications where a human holds a tool and performs a task. The dataset
contains image sequences of nine different tools and twelve manipulation tasks
with two camera viewpoints, four human subjects, and left/right hand. Each
image is accompanied by an accurate ground truth measurement of the 6D object
pose obtained by the HTC Vive motion tracking device. The use of the dataset is
demonstrated by training and evaluating a recent 6D object pose estimation
method (DOPE) in various setups.
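For readers unfamiliar with the term, a 6D pose combines a 3D rotation with a 3D translation. The sketch below illustrates one common way to compare an estimated pose against a ground-truth measurement such as the one the abstract describes. It is an illustrative assumption, not the evaluation protocol used in the paper: the function names (`rotation_z`, `translation_error`, `rotation_error_deg`) and the example values are hypothetical.

```python
import math


def rotation_z(angle_rad):
    """Return a 3x3 rotation matrix (nested lists) for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def translation_error(t_pred, t_gt):
    """Euclidean distance between predicted and ground-truth positions."""
    return math.dist(t_pred, t_gt)


def rotation_error_deg(R_pred, R_gt):
    """Angle in degrees of the relative rotation R_pred^T R_gt.

    Uses trace(R_pred^T R_gt) = sum_ij R_pred[i][j] * R_gt[i][j]
    and trace(R) = 1 + 2*cos(theta) for a rotation by angle theta.
    """
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


# Hypothetical example: a prediction that is 10 degrees and 5 cm off.
R_gt, t_gt = rotation_z(0.0), (0.0, 0.0, 0.5)
R_pred, t_pred = rotation_z(math.radians(10.0)), (0.03, 0.04, 0.5)

print(f"translation error: {translation_error(t_pred, t_gt):.3f} m")
print(f"rotation error: {rotation_error_deg(R_pred, R_gt):.1f} deg")
```

Reporting translation and rotation error separately keeps the two quantities in their natural units (metres and degrees); other pose metrics combine them into a single score.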
Related papers
- ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics [55.85916671269219]
This paper introduces ManiPose, a pioneering benchmark designed to advance the study of pose-varying manipulation tasks.
A comprehensive dataset features geometrically consistent and manipulation-oriented 6D pose labels for 2936 real-world scanned rigid objects and 100 articulated objects.
Our benchmark demonstrates notable advancements in pose estimation, pose-aware manipulation, and real-robot skill transfer.
arXiv Detail & Related papers (2024-03-20T07:48:32Z)
- ZS6D: Zero-shot 6D Object Pose Estimation using Vision Transformers [9.899633398596672]
We introduce ZS6D, for zero-shot novel object 6D pose estimation.
Visual descriptors, extracted using pre-trained Vision Transformers (ViT), are used for matching rendered templates.
Experiments are performed on LMO, YCBV, and TLESS datasets.
arXiv Detail & Related papers (2023-09-21T11:53:01Z)
- Rigidity-Aware Detection for 6D Object Pose Estimation [60.88857851869196]
Most recent 6D object pose estimation methods first use object detection to obtain 2D bounding boxes before actually regressing the pose.
We propose a rigidity-aware detection method exploiting the fact that, in 6D pose estimation, the target objects are rigid.
Key to the success of our approach is a visibility map, which we propose to build using a minimum barrier distance between every pixel in the bounding box and the box boundary.
arXiv Detail & Related papers (2023-03-22T09:02:54Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that requires algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation [12.773040823634908]
We propose a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask.
Our model is trained using purely synthetic data rendered from ShapeNet, and, unlike most of the existing methods, it generalizes well on new real-world objects without any fine-tuning.
We show that OVE6D outperforms some contemporary deep learning-based pose estimation methods specifically trained for individual objects or datasets with real-world training data.
arXiv Detail & Related papers (2022-03-02T12:51:33Z)
- VIPose: Real-time Visual-Inertial 6D Object Pose Tracking [3.44942675405441]
We introduce a novel Deep Neural Network (DNN) called VIPose to address the object pose tracking problem in real-time.
The key contribution is the design of a novel DNN architecture which fuses visual and inertial features to predict the objects' relative 6D pose.
The approach achieves accuracy comparable to state-of-the-art techniques, with the additional benefit of running in real time.
arXiv Detail & Related papers (2021-07-27T06:10:23Z)
- Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
arXiv Detail & Related papers (2021-01-05T17:18:52Z)
- 3D Registration for Self-Occluded Objects in Context [66.41922513553367]
We introduce the first deep learning framework capable of effectively handling this scenario.
Our method consists of an instance segmentation module followed by a pose estimation one.
It allows us to perform 3D registration in a one-shot manner, without requiring an expensive iterative procedure.
arXiv Detail & Related papers (2020-11-23T08:05:28Z)
- Single Shot 6D Object Pose Estimation [11.37625512264302]
We introduce a novel single shot approach for 6D object pose estimation of rigid objects based on depth images.
A fully convolutional neural network is employed, where the 3D input data is spatially discretized and pose estimation is considered as a regression task.
With 65 fps on a GPU, our Object Pose Network (OP-Net) is extremely fast, is optimized end-to-end, and estimates the 6D pose of multiple objects in the image simultaneously.
arXiv Detail & Related papers (2020-04-27T11:59:11Z)
- CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)