PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation
with Photometrically Challenging Objects
- URL: http://arxiv.org/abs/2205.08811v1
- Date: Wed, 18 May 2022 09:21:09 GMT
- Authors: Pengyuan Wang, HyunJun Jung, Yitong Li, Siyuan Shen, Rahul
Parthasarathy Srikanth, Lorenzo Garattoni, Sven Meier, Nassir Navab, Benjamin
Busam
- Abstract summary: We introduce a multimodal dataset for category-level object pose estimation with photometrically challenging objects termed PhoCaL.
PhoCaL comprises 60 high quality 3D models of household objects over 8 categories including highly reflective, transparent and symmetric objects.
The robot-supported acquisition and annotation process ensures sub-millimeter pose accuracy for opaque textured, shiny and transparent objects, with no motion blur and perfect camera synchronisation.
- Score: 45.31344700263873
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Object pose estimation is crucial for robotic applications and augmented
reality. Beyond instance level 6D object pose estimation methods, estimating
category-level pose and shape has become a promising trend. Such a new
research field needs to be supported by well-designed datasets. To provide a
benchmark with high-quality ground truth annotations to the community, we
introduce a multimodal dataset for category-level object pose estimation with
photometrically challenging objects termed PhoCaL. PhoCaL comprises 60 high
quality 3D models of household objects over 8 categories including highly
reflective, transparent and symmetric objects. We developed a novel
robot-supported multi-modal (RGB, depth, polarisation) data acquisition and
annotation process. It ensures sub-millimeter accuracy of the pose for opaque
textured, shiny and transparent objects, no motion blur and perfect camera
synchronisation. To set a benchmark for our dataset, state-of-the-art RGB-D and
monocular RGB methods are evaluated on the challenging scenes of PhoCaL.
Related papers
- Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation [74.44739529186798]
We introduce Omni6D, a comprehensive RGBD dataset featuring a wide range of categories and varied backgrounds.
The dataset comprises an extensive spectrum of 166 categories, 4688 instances adjusted to the canonical pose, and over 0.8 million captures.
We believe this initiative will pave the way for new insights and substantial progress in both the industrial and academic fields.
arXiv Detail & Related papers (2024-09-26T20:13:33Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- MV-ROPE: Multi-view Constraints for Robust Category-level Object Pose and Size Estimation [23.615122326731115]
We propose a novel solution that makes use of RGB video streams.
Our framework consists of three modules: a scale-aware monocular dense SLAM solution, a lightweight object pose predictor, and an object-level pose graph.
Our experimental results demonstrate that when utilizing public dataset sequences with high-quality depth information, the proposed method exhibits comparable performance to state-of-the-art RGB-D methods.
arXiv Detail & Related papers (2023-08-17T08:29:54Z)
- HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object Perception Dataset with Household Objects in Realistic Scenarios [41.54851386729952]
We introduce HouseCat6D, a new category-level 6D pose dataset.
It 1) offers multi-modality with polarimetric RGB and depth (RGBD+P), 2) encompasses 194 diverse objects across 10 household categories, including two photometrically challenging ones, and 3) provides high-quality pose annotations with an error range of only 1.35 mm to 1.74 mm.
arXiv Detail & Related papers (2022-12-20T17:06:32Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
We present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
We also introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation From Monocular RGB Image [12.382992538846896]
We propose a novel approach named Object Level Depth reconstruction Network (OLD-Net) taking only RGB images as input for category-level 6D object pose estimation.
We propose to directly predict object-level depth from a monocular RGB image by deforming the category-level shape prior into object-level depth and the canonical NOCS representation.
Experiments on the challenging CAMERA25 and REAL275 datasets indicate that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-04-04T15:33:28Z)
- StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation [43.839322860501596]
We present a large-scale stereo RGB image object pose estimation dataset named the StereOBJ-1M dataset.
The dataset is designed to address challenging cases such as object transparency, translucency, and specular reflection.
We propose a novel method for efficiently annotating pose data in a multi-view fashion that allows data capturing in complex and flexible environments.
arXiv Detail & Related papers (2021-09-21T11:56:38Z)
- DONet: Learning Category-Level 6D Object Pose and Size Estimation from Depth Observation [53.55300278592281]
We propose a method of Category-level 6D Object Pose and Size Estimation (COPSE) from a single depth image.
Our framework makes inferences based on the rich geometric information of the object in the depth channel alone.
Our framework competes with state-of-the-art approaches that require labeled real-world images.
arXiv Detail & Related papers (2021-06-27T10:41:50Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Pose Estimation of Specular and Symmetrical Objects [0.719973338079758]
In the robotic industry, specular and textureless metallic components are ubiquitous.
The 6D pose estimation of such objects with only a monocular RGB camera is difficult because of the absence of rich texture features.
This paper proposes a data-driven solution to estimate the 6D pose of specular objects for grasping them.
arXiv Detail & Related papers (2020-10-31T22:08:46Z)