StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose
Estimation
- URL: http://arxiv.org/abs/2109.10115v2
- Date: Wed, 22 Sep 2021 17:38:33 GMT
- Title: StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose
Estimation
- Authors: Xingyu Liu, Shun Iwase, Kris M. Kitani
- Abstract summary: We present a large-scale stereo RGB image object pose estimation dataset named the $\textbf{StereOBJ-1M}$ dataset.
The dataset is designed to address challenging cases such as object transparency, translucency, and specular reflection.
We propose a novel method for efficiently annotating pose data in a multi-view fashion that allows data capturing in complex and flexible environments.
- Score: 43.839322860501596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a large-scale stereo RGB image object pose estimation dataset
named the $\textbf{StereOBJ-1M}$ dataset. The dataset is designed to address
challenging cases such as object transparency, translucency, and specular
reflection, in addition to the common challenges of occlusion, symmetry, and
variations in illumination and environments. In order to collect data of
sufficient scale for modern deep learning models, we propose a novel method for
efficiently annotating pose data in a multi-view fashion that allows data
capturing in complex and flexible environments. Fully annotated with 6D object
poses, our dataset contains over 396K frames and over 1.5M annotations of 18
objects recorded in 183 scenes constructed in 11 different environments. The 18
objects include 8 symmetric objects, 7 transparent objects, and 8 reflective
objects. We benchmark two state-of-the-art pose estimation frameworks on
StereOBJ-1M as baselines for future work. We also propose a novel object-level
pose optimization method for computing 6D pose from keypoint predictions in
multiple images.
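As a rough illustration of what "computing 6D pose from keypoint predictions in multiple images" can involve, the sketch below triangulates keypoints from a stereo pair and then fits a rigid transform. This is a generic textbook baseline (linear DLT triangulation followed by Kabsch alignment), not the object-level optimization method proposed in the paper. The camera intrinsics, the 0.1 m baseline, the keypoints, and all function names are hypothetical values chosen only for the demonstration.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X (camera-rig frame) with a 3x4 projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(P_list, uv_list):
    """Linear (DLT) triangulation of one 3D point from two or more views."""
    rows = []
    for P, (u, v) in zip(P_list, uv_list):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]                      # null-space direction, homogeneous point
    return X[:3] / X[3]

def kabsch(model_pts, cam_pts):
    """Least-squares rigid transform (R, t) such that cam ~= R @ model + t."""
    mu_m, mu_c = model_pts.mean(axis=0), cam_pts.mean(axis=0)
    H = (model_pts - mu_m).T @ (cam_pts - mu_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_c - R @ mu_m
    return R, t

# Hypothetical stereo rig: shared intrinsics, right camera shifted 0.1 m along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

# Hypothetical object keypoints (object frame) and ground-truth pose.
rng = np.random.default_rng(0)
model_kps = rng.uniform(-0.05, 0.05, size=(8, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.02, -0.01, 0.6])
cam_kps = model_kps @ R_true.T + t_true

# Noise-free 2D "predictions" in both views, then triangulate and align.
tri_kps = np.array([
    triangulate([P_left, P_right], [project(P_left, X), project(P_right, X)])
    for X in cam_kps
])
R_est, t_est = kabsch(model_kps, tri_kps)
```

With noise-free keypoints the recovered pose matches the ground truth up to numerical precision. Real keypoint predictions are noisy, so a practical pipeline would typically refine this initial estimate, for example by minimizing reprojection error over all views.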
Related papers
- Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation [74.44739529186798]
We introduce Omni6D, a comprehensive RGBD dataset featuring a wide range of categories and varied backgrounds.
The dataset comprises an extensive spectrum of 166 categories, 4688 instances adjusted to the canonical pose, and over 0.8 million captures.
We believe this initiative will pave the way for new insights and substantial progress in both the industrial and academic fields.
arXiv Detail & Related papers (2024-09-26T20:13:33Z)
- High-resolution open-vocabulary object 6D pose estimation [30.835921843505123]
Horyon is an open-vocabulary VLM-based architecture that addresses relative pose estimation between two scenes of an unseen object.
We evaluate our model on a benchmark with a large variety of unseen objects across four datasets.
arXiv Detail & Related papers (2024-06-24T07:53:46Z)
- HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object Perception Dataset with Household Objects in Realistic Scenarios [41.54851386729952]
We introduce HouseCat6D, a new category-level 6D pose dataset.
It features 1) multi-modality with Polarimetric RGB and Depth (RGBD+P), 2) encompasses 194 diverse objects across 10 household categories, including two photometrically challenging ones, and 3) provides high-quality pose annotations with an error range of only 1.35 mm to 1.74 mm.
arXiv Detail & Related papers (2022-12-20T17:06:32Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects [45.31344700263873]
We introduce a multimodal dataset for category-level object pose estimation with photometrically challenging objects termed PhoCaL.
PhoCaL comprises 60 high quality 3D models of household objects over 8 categories including highly reflective, transparent and symmetric objects.
It ensures sub-millimeter accuracy of the pose for opaque textured, shiny and transparent objects, no motion blur and perfect camera synchronisation.
arXiv Detail & Related papers (2022-05-18T09:21:09Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- CosyPose: Consistent multi-view multi-object 6D pose estimation [48.097599674329004]
First, we present a single-view single-object 6D pose estimation method, which we use to generate 6D object pose hypotheses.
Second, we develop a robust method for matching individual 6D object pose hypotheses across different input images.
Third, we develop a method for global scene refinement given multiple object hypotheses and their correspondences across views.
arXiv Detail & Related papers (2020-08-19T14:11:56Z)
- SIDOD: A Synthetic Image Dataset for 3D Object Pose Recognition with Distractors [10.546457120988494]
This dataset contains 144k stereo image pairs that synthetically combine 18 camera viewpoints of three photorealistic virtual environments with up to 10 objects.
We describe our approach for domain randomization and provide insight into the decisions that produced the dataset.
arXiv Detail & Related papers (2020-08-12T00:14:19Z)