HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object
Perception Dataset with Household Objects in Realistic Scenarios
- URL: http://arxiv.org/abs/2212.10428v5
- Date: Fri, 1 Dec 2023 13:23:47 GMT
- Title: HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object
Perception Dataset with Household Objects in Realistic Scenarios
- Authors: HyunJun Jung, Guangyao Zhai, Shun-Cheng Wu, Patrick Ruhkamp, Hannah
Schieber, Giulia Rizzoli, Pengyuan Wang, Hongcheng Zhao, Lorenzo Garattoni,
Sven Meier, Daniel Roth, Nassir Navab, Benjamin Busam
- Abstract summary: We introduce HouseCat6D, a new category-level 6D pose dataset.
It 1) features multi-modality with Polarimetric RGB and Depth (RGBD+P), 2) encompasses 194 diverse objects across 10 household categories, including two photometrically challenging ones, and 3) provides high-quality pose annotations with an error range of only 1.35 mm to 1.74 mm.
- Score: 41.54851386729952
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating 6D object poses is a major challenge in 3D computer vision.
Building on successful instance-level approaches, research is shifting towards
category-level pose estimation for practical applications. Current
category-level datasets, however, fall short in annotation quality and pose
variety. Addressing this, we introduce HouseCat6D, a new category-level 6D pose
dataset. It 1) features multi-modality with Polarimetric RGB and Depth
(RGBD+P), 2) encompasses 194 diverse objects across 10 household categories,
including two photometrically challenging ones, and 3) provides high-quality
pose annotations with an error range of only 1.35 mm to 1.74 mm. The dataset
also includes 4) 41 large-scale scenes with comprehensive viewpoint and
occlusion coverage, 5) a checkerboard-free environment, and 6) dense 6D
parallel-jaw robotic grasp annotations. Additionally, we present benchmark
results for leading category-level pose estimation networks.
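To make the pose annotations concrete: a 6D pose is a rigid transform (a 3x3 rotation plus a 3-vector translation) that maps points from the object frame into the camera frame. The minimal Python sketch below illustrates that representation only; the function names and numbers are illustrative assumptions and do not reflect HouseCat6D's actual annotation files or tooling.

```python
import numpy as np

# Illustrative only: names and values are assumptions, not the HouseCat6D
# file format. A 6D pose label is a rigid transform (R, t).
def make_pose(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Assemble a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def transform_points(T: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Map Nx3 object-frame points into the camera frame with pose T."""
    return points @ T[:3, :3].T + T[:3, 3]

# Example: rotate the object 90 degrees about Z and place it 0.5 m
# in front of the camera.
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
T_obj_to_cam = make_pose(R_z90, np.array([0.0, 0.0, 0.5]))
object_points = np.array([[0.05, 0.0, 0.0], [0.0, 0.05, 0.0]])  # metres
print(transform_points(T_obj_to_cam, object_points))
```

The quoted 1.35 mm to 1.74 mm error range describes how accurately such ground-truth transforms are annotated across the dataset's scenes.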
Related papers
- Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation [74.44739529186798]
We introduce Omni6D, a comprehensive RGBD dataset featuring a wide range of categories and varied backgrounds.
The dataset comprises an extensive spectrum of 166 categories, 4688 instances adjusted to the canonical pose, and over 0.8 million captures.
We believe this initiative will pave the way for new insights and substantial progress in both the industrial and academic fields.
arXiv Detail & Related papers (2024-09-26T20:13:33Z)
- Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking [9.365544189576363]
6D Object Pose Estimation is a crucial yet challenging task in computer vision, suffering from a significant lack of large-scale datasets.
This paper introduces Omni6DPose, a dataset characterized by its diversity in object categories, large scale, and variety in object materials.
We introduce GenPose++, an enhanced version of the SOTA category-level pose estimation framework, incorporating two pivotal improvements.
arXiv Detail & Related papers (2024-06-06T17:57:20Z)
- PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects [45.31344700263873]
We introduce a multimodal dataset for category-level object pose estimation with photometrically challenging objects termed PhoCaL.
PhoCaL comprises 60 high-quality 3D models of household objects over 8 categories, including highly reflective, transparent and symmetric objects.
It ensures sub-millimeter pose accuracy for opaque textured, shiny and transparent objects, with no motion blur and perfect camera synchronisation.
arXiv Detail & Related papers (2022-05-18T09:21:09Z)
- Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object.
Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z)
- GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting [103.74918834553249]
GPV-Pose is a novel framework for robust category-level pose estimation.
It harnesses geometric insights to enhance the learning of category-level pose-sensitive features.
It produces superior results to state-of-the-art competitors on common public benchmarks.
arXiv Detail & Related papers (2022-03-15T13:58:50Z)
- Weakly Supervised Learning of Keypoints for 6D Object Pose Estimation [73.40404343241782]
We propose a weakly supervised 6D object pose estimation approach based on 2D keypoint detection.
Our approach achieves performance comparable to state-of-the-art fully supervised approaches.
arXiv Detail & Related papers (2022-03-07T16:23:47Z)
- StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation [43.839322860501596]
We present a large-scale stereo RGB image object pose estimation dataset named StereOBJ-1M.
The dataset is designed to address challenging cases such as object transparency, translucency, and specular reflection.
We propose a novel method for efficiently annotating pose data in a multi-view fashion that allows data capturing in complex and flexible environments.
arXiv Detail & Related papers (2021-09-21T11:56:38Z)
- Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image [27.234658117816103]
We propose a single-stage, keypoint-based approach for category-level object pose estimation.
The proposed network performs 2D object detection, detects 2D keypoints, estimates 6-DoF pose, and regresses relative bounding cuboid dimensions.
We conduct extensive experiments on the challenging Objectron benchmark, outperforming state-of-the-art methods on the 3D IoU metric.
arXiv Detail & Related papers (2021-09-13T17:55:00Z)
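Several of the keypoint-based entries above (the weakly supervised and single-stage approaches) share a common final step: recovering a 6-DoF pose from 2D-3D keypoint correspondences with a Perspective-n-Point (PnP) solver. The sketch below shows only that generic step, not the method of any cited paper; the keypoint coordinates and camera intrinsics are invented for illustration, and OpenCV's off-the-shelf solver stands in for whatever each paper actually uses.

```python
import numpy as np
import cv2

# Made-up 3D keypoints on an object model (object frame, metres) and their
# hypothetical 2D detections in the image (pixels).
object_keypoints_3d = np.array([
    [0.05, 0.05, 0.0], [-0.05, 0.05, 0.0],
    [-0.05, -0.05, 0.0], [0.05, -0.05, 0.0],
    [0.0, 0.0, 0.08], [0.0, 0.0, -0.08],
], dtype=np.float64)
image_keypoints_2d = np.array([
    [420.0, 260.0], [340.0, 255.0], [338.0, 330.0],
    [424.0, 336.0], [380.0, 230.0], [382.0, 360.0],
], dtype=np.float64)
K = np.array([[600.0, 0.0, 320.0],   # assumed pinhole intrinsics
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])

# Solve for the pose that projects the 3D keypoints onto the 2D detections.
ok, rvec, tvec = cv2.solvePnP(object_keypoints_3d, image_keypoints_2d, K, None)
if ok:
    R, _ = cv2.Rodrigues(rvec)  # axis-angle vector -> 3x3 rotation matrix
    print("R =\n", R, "\nt =", tvec.ravel())
```

In practice the detected keypoints are noisy, so robust variants such as RANSAC-based PnP or a learned refinement stage are typically layered on top.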