Related papers: CHIP: A multi-sensor dataset for 6D pose estimation of chairs in industrial settings

URL: http://arxiv.org/abs/2506.09699v1
Date: Wed, 11 Jun 2025 13:13:31 GMT
Title: CHIP: A multi-sensor dataset for 6D pose estimation of chairs in industrial settings
Authors: Mattia Nardon, Mikel Mujika Agirre, Ander González Tomé, Daniel Sedano Algarabel, Josep Rueda Collell, Ana Paola Caro, Andrea Caraffa, Fabio Poiesi, Paul Ian Chippendale, Davide Boscaini
Abstract summary: CHIP is the first dataset designed for 6D pose estimation of chairs in a real-world industrial environment. CHIP comprises 77,811 RGBD images annotated with ground-truth 6D poses automatically derived from the robot's kinematics. Results show substantial room for improvement, highlighting the unique challenges posed by the dataset.
Score: 4.310149395049504
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Accurate 6D pose estimation of complex objects in 3D environments is essential for effective robotic manipulation. Yet, existing benchmarks fall short in evaluating 6D pose estimation methods under realistic industrial conditions, as most datasets focus on household objects in domestic settings, while the few available industrial datasets are limited to artificial setups with objects placed on tables. To bridge this gap, we introduce CHIP, the first dataset designed for 6D pose estimation of chairs manipulated by a robotic arm in a real-world industrial environment. CHIP includes seven distinct chairs captured using three different RGBD sensing technologies and presents unique challenges, such as distractor objects with fine-grained differences and severe occlusions caused by the robotic arm and human operators. CHIP comprises 77,811 RGBD images annotated with ground-truth 6D poses automatically derived from the robot's kinematics, averaging 11,115 annotations per chair. We benchmark CHIP using three zero-shot 6D pose estimation methods, assessing performance across different sensor types, localization priors, and occlusion levels. Results show substantial room for improvement, highlighting the unique challenges posed by the dataset. CHIP will be publicly released.
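The abstract states that ground-truth poses are derived from the robot's kinematics but does not give the procedure. As a rough illustration only, and not the authors' pipeline, the sketch below assumes three rigid transforms are available: a calibrated camera pose in the robot base frame, an end-effector pose from forward kinematics, and a fixed grasp offset between end effector and chair. All function names, frames, and numbers here are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 homogeneous transforms given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t):
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def rot_z(theta, tx=0.0, ty=0.0, tz=0.0):
    """Transform with a rotation of theta about z and translation (tx, ty, tz)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def object_pose_in_camera(t_base_cam, t_base_ee, t_ee_obj):
    """Chain the transforms: T_cam_obj = inv(T_base_cam) * T_base_ee * T_ee_obj."""
    return matmul(invert_rigid(t_base_cam), matmul(t_base_ee, t_ee_obj))


# Hypothetical values: camera 2 m in front of the base and 1 m up,
# end effector rotated 90 degrees about z, chair held 0.5 m along the
# end effector's x axis.
t_base_cam = rot_z(0.0, 2.0, 0.0, 1.0)
t_base_ee = rot_z(math.pi / 2, 1.0, 0.0, 1.0)
t_ee_obj = rot_z(0.0, 0.5, 0.0, 0.0)

pose = object_pose_in_camera(t_base_cam, t_base_ee, t_ee_obj)
translation = [round(pose[i][3], 6) for i in range(3)]
print(translation)
```

In this toy setup the chair ends up about 1 m behind the camera origin along x and 0.5 m along y, with a 90 degree rotation about z. A real annotation pipeline would also depend on hand-eye calibration quality and time synchronization between the robot and each sensor, which this sketch ignores.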

Related papers

SDT-6D: Fully Sparse Depth-Transformer for Staged End-to-End 6D Pose Estimation in Industrial Multi-View Bin Picking [0.0]
We introduce a holistic depth-only 6D pose estimation approach that fuses multi-view depth maps into a fine-grained 3D point cloud. Our framework operates fully sparse, enabling high-resolution representations to capture fine geometric details crucial for accurate pose estimation in clutter. We validate our method on the recently published IPD and MV-YCB multi-view datasets, demonstrating competitive performance in heavily cluttered industrial and household bin picking scenarios.
arXiv Detail & Related papers (2025-12-09T09:58:35Z)
IMD: A 6-DoF Pose Estimation Benchmark for Industrial Metallic Objects [4.959150853096882]
We propose a novel dataset and benchmark, namely the Industrial Metallic Dataset (IMD), tailored for industrial applications. Our dataset comprises 45 true-to-scale industrial components, captured with an RGB-D camera under natural indoor lighting. The benchmark supports three tasks: video object segmentation, 6D pose tracking, and one-shot 6D pose estimation.
arXiv Detail & Related papers (2025-09-15T08:28:15Z)
MR6D: Benchmarking 6D Pose Estimation for Mobile Robots [0.118749525824656]
Existing 6D pose estimation datasets primarily focus on small household objects typically handled by robot arm manipulators. We introduce MR6D, a dataset designed for 6D pose estimation for mobile robots in industrial environments.
arXiv Detail & Related papers (2025-08-19T12:21:34Z)
XYZ-IBD: High-precision Bin-picking Dataset for Object 6D Pose Estimation Capturing Real-world Industrial Complexity [46.05421425745179]
XYZ-IBD is a bin-picking dataset for 6D pose estimation. It reflects authentic robotic manipulation scenarios with millimeter-accurate annotations. The dataset features 15 texture-less, metallic, and mostly symmetrical objects of varying shapes and sizes.
arXiv Detail & Related papers (2025-05-31T15:15:27Z)
Any6D: Model-free 6D Pose Estimation of Novel Objects [76.30057578269668]
We introduce Any6D, a model-free framework for 6D object pose estimation. It requires only a single RGB-D anchor image to estimate both the 6D pose and size of unknown objects in novel scenes. We evaluate our method on five challenging datasets.
arXiv Detail & Related papers (2025-03-24T13:46:21Z)
Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation [66.3814684757376]
This work presents Zero123-6D, the first work to demonstrate the utility of diffusion-model-based novel-view synthesizers in enhancing category-level RGB 6D pose estimation. The outlined method reduces data requirements, removes the need for depth information in the zero-shot category-level 6D pose estimation task, and increases performance, as demonstrated quantitatively through experiments on the CO3D dataset.
arXiv Detail & Related papers (2024-03-21T10:38:18Z)
Advancing 6D Pose Estimation in Augmented Reality -- Overcoming Projection Ambiguity with Uncontrolled Imagery [0.0]
This study addresses the challenge of accurate 6D pose estimation in Augmented Reality (AR). We propose a novel approach that strategically decomposes the estimation of z-axis translation and focal length. This methodology not only streamlines the 6D pose estimation process but also significantly enhances the accuracy of 3D object overlaying in AR settings.
arXiv Detail & Related papers (2024-03-20T09:22:22Z)
ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics [55.85916671269219]
This paper introduces ManiPose, a pioneering benchmark designed to advance the study of pose-varying manipulation tasks. A comprehensive dataset features geometrically consistent and manipulation-oriented 6D pose labels for 2936 real-world scanned rigid objects and 100 articulated objects. Our benchmark demonstrates notable advancements in pose estimation, pose-aware manipulation, and real-robot skill transfer.
arXiv Detail & Related papers (2024-03-20T07:48:32Z)
Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects. We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
Weakly Supervised Multi-Modal 3D Human Body Pose Estimation for Autonomous Driving [0.5735035463793008]
3D human pose estimation (HPE) is crucial for enabling autonomous vehicles (AVs) to make informed decisions and respond proactively in critical road scenarios. We present a simple yet efficient weakly supervised approach for 3D HPE in the AV context by employing a high-level sensor fusion between camera and LiDAR data. Our approach outperforms state-of-the-art results by up to ~13% on the Open dataset in the weakly supervised setting.
arXiv Detail & Related papers (2023-07-27T14:28:50Z)
Towards Multimodal Multitask Scene Understanding Models for Indoor Mobile Agents [49.904531485843464]
In this paper, we discuss the main challenge: insufficient, or even no, labeled data for real-world indoor environments. We describe MMISM (Multi-modality input Multi-task output Indoor Scene understanding Model) to tackle this challenge. MMISM takes RGB images and sparse LiDAR points as inputs, and produces 3D object detection, depth completion, human pose estimation, and semantic segmentation as output tasks. We show that MMISM performs on par with or even better than single-task models.
arXiv Detail & Related papers (2022-09-27T04:49:19Z)
Learning 6D Pose Estimation from Synthetic RGBD Images for Robotic Applications [0.6299766708197883]
The proposed pipeline can efficiently generate large amounts of photo-realistic RGBD images for the object of interest. We develop a real-time two-stage 6D pose estimation approach by integrating the object detector YOLO-V4-tiny and the 6D pose estimation algorithm PVN3D. The resulting network shows competitive performance compared to state-of-the-art methods when evaluated on the LineMod dataset.
arXiv Detail & Related papers (2022-08-30T14:17:15Z)
6D Robotic Assembly Based on RGB-only Object Pose Estimation [35.74647604582182]
We propose an integrated 6D robotic system to perceive, grasp, manipulate and assemble blocks with tight tolerances. Our system is built upon a monocular 6D object pose estimation network trained solely with synthetic images leveraging physically-based rendering.
arXiv Detail & Related papers (2022-08-27T11:26:24Z)
SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation [98.83762558394345]
SO-Pose is a framework for regressing all 6 degrees of freedom (6DoF) of the object pose in a cluttered environment from a single RGB image. We introduce a novel way of reasoning about self-occlusion in order to establish a two-layer representation for 3D objects. By exploiting cross-layer consistencies that align correspondences, self-occlusion, and 6D pose, we can further improve accuracy and robustness.
arXiv Detail & Related papers (2021-08-18T19:49:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.