FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation
- URL: http://arxiv.org/abs/2409.12720v1
- Date: Wed, 18 Sep 2024 12:30:02 GMT
- Title: FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation
- Authors: Thomas Pöllabauer, Ashwin Pramod, Volker Knauthe, Michael Wahl
- Abstract summary: 6D object pose estimation involves determining the three-dimensional translation and rotation of an object within a scene.
Current models, both classical and deep-learning-based, often struggle with the trade-off between accuracy and latency.
We employ several techniques to reduce the model size and improve inference time.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 6D object pose estimation involves determining the three-dimensional translation and rotation of an object within a scene and relative to a chosen coordinate system. This problem is of particular interest for many practical applications in industrial tasks such as quality control, bin picking, and robotic manipulation, where both speed and accuracy are critical for real-world deployment. Current models, both classical and deep-learning-based, often struggle with the trade-off between accuracy and latency. Our research focuses on enhancing the speed of a prominent state-of-the-art deep learning model, GDRNPP, while keeping its high accuracy. We employ several techniques to reduce the model size and improve inference time. These techniques include using smaller and quicker backbones, pruning unnecessary parameters, and distillation to transfer knowledge from a large, high-performing model to a smaller, more efficient student model. Our findings demonstrate that the proposed configuration maintains accuracy comparable to the state-of-the-art while significantly improving inference time. This advancement could lead to more efficient and practical applications in various industrial scenarios, thereby enhancing the overall applicability of 6D Object Pose Estimation models in real-world settings.
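The abstract names pruning and distillation but gives no formulas. As a rough illustration only, the sketch below shows two textbook versions of these ideas: magnitude pruning (zeroing the smallest-magnitude weights) and a temperature-softened KL distillation loss between teacher and student outputs. The function names, the default temperature, and the choice of these particular variants are assumptions made for this example; they are not taken from the paper and may differ from what the authors actually use for GDRNPP.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_norm for z in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Classic soft-target formulation; assumed here for illustration,
    not confirmed as the loss used in the paper.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    `sparsity` is the fraction of entries to remove, between 0 and 1.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.5, 0.7, -0.8]
    print("distillation loss:", distillation_loss(student, teacher))
    print("pruned weights:", magnitude_prune([0.5, -0.1, 2.0, 0.03], 0.5))
```

In practice the distillation term is usually added to the ordinary task loss with a weighting factor, and pruning is followed by fine-tuning to recover accuracy; both steps are omitted here to keep the sketch short.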
Related papers
- EfficientPose 6D: Scalable and Efficient 6D Object Pose Estimation [4.595205112368888]
This study focuses on developing a fast and scalable set of pose estimators based on GDRNPP to meet or exceed current benchmarks in accuracy and robustness.
We propose the AMIS algorithm to tailor the utilized model according to an application-specific trade-off between inference time and accuracy.
arXiv Detail & Related papers (2025-02-19T19:21:23Z)
- Advanced Object Detection and Pose Estimation with Hybrid Task Cascade and High-Resolution Networks [7.403403805442553]
This study proposes an improved 6D object detection and pose estimation pipeline based on the existing 6D-VNet framework.
By leveraging the strengths of HTC's multi-stage refinement process and HRNet's ability to maintain high-resolution representations, our approach significantly improves detection accuracy and pose estimation precision.
arXiv Detail & Related papers (2025-02-06T08:48:34Z)
- 6DOPE-GS: Online 6D Object Pose Estimation using Gaussian Splatting [7.7145084897748974]
We present 6DOPE-GS, a novel method for online 6D object pose estimation & tracking with a single RGB-D camera.
We show that 6DOPE-GS matches the performance of state-of-the-art baselines for model-free simultaneous 6D pose tracking and reconstruction.
We also demonstrate the method's suitability for live, dynamic object tracking and reconstruction in a real-world setting.
arXiv Detail & Related papers (2024-12-02T14:32:19Z)
- Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding [55.32861154245772]
Calib3D is a pioneering effort to benchmark and scrutinize the reliability of 3D scene understanding models.
We comprehensively evaluate 28 state-of-the-art models across 10 diverse 3D datasets.
We introduce DeptS, a novel depth-aware scaling approach aimed at enhancing 3D model calibration.
arXiv Detail & Related papers (2024-03-25T17:59:59Z)
- Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation [66.3814684757376]
This work presents Zero123-6D, the first work to demonstrate the utility of Diffusion Model-based novel-view-synthesizers in enhancing RGB 6D pose estimation at category-level.
The outlined method reduces data requirements, removes the need for depth information in the zero-shot category-level 6D pose estimation task, and improves performance, as demonstrated quantitatively through experiments on the CO3D dataset.
arXiv Detail & Related papers (2024-03-21T10:38:18Z)
- Advancing 6D Pose Estimation in Augmented Reality -- Overcoming Projection Ambiguity with Uncontrolled Imagery [0.0]
This study addresses the challenge of accurate 6D pose estimation in Augmented Reality (AR).
We propose a novel approach that strategically decomposes the estimation of z-axis translation and focal length.
This methodology not only streamlines the 6D pose estimation process but also significantly enhances the accuracy of 3D object overlaying in AR settings.
arXiv Detail & Related papers (2024-03-20T09:22:22Z)
- EasyHeC: Accurate and Automatic Hand-eye Calibration via Differentiable Rendering and Space Exploration [49.90228618894857]
We introduce a new approach to hand-eye calibration called EasyHeC, which is markerless, white-box, and delivers superior accuracy and robustness.
We propose to use two key technologies: differentiable rendering-based camera pose optimization and consistency-based joint space exploration.
Our evaluation demonstrates superior performance in synthetic and real-world datasets.
arXiv Detail & Related papers (2023-05-02T03:49:54Z)
- DeepRM: Deep Recurrent Matching for 6D Pose Refinement [77.34726150561087]
DeepRM is a novel recurrent network architecture for 6D pose refinement.
The architecture incorporates LSTM units to propagate information through each refinement step.
DeepRM achieves state-of-the-art performance on two widely accepted challenging datasets.
arXiv Detail & Related papers (2022-05-28T16:18:08Z)
- HRPose: Real-Time High-Resolution 6D Pose Estimation Network Using Knowledge Distillation [0.0]
We propose an effective and lightweight model, namely High-Resolution 6D Pose Estimation Network (HRPose).
With only 33% of the model size and lower computational costs, our HRPose achieves performance comparable to state-of-the-art models.
Numerical experiments on the widely-used benchmark LINEMOD demonstrate the superiority of our proposed HRPose against state-of-the-art methods.
arXiv Detail & Related papers (2022-04-20T12:43:39Z)
- Knowledge distillation: A good teacher is patient and consistent [71.14922743774864]
There is a growing discrepancy in computer vision between large-scale models that achieve state-of-the-art performance and models that are affordable in practical applications.
We identify certain implicit design choices, which may drastically affect the effectiveness of distillation.
We obtain a state-of-the-art ResNet-50 model for ImageNet, which achieves 82.8% top-1 accuracy.
arXiv Detail & Related papers (2021-06-09T17:20:40Z)
- CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)