HRPose: Real-Time High-Resolution 6D Pose Estimation Network Using
Knowledge Distillation
- URL: http://arxiv.org/abs/2204.09429v1
- Date: Wed, 20 Apr 2022 12:43:39 GMT
- Title: HRPose: Real-Time High-Resolution 6D Pose Estimation Network Using
Knowledge Distillation
- Authors: Qi Guan, Zihao Sheng, and Shibei Xue
- Abstract summary: We propose an effective and lightweight model, namely High-Resolution 6D Pose Estimation Network (HRPose).
With only 33% of the model size and lower computational costs, our HRPose achieves performance comparable to state-of-the-art models.
Numerical experiments on the widely-used benchmark LINEMOD demonstrate the superiority of our proposed HRPose against state-of-the-art methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time 6D object pose estimation is essential for many real-world
applications, such as robotic grasping and augmented reality. To achieve an
accurate object pose estimation from RGB images in real-time, we propose an
effective and lightweight model, namely High-Resolution 6D Pose Estimation
Network (HRPose). We adopt the efficient and small HRNetV2-W18 as a feature
extractor to reduce computational burdens while generating accurate 6D poses.
With only 33% of the model size and lower computational costs, our HRPose
achieves performance comparable to state-of-the-art models.
Moreover, by transferring knowledge from a large model to our proposed HRPose
through output and feature-similarity distillations, the performance of our
HRPose is improved in effectiveness and efficiency. Numerical experiments on
the widely-used benchmark LINEMOD demonstrate the superiority of our proposed
HRPose against state-of-the-art methods.
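The abstract names two distillation signals, output distillation and feature-similarity distillation, but does not give their formulas. The following is a minimal sketch of how such a combined loss could look, not the authors' implementation. It assumes a mean-squared-error output term and a similarity-preserving feature term that compares batch-wise similarity matrices, so student and teacher features may have different widths. All function names and the weights alpha and beta are hypothetical.

```python
import math


def output_distillation_loss(student_out, teacher_out):
    """Mean squared error between student and teacher predictions (assumed form)."""
    n = len(student_out)
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / n


def similarity_matrix(features):
    """Row-normalized pairwise similarity between samples in a batch.

    features: list of flattened per-sample feature vectors.
    """
    gram = [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]
    normalized = []
    for row in gram:
        # Guard against an all-zero row to avoid division by zero.
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        normalized.append([v / norm for v in row])
    return normalized


def feature_similarity_loss(student_feats, teacher_feats):
    """Mean squared difference between the two batch similarity matrices."""
    g_s = similarity_matrix(student_feats)
    g_t = similarity_matrix(teacher_feats)
    b = len(g_s)
    return sum((g_s[i][j] - g_t[i][j]) ** 2
               for i in range(b) for j in range(b)) / (b * b)


def distillation_loss(task_loss, student_out, teacher_out,
                      student_feats, teacher_feats, alpha=1.0, beta=1.0):
    """Task loss plus weighted output and feature-similarity terms (hypothetical weights)."""
    return (task_loss
            + alpha * output_distillation_loss(student_out, teacher_out)
            + beta * feature_similarity_loss(student_feats, teacher_feats))
```

Because the feature term compares batch-by-batch similarity matrices rather than raw activations, a small student backbone can be matched against a wider teacher without an extra projection layer. Whether HRPose uses this exact formulation is not stated in the abstract.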
Related papers
- FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation [0.0]
6D object pose estimation involves determining the three-dimensional translation and rotation of an object within a scene.
Current models, both classical and deep-learning-based, often struggle with the trade-off between accuracy and latency.
We employ several techniques to reduce the model size and improve inference time.
arXiv Detail & Related papers (2024-09-18T12:30:02Z)
- Advancing 6D Pose Estimation in Augmented Reality -- Overcoming Projection Ambiguity with Uncontrolled Imagery [0.0]
This study addresses the challenge of accurate 6D pose estimation in Augmented Reality (AR).
We propose a novel approach that strategically decomposes the estimation of z-axis translation and focal length.
This methodology not only streamlines the 6D pose estimation process but also significantly enhances the accuracy of 3D object overlaying in AR settings.
arXiv Detail & Related papers (2024-03-20T09:22:22Z)
- FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose a design paradigm, named FasterPose, for cost-effective networks with low-resolution (LR) representation for efficient pose estimation.
We study the training behavior of FasterPose, and formulate a novel regressive cross-entropy (RCE) loss function for accelerating the convergence.
Compared with the previously dominant network for pose estimation, our method reduces FLOPs by 58% while improving accuracy by 1.3%.
arXiv Detail & Related papers (2021-07-07T13:39:08Z)
- EfficientPose: Efficient Human Pose Estimation with Neural Architecture Search [47.30243595690131]
We propose an efficient framework targeted at human pose estimation including two parts, the efficient backbone and the efficient head.
Our smallest model has only 0.65 GFLOPs with 88.1% PCKh@0.5 on MPII and our large model has only 2 GFLOPs while its accuracy is competitive with the state-of-the-art large model.
arXiv Detail & Related papers (2020-12-13T15:38:38Z)
- EfficientHRNet: Efficient Scaling for Lightweight High-Resolution Multi-Person Pose Estimation [2.924868086534434]
We present EfficientHRNet, a family of lightweight multi-person human pose estimators that are able to perform in real-time on resource-constrained devices.
The largest model is able to come within 4.4% accuracy of the current state-of-the-art, while having 1/3 the model size and 1/6 the power.
Compared to the top real-time approach, EfficientHRNet increases accuracy by 22% while achieving similar FPS with 1/3 the power.
arXiv Detail & Related papers (2020-07-16T03:27:26Z)
- Towards Practical Lipreading with Distilled and Efficient Models [57.41253104365274]
Lipreading has witnessed a lot of progress due to the resurgence of neural networks.
Recent works have placed emphasis on aspects such as improving performance by finding the optimal architecture or improving generalization.
There is still a significant gap between the current methodologies and the requirements for an effective deployment of lipreading in practical scenarios.
We propose a series of innovations that significantly bridge that gap: first, we raise the state-of-the-art performance by a wide margin on LRW and LRW-1000 to 88.5% and 46.6%, respectively, using self-distillation.
arXiv Detail & Related papers (2020-07-13T16:56:27Z)
- PaMIR: Parametric Model-Conditioned Implicit Representation for Image-based Human Reconstruction [67.08350202974434]
We propose Parametric Model-Conditioned Implicit Representation (PaMIR), which combines the parametric body model with the free-form deep implicit function.
We show that our method achieves state-of-the-art performance for image-based 3D human reconstruction in the cases of challenging poses and clothing types.
arXiv Detail & Related papers (2020-07-08T02:26:19Z)
- Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation [97.93687743378106]
Existing 3D pose estimation models suffer a performance drop when applied to new scenarios with unseen poses.
We propose a novel framework, Inference Stage Optimization (ISO), for improving the generalizability of 3D pose models.
Remarkably, it yields a new state of the art of 83.6% 3D PCK on MPI-INF-3DHP, improving upon the previous best result by 9.7%.
arXiv Detail & Related papers (2020-07-04T09:45:18Z)
- EfficientPose: Scalable single-person pose estimation [3.325625311163864]
We propose a novel convolutional neural network architecture, called EfficientPose, for single-person pose estimation.
Our top-performing model achieves state-of-the-art accuracy on single-person MPII, with low-complexity ConvNets.
Due to its low complexity and efficiency, EfficientPose enables real-world applications on edge devices by limiting the memory footprint and computational cost.
arXiv Detail & Related papers (2020-04-25T16:50:46Z)
- Self6D: Self-Supervised Monocular 6D Object Pose Estimation [114.18496727590481]
We propose the idea of monocular 6D pose estimation by means of self-supervised learning.
We leverage recent advances in neural rendering to further self-supervise the model on unannotated real RGB-D data.
arXiv Detail & Related papers (2020-04-14T13:16:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.