Mesh Represented Recycle Learning for 3D Hand Pose and Mesh Estimation
- URL: http://arxiv.org/abs/2310.12189v1
- Date: Wed, 18 Oct 2023 09:50:09 GMT
- Title: Mesh Represented Recycle Learning for 3D Hand Pose and Mesh Estimation
- Authors: Bosang Kim, Jonghyun Kim, Hyotae Lee, Lanying Jin, Jeongwon Ha, Dowoo
Kwon, Jungpyo Kim, Wonhyeok Im, KyungMin Jin, and Jungho Lee
- Abstract summary: We propose a mesh represented recycle learning strategy for 3D hand pose and mesh estimation.
To be specific, a hand pose and mesh estimation model first predicts parametric 3D hand annotations.
Second, synthetic hand images are generated with self-estimated hand mesh representations.
Third, the synthetic hand images are fed into the same model again.
- Score: 3.126179109712709
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In general, hand pose estimation aims to achieve robust model performance in real-world scenes. However, enhancing robustness is difficult because existing datasets are captured in restricted environments in order to annotate 3D information. Although neural networks achieve high quantitative estimation accuracy, their results can be unsatisfactory in visual quality. This discrepancy between quantitative results and visual quality remains an open issue in hand pose representation. To this end, we propose a mesh represented recycle learning strategy for 3D hand pose and mesh estimation, which reinforces the synthesized hand mesh representation during the training phase. To be specific, a hand pose and mesh estimation model first predicts parametric 3D hand annotations (i.e., 3D keypoint positions and vertices for the hand mesh) from real-world hand images. Second, synthetic hand images are generated from the self-estimated hand mesh representations. Third, the synthetic hand images are fed into the same model again. Thus, the proposed learning strategy simultaneously improves quantitative results and visual quality by reinforcing the synthetic mesh representation. To encourage consistency between the original model output and its recycled counterpart, we propose a self-correlation loss, which maximizes the accuracy and reliability of our learning strategy. Consequently, the model effectively performs self-refinement of hand pose estimation by learning mesh representation from its own output. To demonstrate the effectiveness of our learning strategy, we provide extensive experiments on the FreiHAND dataset. Notably, our learning strategy improves hand pose and mesh estimation performance without any extra computational burden during inference.
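The three-step recycle strategy described in the abstract can be sketched as a training loop. The snippet below is a minimal toy sketch, not the paper's implementation: a linear map stands in for the deep estimation model, a second linear map stands in for the mesh-to-image synthesizer, and the self-correlation term is modeled as a mean-squared consistency penalty between the original and recycled predictions. All names, dimensions, and the loss weight `lam` are illustrative assumptions.

```python
import numpy as np

# Toy sketch of the recycle learning loop. Assumptions: a linear "model"
# replaces the deep hand pose/mesh estimator, and a linear "synthesizer"
# replaces the mesh-based image renderer used in the paper.
rng = np.random.default_rng(0)

IMG_DIM, MESH_DIM = 8, 6
W_true = rng.normal(scale=0.1, size=(IMG_DIM, MESH_DIM))  # unknown target mapping
W = rng.normal(scale=0.1, size=(IMG_DIM, MESH_DIM))       # model parameters
A = rng.normal(scale=0.1, size=(IMG_DIM, MESH_DIM))       # hypothetical renderer weights

def model(x, W):
    """Predict parametric 3D hand annotations (mesh params) from image features."""
    return x @ W

def synthesize(mesh, A):
    """Generate a synthetic 'image' from a self-estimated mesh representation."""
    return mesh @ A.T

def recycle_step(x_real, y_gt, W, lam=0.5, lr=0.1):
    y_real = model(x_real, W)       # step 1: predict from real hand images
    x_syn = synthesize(y_real, A)   # step 2: synthesize hand images from the mesh
    y_syn = model(x_syn, W)         # step 3: recycle synthetic images through the same model
    sup = np.mean((y_real - y_gt) ** 2)         # supervised loss on real images
    self_corr = np.mean((y_syn - y_real) ** 2)  # self-correlation (consistency) loss
    # Closed-form gradient of the supervised term only, for this linear toy;
    # the actual method backpropagates through both loss terms.
    grad = 2.0 * x_real.T @ (y_real - y_gt) / len(x_real)
    return W - lr * grad, sup + lam * self_corr

x_real = rng.normal(size=(16, IMG_DIM))
y_gt = x_real @ W_true  # synthetic ground-truth annotations for the toy problem
losses = []
for _ in range(50):
    W, loss = recycle_step(x_real, y_gt, W)
    losses.append(loss)
```

Because the recycled pass reuses the same model and only adds a loss term, the extra cost is confined to training, which mirrors the paper's claim of no additional inference burden.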
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z) - Towards Robust 3D Pose Transfer with Adversarial Learning [36.351835328908116]
3D pose transfer that aims to transfer the desired pose to a target mesh is one of the most challenging 3D generation tasks.
Previous attempts rely on well-defined parametric human models or skeletal joints as driving pose sources.
We propose 3D-PoseMAE, a customized MAE that effectively learns 3D extrinsic representations (i.e., pose).
arXiv Detail & Related papers (2024-04-02T19:03:39Z) - Denoising Diffusion for 3D Hand Pose Estimation from Images [38.20064386142944]
This paper addresses the problem of 3D hand pose estimation from monocular images or sequences.
We present a novel end-to-end framework for 3D hand regression that employs diffusion models that have shown excellent ability to capture the distribution of data for generative purposes.
The proposed model provides state-of-the-art performance when lifting a 2D single-hand image to 3D.
arXiv Detail & Related papers (2023-08-18T12:57:22Z) - HandMIM: Pose-Aware Self-Supervised Learning for 3D Hand Mesh Estimation [5.888156950854715]
We propose a novel self-supervised pre-training strategy for regressing 3D hand mesh parameters.
Our proposed approach, named HandMIM, achieves strong performance on various hand mesh estimation tasks.
arXiv Detail & Related papers (2023-07-29T19:46:06Z) - TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data.
We learn to predict realistic texture of objects from real image collections.
We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z) - 3D Hand Pose and Shape Estimation from RGB Images for Improved Keypoint-Based Hand-Gesture Recognition [25.379923604213626]
This paper presents a keypoint-based end-to-end framework for 3D hand pose and shape estimation.
It is successfully applied to the hand-gesture recognition task as a study case.
arXiv Detail & Related papers (2021-09-28T17:07:43Z) - Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction [57.3636347704271]
3D hand-mesh reconstruction from RGB images facilitates many applications, including augmented reality (AR)
This paper presents a novel pipeline by decoupling the hand-mesh reconstruction task into three stages.
We can promote high-quality finger-level mesh-image alignment and drive the models together to deliver real-time predictions.
arXiv Detail & Related papers (2021-09-03T20:42:01Z) - Self-Supervised 3D Hand Pose Estimation from monocular RGB via Contrastive Learning [50.007445752513625]
We propose a new self-supervised method for the structured regression task of 3D hand pose estimation.
We experimentally investigate the impact of invariant and equivariant contrastive objectives.
We show that a standard ResNet-152, trained on additional unlabeled data, attains an improvement of 7.6% in PA-EPE on FreiHAND.
arXiv Detail & Related papers (2021-06-10T17:48:57Z) - SeqHAND: RGB-Sequence-Based 3D Hand Pose and Shape Estimation [48.456638103309544]
3D hand pose estimation based on RGB images has been studied for a long time.
We propose a novel method that generates a synthetic dataset that mimics natural human hand movements.
We show that utilizing temporal information for 3D hand pose estimation significantly enhances general pose estimations.
arXiv Detail & Related papers (2020-07-10T05:11:14Z) - Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.