InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose
Estimation from a Single RGB Image
- URL: http://arxiv.org/abs/2008.09309v1
- Date: Fri, 21 Aug 2020 05:15:58 GMT
- Title: InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose
Estimation from a Single RGB Image
- Authors: Gyeongsik Moon, Shoou-i Yu, He Wen, Takaaki Shiratori, Kyoung Mu Lee
- Abstract summary: We propose a large-scale dataset, InterHand2.6M, and a network, InterNet, for 3D interacting hand pose estimation from a single RGB image.
In our experiments, we demonstrate substantial gains in 3D interacting hand pose estimation accuracy when leveraging the interacting hand data in InterHand2.6M.
We also report the accuracy of InterNet on InterHand2.6M, which serves as a strong baseline for this new dataset.
- Score: 71.17227941339935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analysis of hand-hand interactions is a crucial step towards better
understanding human behavior. However, most research in 3D hand pose
estimation has focused on the isolated single-hand case. Therefore, we first
propose (1) a large-scale dataset, InterHand2.6M, and (2) a baseline network,
InterNet, for 3D interacting hand pose estimation from a single RGB image. The
proposed InterHand2.6M consists of 2.6M labeled single and interacting
hand frames under various poses from multiple subjects. Our InterNet
simultaneously performs 3D single and interacting hand pose estimation. In our
experiments, we demonstrate substantial gains in 3D interacting hand pose estimation
accuracy when leveraging the interacting hand data in InterHand2.6M. We also
report the accuracy of InterNet on InterHand2.6M, which serves as a strong
baseline for this new dataset. Finally, we show 3D interacting hand pose
estimation results from general images. Our code and dataset are available at
https://mks0601.github.io/InterHand2.6M/.
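The abstract describes a single network that handles both the single-hand and interacting-hand cases. As a rough, non-authoritative sketch of such a multi-task design, the PyTorch snippet below wires a shared image backbone to a handedness head, one pose head per hand, and a relative root-depth head. All names and the direct-regression head layout are illustrative assumptions (the published InterNet uses heatmap-based 2.5D outputs); refer to the project page above for the authors' actual code.

```python
import torch
import torch.nn as nn
import torchvision.models as models

NUM_JOINTS = 21  # joints per hand, the usual hand-skeleton convention

class TwoHandPoseNet(nn.Module):
    """Illustrative shared-backbone multi-task network (hypothetical,
    NOT the published InterNet architecture)."""

    def __init__(self):
        super().__init__()
        resnet = models.resnet50(weights=None)
        # Keep everything up to (and including) global average pooling.
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        feat_dim = 2048
        # Probability that the right / left hand is present in the crop.
        self.handedness = nn.Linear(feat_dim, 2)
        # One 3D pose head per hand: (x, y, z) for each joint.
        self.right_pose = nn.Linear(feat_dim, NUM_JOINTS * 3)
        self.left_pose = nn.Linear(feat_dim, NUM_JOINTS * 3)
        # Depth of the left-hand root relative to the right-hand root,
        # needed to place the two hands in a common 3D frame.
        self.rel_root_depth = nn.Linear(feat_dim, 1)

    def forward(self, img):
        feat = self.backbone(img).flatten(1)
        return {
            "handedness": torch.sigmoid(self.handedness(feat)),
            "right_pose": self.right_pose(feat).view(-1, NUM_JOINTS, 3),
            "left_pose": self.left_pose(feat).view(-1, NUM_JOINTS, 3),
            "rel_root_depth": self.rel_root_depth(feat),
        }

net = TwoHandPoseNet()
out = net(torch.randn(1, 3, 256, 256))  # one RGB hand crop
print(out["right_pose"].shape)  # torch.Size([1, 21, 3])
```

Because the handedness, pose, and root-depth heads share one backbone, single-hand and interacting-hand images go through the same forward pass; at inference, a hand whose handedness score is low is simply ignored.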
Related papers
- A Dataset of Relighted 3D Interacting Hands [37.31717123107306]
Re:InterHand is a dataset of relighted 3D interacting hands.
We employ a state-of-the-art hand relighting network with our accurately tracked two-hand 3D poses.
arXiv Detail & Related papers (2023-10-26T20:26:50Z)
- 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal [85.30756038989057]
Estimating 3D interacting hand pose from a single RGB image is essential for understanding human actions.
We propose to decompose the challenging interacting hand pose estimation task and estimate the pose of each hand separately.
Experiments show that the proposed method significantly outperforms previous state-of-the-art interacting hand pose estimation approaches.
arXiv Detail & Related papers (2022-07-22T13:04:06Z)
- Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements [96.40125818594952]
We make the first attempt to reconstruct 3D interacting hands from monocular RGB images.
Our method can generate 3D hand meshes with both precise 3D poses and minimal collisions (see the illustrative collision-penalty sketch after this list).
arXiv Detail & Related papers (2021-11-01T08:24:10Z)
- Learning to Disambiguate Strongly Interacting Hands via Probabilistic Per-pixel Part Segmentation [84.28064034301445]
Self-similarity, and the resulting ambiguity in assigning pixel observations to the respective hands, is a major cause of the final 3D pose error.
We propose DIGIT, a novel method for estimating the 3D poses of two interacting hands from a single monocular image.
We experimentally show that the proposed approach achieves new state-of-the-art performance on the InterHand2.6M dataset.
arXiv Detail & Related papers (2021-07-01T13:28:02Z)
- RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video [76.86512780916827]
We present the first real-time method for motion capture of skeletal pose and 3D surface geometry of hands from a single RGB camera.
To address the inherent depth ambiguities in RGB data, we propose a novel multi-task CNN.
We experimentally verify the individual components of our RGB two-hand tracking and 3D reconstruction pipeline.
arXiv Detail & Related papers (2021-06-22T12:53:56Z)
- H2O: Two Hands Manipulating Objects for First Person Interaction Recognition [70.46638409156772]
We present a comprehensive framework for egocentric interaction recognition using markerless 3D annotations of two hands manipulating objects.
Our method produces annotations of the 3D pose of two hands and the 6D pose of the manipulated objects, along with their interaction labels for each frame.
Our dataset, called H2O (2 Hands and Objects), provides synchronized multi-view RGB-D images, interaction labels, object classes, ground-truth 3D poses for left & right hands, 6D object poses, ground-truth camera poses, object meshes and scene point clouds.
arXiv Detail & Related papers (2021-04-22T17:10:42Z)
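The collision-aware refinement entry above raises a natural question: how does one even score "minimal collisions" between two hand meshes? Below is a generic, hypothetical penalty (an assumption for illustration, not the factorized refinement from that paper) that hinges on pairwise vertex distances: vertex pairs closer than a threshold contribute to the loss, so gradient descent pushes interpenetrating surfaces apart.

```python
import torch

def collision_penalty(verts_a, verts_b, threshold=0.005):
    """Generic inter-hand collision penalty (illustrative assumption,
    not the formulation of any paper listed above).

    verts_a: (N, 3) vertices of one hand mesh, in meters.
    verts_b: (M, 3) vertices of the other hand mesh, same frame.
    threshold: minimum allowed separation before a pair is penalized.
    """
    dists = torch.cdist(verts_a, verts_b)      # (N, M) pairwise distances
    violation = torch.relu(threshold - dists)  # > 0 only when too close
    return violation.sum()

# Toy usage with random point clouds standing in for hand meshes
# (778 is the MANO hand-mesh vertex count).
a = torch.randn(778, 3) * 0.05
b = torch.randn(778, 3) * 0.05 + 0.02
print(float(collision_penalty(a, b)))
```

In practice such a term would be added with a small weight to a pose-fitting objective, so that pose accuracy and non-penetration are traded off rather than enforced as a hard constraint.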