A Skeleton-Driven Neural Occupancy Representation for Articulated Hands
- URL: http://arxiv.org/abs/2109.11399v1
- Date: Thu, 23 Sep 2021 14:35:19 GMT
- Authors: Korrawe Karunratanakul, Adrian Spurr, Zicong Fan, Otmar Hilliges, Siyu Tang
- Abstract summary: Hand ArticuLated Occupancy (HALO) is a novel representation of articulated hands that bridges the advantages of 3D keypoints and neural implicit surfaces.
We demonstrate the applicability of HALO to the task of conditional generation of hands that grasp 3D objects.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present Hand ArticuLated Occupancy (HALO), a novel representation of
articulated hands that bridges the advantages of 3D keypoints and neural
implicit surfaces and can be used in end-to-end trainable architectures. Unlike
existing statistical parametric hand models (e.g., MANO), HALO directly
leverages the 3D joint skeleton as input and produces a neural occupancy volume
representing the posed hand surface. The key benefits of HALO are (1) it is
driven by 3D keypoints, which are more accurate and easier for neural networks
to learn than latent hand-model parameters; (2)
it provides a differentiable volumetric occupancy representation of the posed
hand; (3) it can be trained end-to-end, allowing the formulation of losses on
the hand surface that benefit the learning of 3D keypoints. We demonstrate the
applicability of HALO to the task of conditional generation of hands that grasp
3D objects. The differentiable nature of HALO is shown to improve the quality
of the synthesized hands both in terms of physical plausibility and user
preference.
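The abstract describes HALO as a network that takes a 3D joint skeleton as input and answers occupancy queries for points in space. The minimal sketch below illustrates that interface only; the random MLP weights, layer sizes, and function names are hypothetical stand-ins for the learned model, not HALO's actual architecture.

```python
import numpy as np

# Illustrative sketch of a skeleton-conditioned occupancy query.
# The weights are random here; in HALO they would be learned end-to-end,
# so surface losses can back-propagate to the 3D keypoints.
rng = np.random.default_rng(0)

N_JOINTS = 21  # standard 21-keypoint hand skeleton
HIDDEN = 64

# Hypothetical 2-layer MLP conditioned on (flattened keypoints, query point).
W1 = rng.normal(size=(N_JOINTS * 3 + 3, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
W2 = rng.normal(size=(HIDDEN, 1)) * 0.1
b2 = np.zeros(1)

def occupancy(query_xyz: np.ndarray, keypoints: np.ndarray) -> float:
    """Occupancy in (0, 1) at a 3D query point, conditioned on the skeleton.

    query_xyz: (3,) point in space; keypoints: (21, 3) joint positions.
    """
    x = np.concatenate([keypoints.ravel(), query_xyz])
    h = np.maximum(x @ W1 + b1, 0.0)        # ReLU hidden layer
    logit = (h @ W2 + b2)[0]
    return 1.0 / (1.0 + np.exp(-logit))     # sigmoid -> occupancy

keypoints = rng.normal(size=(N_JOINTS, 3))
occ = occupancy(np.zeros(3), keypoints)
```

Because the representation is a differentiable function of the keypoints, a loss evaluated on the implied surface (e.g. penetration into a grasped object) can flow back to the skeleton, which is the end-to-end property the abstract highlights.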
Related papers
- Two Hands Are Better Than One: Resolving Hand to Hand Intersections via Occupancy Networks [33.9893684177763]
Self-occlusions and finger articulation pose a significant challenge to hand pose estimation.
We exploit an occupancy network that represents the hand's volume as a continuous manifold.
We design an intersection loss function to minimize the likelihood of hand-to-hand intersections.
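The summary above describes penalizing intersections via an occupancy network. A common way to phrase such a loss, sketched below under assumptions (an analytic sphere stands in for the trained occupancy network, and all names are hypothetical), is to evaluate one hand's occupancy at the other hand's surface points and penalize positive values.

```python
import numpy as np

def occupancy_b(points: np.ndarray) -> np.ndarray:
    # Toy stand-in for a trained occupancy network: a unit ball at the
    # origin represents hand B's volume (1 inside, 0 outside).
    return (np.linalg.norm(points, axis=-1) < 1.0).astype(float)

def intersection_loss(points_a: np.ndarray) -> float:
    # Mean occupancy of hand B evaluated at hand A's surface samples:
    # zero when the hands do not interpenetrate, positive otherwise.
    return float(occupancy_b(points_a).mean())

outside = np.array([[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]])  # clear of hand B
inside = np.array([[0.1, 0.0, 0.0]])                     # inside hand B
loss_ok = intersection_loss(outside)
loss_bad = intersection_loss(inside)
```

With a learned, continuous occupancy (rather than this hard indicator), the loss is differentiable and can push interpenetrating surface points out of the other hand's volume during optimization.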
arXiv Detail & Related papers (2024-04-08T11:32:26Z) - HMP: Hand Motion Priors for Pose and Shape Estimation from Video [52.39020275278984]
We develop a generative motion prior specific for hands, trained on the AMASS dataset which features diverse and high-quality hand motions.
Our integration of a robust motion prior significantly enhances performance, especially in occluded scenarios.
We demonstrate our method's efficacy via qualitative and quantitative evaluations on the HO3D and DexYCB datasets.
arXiv Detail & Related papers (2023-12-27T22:35:33Z) - LG-Hand: Advancing 3D Hand Pose Estimation with Locally and Globally
Kinematic Knowledge [0.693939291118954]
We propose LG-Hand, a powerful method for 3D hand pose estimation.
We argue that kinematic information plays an important role, contributing to the performance of 3D hand pose estimation.
Our method achieves promising results on the First-Person Hand Action Benchmark dataset.
arXiv Detail & Related papers (2022-11-06T15:26:32Z) - NIMBLE: A Non-rigid Hand Model with Bones and Muscles [41.19718491215149]
We present NIMBLE, a novel parametric hand model that includes the missing key components.
NIMBLE consists of 20 bones as triangular meshes, 7 muscle groups as tetrahedral meshes, and a skin mesh.
We demonstrate applying NIMBLE to modeling, rendering, and visual inference tasks.
arXiv Detail & Related papers (2022-02-09T15:57:21Z) - Monocular 3D Reconstruction of Interacting Hands via Collision-Aware
Factorized Refinements [96.40125818594952]
We make the first attempt to reconstruct 3D interacting hands from monocular single RGB images.
Our method can generate 3D hand meshes with both precise 3D poses and minimal collisions.
arXiv Detail & Related papers (2021-11-01T08:24:10Z) - HandFoldingNet: A 3D Hand Pose Estimation Network Using
Multiscale-Feature Guided Folding of a 2D Hand Skeleton [4.1954750695245835]
This paper proposes HandFoldingNet, an accurate and efficient hand pose estimator.
The proposed model utilizes a folding-based decoder that folds a given 2D hand skeleton into the corresponding joint coordinates.
Experimental results show that the proposed model outperforms the existing methods on three hand pose benchmark datasets.
arXiv Detail & Related papers (2021-08-12T05:52:44Z) - HandVoxNet++: 3D Hand Shape and Pose Estimation using Voxel-Based Neural
Networks [71.09275975580009]
HandVoxNet++ is a voxel-based deep network with 3D and graph convolutions trained in a fully supervised manner.
HandVoxNet++ relies on two hand shape representations: a 3D voxelized grid of the hand shape, which does not preserve the mesh topology, and the hand surface itself. We combine the advantages of both representations by aligning the hand surface to the voxelized hand shape, either with a new neural Graph-Convolutions-based Mesh Registration (GCN-MeshReg) or with a classical segment-wise Non-Rigid Gravitational Approach (NRGA++).
arXiv Detail & Related papers (2021-07-02T17:59:54Z) - Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation [70.23652933572647]
Whole-body 3D human mesh estimation aims to reconstruct the 3D human body, hands, and face simultaneously.
We present Hand4Whole, which has two strong points over previous works.
Our Hand4Whole is trained in an end-to-end manner and produces much better 3D hand results than previous whole-body 3D human mesh estimation methods.
arXiv Detail & Related papers (2020-11-23T16:48:35Z) - MM-Hand: 3D-Aware Multi-Modal Guided Hand Generative Network for 3D Hand
Pose Synthesis [81.40640219844197]
Estimating the 3D hand pose from a monocular RGB image is important but challenging.
A solution is training on large-scale RGB hand images with accurate 3D hand keypoint annotations.
We have developed a learning-based approach to synthesize realistic, diverse, and 3D pose-preserving hand images.
arXiv Detail & Related papers (2020-10-02T18:27:34Z) - Grasping Field: Learning Implicit Representations for Human Grasps [16.841780141055505]
We propose an expressive representation for human grasp modelling that is efficient and easy to integrate with deep neural networks.
We call this 3D-to-2D mapping the Grasping Field, parameterize it with a deep neural network, and learn it from data.
Our generative model is able to synthesize high-quality human grasps given only a 3D object point cloud.
arXiv Detail & Related papers (2020-08-10T23:08:26Z)
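The Grasping Field entry above describes mapping each 3D point to a pair of distance values, one for the hand surface and one for the object surface. The sketch below illustrates that parameterization with analytic spheres standing in for the learned network; the shapes, centers, and function names are assumptions for illustration only.

```python
import numpy as np

# Hypothetical stand-ins: a unit sphere at the origin plays the object,
# and a unit sphere at HAND_CENTER plays the hand. In the paper this
# mapping is a deep network learned from data.
HAND_CENTER = np.array([1.5, 0.0, 0.0])

def grasping_field(p: np.ndarray) -> tuple[float, float]:
    """Map a 3D point to (signed distance to hand, signed distance to object)."""
    d_hand = float(np.linalg.norm(p - HAND_CENTER) - 1.0)
    d_obj = float(np.linalg.norm(p) - 1.0)
    return d_hand, d_obj

# A contact point lies near both zero level sets simultaneously;
# this point sits on the intersection circle of the two spheres.
contact = np.array([0.75, np.sqrt(0.4375), 0.0])
d_h, d_o = grasping_field(contact)
```

The appeal of this formulation is that grasp quality becomes geometric: points where both distances vanish are hand-object contacts, while points where the hand distance is negative inside the object indicate interpenetration.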
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.