HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed
Distance Fields
- URL: http://arxiv.org/abs/2402.17062v1
- Date: Mon, 26 Feb 2024 22:48:37 GMT
- Title: HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed
Distance Fields
- Authors: Haozhe Qi, Chen Zhao, Mathieu Salzmann, Alexander Mathis
- Abstract summary: HOISDF is a guided hand-object pose estimation network.
It exploits hand and object SDFs to provide a global, implicit representation over the complete reconstruction volume.
We show that HOISDF achieves state-of-the-art results on hand-object pose estimation benchmarks.
- Score: 96.04424738803667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human hands are highly articulated and versatile at handling objects. Jointly
estimating the 3D poses of a hand and the object it manipulates from a
monocular camera is challenging due to frequent occlusions. Thus, existing
methods often rely on intermediate 3D shape representations to increase
performance. These representations are typically explicit, such as 3D point
clouds or meshes, and thus provide information in the direct surroundings of
the intermediate hand pose estimate. To address this, we introduce HOISDF, a
Signed Distance Field (SDF) guided hand-object pose estimation network, which
jointly exploits hand and object SDFs to provide a global, implicit
representation over the complete reconstruction volume. Specifically, the role
of the SDFs is threefold: equip the visual encoder with implicit shape
information, help to encode hand-object interactions, and guide the hand and
object pose regression via SDF-based sampling and by augmenting the feature
representations. We show that HOISDF achieves state-of-the-art results on
hand-object pose estimation benchmarks (DexYCB and HO3Dv2). Code is available
at https://github.com/amathislab/HOISDF
Related papers
- SHOWMe: Benchmarking Object-agnostic Hand-Object 3D Reconstruction [13.417086460511696]
We introduce the SHOWMe dataset which consists of 96 videos, annotated with real and detailed hand-object 3D textured meshes.
We consider a rigid hand-object scenario, in which the pose of the hand with respect to the object remains constant during the whole video sequence.
This assumption allows us to register sub-millimetre-precise groundtruth 3D scans to the image sequences in SHOWMe.
arXiv Detail & Related papers (2023-09-19T16:48:29Z) - DDF-HO: Hand-Held Object Reconstruction via Conditional Directed
Distance Field [82.81337273685176]
DDF-HO is a novel approach leveraging Directed Distance Field (DDF) as the shape representation.
We randomly sample multiple rays and collect local to global geometric features for them by introducing a novel 2D ray-based feature aggregation scheme.
Experiments on synthetic and real-world datasets demonstrate that DDF-HO consistently outperforms all baseline methods by a large margin.
arXiv Detail & Related papers (2023-08-16T09:06:32Z) - gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object
Reconstruction [94.46581592405066]
We exploit the hand structure and use it as guidance for SDF-based shape reconstruction.
We predict kinematic chains of pose transformations and align SDFs with highly-articulated hand poses.
arXiv Detail & Related papers (2023-04-24T10:05:48Z) - What's in your hands? 3D Reconstruction of Generic Objects in Hands [49.12461675219253]
Our work aims to reconstruct hand-held objects given a single RGB image.
In contrast to prior works that typically assume known 3D templates and reduce the problem to 3D pose estimation, our work reconstructs generic hand-held object without knowing their 3D templates.
arXiv Detail & Related papers (2022-04-14T17:59:02Z) - Joint Hand-object 3D Reconstruction from a Single Image with
Cross-branch Feature Fusion [78.98074380040838]
We propose to consider hand and object jointly in feature space and explore the reciprocity of the two branches.
We employ an auxiliary depth estimation module to augment the input RGB image with the estimated depth map.
Our approach significantly outperforms existing approaches in terms of the reconstruction accuracy of objects.
arXiv Detail & Related papers (2020-06-28T09:50:25Z) - Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and
Objects for 3D Hand Pose Estimation under Hand-Object Interaction [137.28465645405655]
HANDS'19 is a challenge to evaluate the abilities of current 3D hand pose estimators (HPEs) to interpolate and extrapolate the poses of a training set.
We show that the accuracy of state-of-the-art methods can drop, and that they fail mostly on poses absent from the training set.
arXiv Detail & Related papers (2020-03-30T19:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.