AffordPose: A Large-scale Dataset of Hand-Object Interactions with
Affordance-driven Hand Pose
- URL: http://arxiv.org/abs/2309.08942v1
- Date: Sat, 16 Sep 2023 10:25:28 GMT
- Title: AffordPose: A Large-scale Dataset of Hand-Object Interactions with
Affordance-driven Hand Pose
- Authors: Juntao Jian, Xiuping Liu, Manyi Li, Ruizhen Hu, Jian Liu
- Abstract summary: We present AffordPose, a large-scale dataset of hand-object interactions with affordance-driven hand pose.
We collect a total of 26.7K hand-object interactions, each including the 3D object shape, the part-level affordance label, and the manually adjusted hand poses.
The comprehensive data analysis shows the common characteristics and diversity of hand-object interactions per affordance.
- Score: 16.65196181081623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How humans interact with objects depends on the functional roles
of the target objects, which introduces the problem of affordance-aware
hand-object interaction. Learning and understanding plausible and appropriate
hand-object interactions requires a large number of human demonstrations. In
this work, we present AffordPose, a large-scale dataset of
hand-object interactions with affordance-driven hand pose. We first annotate
the specific part-level affordance labels for each object, e.g., twist, pull,
handle-grasp, etc., instead of general intents such as use or handover, to
indicate the purpose and guide the localization of the hand-object
interactions. The fine-grained hand-object interactions reveal the influence of
hand-centered affordances on the detailed arrangement of the hand poses, while
also exhibiting a certain degree of diversity. We collect a total of 26.7K
hand-object interactions, each including the 3D object shape, the part-level
affordance label, and the manually adjusted hand poses. The comprehensive data
analysis shows the common characteristics and diversity of hand-object
interactions per affordance via parameter statistics and contact
computation. We also conduct experiments on the tasks of hand-object affordance
understanding and affordance-oriented hand-object interaction generation, to
validate the effectiveness of our dataset in learning the fine-grained
hand-object interactions. Project page:
https://github.com/GentlesJan/AffordPose.
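
To make the data description above concrete, here is a minimal sketch of one plausible way to represent a single AffordPose-style sample and to run the kind of contact computation the abstract mentions. The field names, array shapes, and MANO parameterization are assumptions for illustration, not the dataset's actual API; consult the project page for the real format.

```python
# Hypothetical representation of one AffordPose-style sample plus a naive
# contact check. Field names, shapes, and the MANO parameterization are
# assumptions for illustration; see the project page for the real format.
from dataclasses import dataclass

import numpy as np


@dataclass
class HandObjectSample:
    object_verts: np.ndarray  # (N, 3) vertices of the 3D object shape
    part_labels: np.ndarray   # (N,) per-vertex part-level affordance ids
    affordance: str           # e.g. "twist", "pull", "handle-grasp"
    hand_pose: np.ndarray     # (48,) axis-angle MANO pose parameters (assumed)
    hand_shape: np.ndarray    # (10,) MANO shape coefficients (assumed)
    hand_verts: np.ndarray    # (778, 3) posed hand mesh vertices


def contact_ratio(sample: HandObjectSample, threshold: float = 0.005) -> float:
    """Fraction of hand vertices within `threshold` meters of the object.

    A naive stand-in for the abstract's contact computation: a hand vertex
    counts as 'in contact' if its nearest object vertex is closer than the
    threshold.
    """
    # Pairwise hand-to-object distances: shape (778, N).
    diffs = sample.hand_verts[:, None, :] - sample.object_verts[None, :, :]
    nearest = np.linalg.norm(diffs, axis=-1).min(axis=1)
    return float((nearest < threshold).mean())
```

Aggregating such ratios per affordance label would reproduce, in spirit, the per-affordance contact statistics the abstract reports.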
Related papers
- DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions [15.417836855005087]
We propose DiffH2O, a novel method to synthesize realistic, one- or two-handed object interactions.
We decompose the task into a grasping stage and a text-based interaction stage.
In the grasping stage, the model only generates hand motions, whereas in the interaction phase both hand and object poses are synthesized.
arXiv Detail & Related papers (2024-03-26T16:06:42Z)
- InterTracker: Discovering and Tracking General Objects Interacting with Hands in the Wild [40.489171608114574]
Existing methods rely on frame-based detectors to locate interacting objects.
We propose to leverage hand-object interaction to track interactive objects.
Our proposed method outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2023-08-06T09:09:17Z)
- HMDO: Markerless Multi-view Hand Manipulation Capture with Deformable Objects [8.711239906965893]
HMDO is the first markerless deformable interaction dataset recording interactive motions of the hands and deformable objects.
The proposed method can reconstruct interactive motions of hands and deformable objects with high quality.
arXiv Detail & Related papers (2023-01-18T16:55:15Z)
- Interacting Hand-Object Pose Estimation via Dense Mutual Attention [97.26400229871888]
3D hand-object pose estimation is the key to the success of many computer vision applications.
We propose a novel dense mutual attention mechanism that models fine-grained dependencies between the hand and the object (a generic cross-attention sketch appears after this list).
Our method is able to produce physically plausible poses with high quality and real-time inference speed.
arXiv Detail & Related papers (2022-11-16T10:01:33Z)
- Learning to Disambiguate Strongly Interacting Hands via Probabilistic Per-pixel Part Segmentation [84.28064034301445]
Self-similarity, and the resulting ambiguities in assigning pixel observations to the respective hands, is a major cause of the final 3D pose error.
We propose DIGIT, a novel method for estimating the 3D poses of two interacting hands from a single monocular image.
We experimentally show that the proposed approach achieves new state-of-the-art performance on the InterHand2.6M dataset.
arXiv Detail & Related papers (2021-07-01T13:28:02Z)
- H2O: Two Hands Manipulating Objects for First Person Interaction Recognition [70.46638409156772]
We present a comprehensive framework for egocentric interaction recognition using markerless 3D annotations of two hands manipulating objects.
Our method produces annotations of the 3D pose of two hands and the 6D pose of the manipulated objects, along with their interaction labels for each frame.
Our dataset, called H2O (2 Hands and Objects), provides synchronized multi-view RGB-D images, interaction labels, object classes, ground-truth 3D poses for left & right hands, 6D object poses, ground-truth camera poses, object meshes and scene point clouds.
arXiv Detail & Related papers (2021-04-22T17:10:42Z)
- InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image [71.17227941339935]
We propose a large-scale dataset, InterHand2.6M, and a network, InterNet, for 3D interacting hand pose estimation from a single RGB image.
In our experiments, we demonstrate substantial gains in 3D interacting hand pose estimation accuracy when leveraging the interacting hand data in InterHand2.6M.
We also report the accuracy of InterNet on InterHand2.6M, which serves as a strong baseline for this new dataset.
arXiv Detail & Related papers (2020-08-21T05:15:58Z)
- Joint Hand-object 3D Reconstruction from a Single Image with Cross-branch Feature Fusion [78.98074380040838]
We propose to consider hand and object jointly in feature space and explore the reciprocity of the two branches.
We employ an auxiliary depth estimation module to augment the input RGB image with the estimated depth map.
Our approach significantly outperforms existing approaches in terms of the reconstruction accuracy of objects.
arXiv Detail & Related papers (2020-06-28T09:50:25Z)
- Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction [137.28465645405655]
HANDS'19 is a challenge to evaluate the abilities of current 3D hand pose estimators (HPEs) to interpolate and extrapolate the poses of a training set.
We show that the accuracy of state-of-the-art methods can drop, and that they fail mostly on poses absent from the training set.
arXiv Detail & Related papers (2020-03-30T19:28:13Z)
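
As referenced in the Interacting Hand-Object Pose Estimation entry above, the following is a generic sketch of mutual cross-attention between hand and object feature sets. It illustrates the general idea only; the paper's actual dense mutual attention architecture, feature extractors, and head counts may differ, and all names and shapes below are assumptions.

```python
# Generic mutual cross-attention between hand and object features.
# A sketch for illustration only; the referenced paper's actual
# architecture may differ substantially.
import torch
import torch.nn as nn


class MutualCrossAttention(nn.Module):
    def __init__(self, dim: int = 128, num_heads: int = 4):
        super().__init__()
        self.hand_to_obj = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.obj_to_hand = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, hand: torch.Tensor, obj: torch.Tensor):
        # hand: (B, Nh, dim) per-vertex hand features
        # obj:  (B, No, dim) per-point object features
        # Each set queries the other, so fine-grained dependencies flow in
        # both directions before any downstream pose regression.
        hand_attn, _ = self.obj_to_hand(query=hand, key=obj, value=obj)
        obj_attn, _ = self.hand_to_obj(query=obj, key=hand, value=hand)
        return hand + hand_attn, obj + obj_attn


# Usage with assumed shapes:
# module = MutualCrossAttention(dim=128)
# h = torch.randn(2, 778, 128)   # features for 778 MANO hand vertices
# o = torch.randn(2, 1024, 128)  # features for 1024 object points
# h2, o2 = module(h, o)
```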
This list is automatically generated from the titles and abstracts of the papers on this site.