Related papers: GOAL: Generating 4D Whole-Body Motion for Hand-Object Grasping

URL: http://arxiv.org/abs/2112.11454v1
Date: Tue, 21 Dec 2021 18:59:34 GMT
Title: GOAL: Generating 4D Whole-Body Motion for Hand-Object Grasping
Authors: Omid Taheri, Vasileios Choutas, Michael J. Black, and Dimitrios Tzionas
Abstract summary: Existing methods focus on the major limbs of the body, ignoring the hands and head. Hands have been separately studied but the focus has been on generating realistic static grasps of objects. We need to generate full-body motions and realistic hand grasps simultaneously. For the first time, we address the problem of generating full-body, hand and head motions of an avatar grasping an unknown object.
Score: 47.49549115570664
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generating digital humans that move realistically has many applications and is widely studied, but existing methods focus on the major limbs of the body, ignoring the hands and head. Hands have been separately studied but the focus has been on generating realistic static grasps of objects. To synthesize virtual characters that interact with the world, we need to generate full-body motions and realistic hand grasps simultaneously. Both sub-problems are challenging on their own and, together, the state-space of poses is significantly larger, the scales of hand and body motions differ, and the whole-body posture and the hand grasp must agree, satisfy physical constraints, and be plausible. Additionally, the head is involved because the avatar must look at the object to interact with it. For the first time, we address the problem of generating full-body, hand and head motions of an avatar grasping an unknown object. As input, our method, called GOAL, takes a 3D object, its position, and a starting 3D body pose and shape. GOAL outputs a sequence of whole-body poses using two novel networks. First, GNet generates a goal whole-body grasp with a realistic body, head, arm, and hand pose, as well as hand-object contact. Second, MNet generates the motion between the starting and goal pose. This is challenging, as it requires the avatar to walk towards the object with foot-ground contact, orient the head towards it, reach out, and grasp it with a realistic hand pose and hand-object contact. To achieve this the networks exploit a representation that combines SMPL-X body parameters and 3D vertex offsets. We train and evaluate GOAL, both qualitatively and quantitatively, on the GRAB dataset. Results show that GOAL generalizes well to unseen objects, outperforming baselines. GOAL takes a step towards synthesizing realistic full-body object grasping.
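The two-stage structure described in the abstract (a goal grasp pose first, then the motion from the starting pose to that goal) can be sketched as follows. This is only an illustration of the data flow, not the authors' implementation: `BodyPose`, `gnet_stub`, `mnet_stub`, and `generate_grasp_motion` are hypothetical names, the pose is reduced to a flat list of numbers whose first three entries are assumed to be a root translation, and plain linear interpolation stands in for the learned MNet.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BodyPose:
    """Hypothetical stand-in for a whole-body pose: a flat parameter vector.

    Assumption: the first three entries act as a root translation.
    """
    params: List[float]


def gnet_stub(object_position: Sequence[float], start: BodyPose) -> BodyPose:
    """Placeholder for the goal-pose stage (GNet in the abstract).

    A real network would predict body, head, arm, and hand pose plus contact.
    Here we only move the assumed root translation onto the object position.
    """
    goal_params = list(object_position[:3]) + list(start.params[3:])
    return BodyPose(goal_params)


def mnet_stub(start: BodyPose, goal: BodyPose, num_frames: int) -> List[BodyPose]:
    """Placeholder for the motion stage (MNet in the abstract).

    Linear interpolation is a swapped-in technique, not the learned model.
    """
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    frames = []
    for i in range(num_frames):
        t = i / (num_frames - 1)  # 0.0 at the start pose, 1.0 at the goal pose
        frames.append(BodyPose([(1 - t) * s + t * g
                                for s, g in zip(start.params, goal.params)]))
    return frames


def generate_grasp_motion(object_position: Sequence[float],
                          start: BodyPose,
                          num_frames: int = 10) -> List[BodyPose]:
    """Run the two stages in order: goal pose first, then the in-between motion."""
    goal = gnet_stub(object_position, start)
    return mnet_stub(start, goal, num_frames)


if __name__ == "__main__":
    start_pose = BodyPose([0.0, 0.0, 0.0, 0.5])
    motion = generate_grasp_motion([1.0, 2.0, 0.0], start_pose, num_frames=5)
    for frame in motion:
        print(frame.params)
```

Keeping the goal stage separate from the motion stage mirrors the split the abstract describes: the second stage only needs a start pose and a goal pose, so either placeholder could be replaced independently.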

Related papers

GraspDiffusion: Synthesizing Realistic Whole-body Hand-Object Interaction [9.564223516111275]
Recent generative models can synthesize high-quality images but often fail to generate humans interacting with objects using their hands. In this paper, we propose GraspDiffusion, a novel generative method that creates realistic scenes of human-object interaction.
arXiv Detail & Related papers (2024-10-17T01:45:42Z)
Target Pose Guided Whole-body Grasping Motion Generation for Digital Humans [8.741075482543991]
We propose a grasping motion generation framework for digital humans. We first generate a whole-body target pose for the digital human using off-the-shelf target grasping pose generation methods. Given an initial pose and this generated target pose, a transformer-based neural network generates the whole grasping trajectory.
arXiv Detail & Related papers (2024-09-26T05:43:23Z)
Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured dataset of full-body articulated human-object interaction (f-AHOI), consisting of 16.2 hours of versatile interactions. CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process. By learning the geometrical relationships in HOI, we devise the very first model that leverages human pose estimation.
arXiv Detail & Related papers (2022-12-20T19:50:54Z)
Generating Holistic 3D Human Motion from Speech [97.11392166257791]
We build a high-quality dataset of 3D holistic body meshes with synchronous speech. We then define a novel speech-to-motion generation framework in which the face, body, and hands are modeled separately.
arXiv Detail & Related papers (2022-12-08T17:25:19Z)
FLEX: Full-Body Grasping Without Full-Body Grasps [24.10724524386518]
We address the task of generating a virtual human -- hands and full body -- grasping everyday objects. Existing methods approach this problem by collecting a 3D dataset of humans interacting with objects and training on this data. We leverage the existence of both full-body pose and hand grasping priors, composing them using 3D geometrical constraints to obtain full-body grasps.
arXiv Detail & Related papers (2022-11-21T23:12:54Z)
Embodied Hands: Modeling and Capturing Hands and Bodies Together [61.32931890166915]
Humans move their hands and bodies together to communicate and solve tasks. Most methods treat the 3D modeling and tracking of bodies and hands separately. We formulate a model of hands and bodies interacting together and fit it to full-body 4D sequences.
arXiv Detail & Related papers (2022-01-07T18:59:32Z)
SAGA: Stochastic Whole-Body Grasping with Contact [60.43627793243098]
Human grasping synthesis has numerous applications including AR/VR, video games, and robotics. In this work, our goal is to synthesize whole-body grasping motion. Given a 3D object, we aim to generate diverse and natural whole-body human motions that approach and grasp the object.
arXiv Detail & Related papers (2021-12-19T10:15:30Z)
GRAB: A Dataset of Whole-Body Human Grasping of Objects [53.00728704389501]
Training computers to understand human grasping requires a rich dataset containing complex 3D object shapes, detailed contact information, hand pose and shape, and the 3D body motion over time. We collect a new dataset, called GRAB, of whole-body grasps, containing full 3D shape and pose sequences of 10 subjects interacting with 51 everyday objects of varying shape and size. This is a unique dataset that goes well beyond existing ones for modeling and understanding how humans grasp and manipulate objects, how their full body is involved, and how interaction varies with the task.
arXiv Detail & Related papers (2020-08-25T17:57:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.