Task-Oriented Human-Object Interactions Generation with Implicit Neural
Representations
- URL: http://arxiv.org/abs/2303.13129v2
- Date: Sat, 4 Nov 2023 03:47:12 GMT
- Title: Task-Oriented Human-Object Interactions Generation with Implicit Neural
Representations
- Authors: Quanzhou Li, Jingbo Wang, Chen Change Loy, Bo Dai
- Abstract summary: TOHO: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations.
Our method generates continuous motions that are parameterized only by the temporal coordinate.
This work takes a step further toward general human-scene interaction simulation.
- Score: 61.659439423703155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Digital human motion synthesis is a vibrant research field with applications
in movies, AR/VR, and video games. While many methods have been proposed to
generate natural and realistic human motions, most focus only on modeling
humans and largely ignore object movements. Generating task-oriented
human-object interaction motions in simulation is challenging: depending on
the intent of using an object, humans perform different motions, which
requires the human to first approach the object and then move it consistently
with the body rather than leaving it static. Moreover, for deployment in
downstream applications, the synthesized motions should be flexible in length,
offering options to personalize the predicted motions for various purposes. To
this end, we propose
TOHO: Task-Oriented Human-Object Interactions Generation with Implicit Neural
Representations, which generates full human-object interaction motions to
conduct specific tasks, given only the task type, the object, and a starting
human status. TOHO generates human-object motions in three steps: 1) it first
estimates the keyframe poses for conducting the task, given the task type and
object information; 2) it then infills between the keyframes to generate
continuous motions; 3) finally, it applies a compact closed-form object motion
estimation
to generate the object motion. Our method generates continuous motions that are
parameterized only by the temporal coordinate, which allows upsampling or
downsampling of a sequence to an arbitrary number of frames and adjusting the
motion speed by designing the temporal coordinate vector. We demonstrate the
effectiveness of our method, both qualitatively and quantitatively. This work
takes a step further toward general human-scene interaction simulation.
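To make the temporal-coordinate parameterization concrete, here is a minimal PyTorch sketch, assuming a plain MLP stands in for the paper's implicit motion network (all names and dimensions are illustrative, not TOHO's actual API). Querying the same model with differently designed coordinate vectors yields upsampled, downsampled, or speed-adjusted sequences:
```python
import torch
import torch.nn as nn

class ImplicitMotionField(nn.Module):
    """Maps a temporal coordinate t in [0, 1] to a pose vector.

    A toy stand-in for an implicit motion representation; the real
    network would also condition on the task type, the object, and
    the starting human status.
    """
    def __init__(self, pose_dim=69, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, pose_dim),
        )

    def forward(self, t):
        # t: (num_frames, 1) temporal coordinates, one pose per coordinate
        return self.net(t)

model = ImplicitMotionField()

# The motion is continuous in t, so the same model can be queried at any
# frame count: 30 or 120 frames of the same underlying motion.
poses_30 = model(torch.linspace(0, 1, 30).unsqueeze(-1))    # (30, 69)
poses_120 = model(torch.linspace(0, 1, 120).unsqueeze(-1))  # (120, 69)

# Speed control by designing the coordinate vector: dense samples over the
# first half (that half plays slowly) and sparse over the second (fast).
t_varispeed = torch.cat([
    torch.linspace(0.0, 0.5, 90),
    torch.linspace(0.5, 1.0, 30),
]).unsqueeze(-1)
poses_varispeed = model(t_varispeed)                        # (120, 69)
```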
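The abstract does not spell out step 3's closed-form object motion estimation. One common closed-form scheme, sketched below as an assumption rather than the paper's exact formulation, rigidly attaches the object to the hand using the relative transform cached at the grasp frame:
```python
import numpy as np

def object_motion_from_hand(hand_rot, hand_pos, obj_rot0, obj_pos0):
    """Rigidly propagate an object's pose with the hand after grasping.

    hand_rot: (T, 3, 3) rotations; hand_pos: (T, 3) translations;
    obj_rot0, obj_pos0: object pose at the grasp frame (frame 0).
    """
    # relative hand-to-object transform at the grasp frame
    rel_rot = hand_rot[0].T @ obj_rot0                  # (3, 3)
    rel_pos = hand_rot[0].T @ (obj_pos0 - hand_pos[0])  # (3,)
    # compose the cached offset with every later hand pose
    obj_rot = hand_rot @ rel_rot                        # (T, 3, 3)
    obj_pos = (hand_rot @ rel_pos) + hand_pos           # (T, 3)
    return obj_rot, obj_pos

# toy usage: a hand lifting straight up carries the object with it
T = 120
hand_rot = np.tile(np.eye(3), (T, 1, 1))
hand_pos = np.linspace([0, 0, 0], [0, 0, 1], T)
obj_rot, obj_pos = object_motion_from_hand(
    hand_rot, hand_pos, np.eye(3), np.array([0.1, 0.0, 0.0]))
```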
Related papers
- Universal Humanoid Motion Representations for Physics-Based Control [71.46142106079292]
We present a universal motion representation that encompasses a comprehensive range of motor skills for physics-based humanoid control.
We first learn a motion imitator that can imitate all of human motion from a large, unstructured motion dataset.
We then create our motion representation by distilling skills directly from the imitator.
arXiv Detail & Related papers (2023-10-06T20:48:43Z)
- Object Motion Guided Human Motion Synthesis [22.08240141115053]
We study the problem of full-body human motion synthesis for the manipulation of large-sized objects.
We propose Object MOtion guided human MOtion synthesis (OMOMO), a conditional diffusion framework; a minimal sketch of such a conditional denoiser appears after this list.
We develop a novel system that captures full-body human manipulation motions by simply attaching a smartphone to the object being manipulated.
arXiv Detail & Related papers (2023-09-28T08:22:00Z)
- GRIP: Generating Interaction Poses Using Spatial Cues and Latent Consistency [57.9920824261925]
Hands are dexterous and highly versatile manipulators that are central to how humans interact with objects and their environment.
Consequently, modeling realistic hand-object interactions is critical for applications in computer graphics, computer vision, and mixed reality.
GRIP is a learning-based method that takes as input the 3D motion of the body and the object, and synthesizes realistic motion for both hands before, during, and after object interaction.
arXiv Detail & Related papers (2023-08-22T17:59:51Z)
- NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis [21.650091018774972]
We create a neural interaction field attached to a specific object, which outputs the distance to the valid interaction manifold given a human pose as input.
This interaction field guides the sampling of an object-conditioned human motion diffusion model; a sketch of this guidance idea appears after this list.
We synthesize realistic motions for sitting and lifting with several objects, outperforming alternative approaches in terms of motion quality and successful action completion.
arXiv Detail & Related papers (2023-07-14T17:59:38Z)
- IMoS: Intent-Driven Full-Body Motion Synthesis for Human-Object Interactions [69.95820880360345]
We present the first framework to synthesize the full-body motion of virtual human characters with 3D objects placed within their reach.
Our system takes as input textual instructions specifying the objects and the associated intentions of the virtual characters.
We show that our synthesized full-body motions appear more realistic to user-study participants in more than 80% of scenarios.
arXiv Detail & Related papers (2022-12-14T23:59:24Z)
- SAGA: Stochastic Whole-Body Grasping with Contact [60.43627793243098]
Human grasping synthesis has numerous applications including AR/VR, video games, and robotics.
In this work, our goal is to synthesize whole-body grasping motion. Given a 3D object, we aim to generate diverse and natural whole-body human motions that approach and grasp the object.
arXiv Detail & Related papers (2021-12-19T10:15:30Z)
- Task-Generic Hierarchical Human Motion Prior using VAEs [44.356707509079044]
A deep generative model that describes human motions can benefit a wide range of fundamental computer vision and graphics tasks.
We present a method for learning complex human motions independent of specific tasks using a combined global and local latent space.
We demonstrate the effectiveness of our hierarchical motion variational autoencoder in a variety of tasks including video-based human pose estimation.
arXiv Detail & Related papers (2021-06-07T23:11:42Z)
- Scene-aware Generative Network for Human Motion Synthesis [125.21079898942347]
We propose a new framework that takes the interaction between the scene and the human motion into account.
Considering the uncertainty of human motion, we formulate the problem as a generative task.
We derive a GAN-based learning approach, with discriminators to enforce the compatibility between the human motion and the contextual scene.
arXiv Detail & Related papers (2021-05-31T09:05:50Z)
- Action2Motion: Conditioned Generation of 3D Human Motions [28.031644518303075]
We aim to generate plausible human motion sequences in 3D.
Each sampled sequence faithfully resembles natural human body articulation dynamics.
A new 3D human motion dataset, HumanAct12, is also constructed.
arXiv Detail & Related papers (2020-07-30T05:29:59Z)
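As referenced in the OMOMO entry above, the following is a minimal sketch of an object-conditioned diffusion denoiser with a toy reverse process; every name, dimension, and the update rule are illustrative assumptions, not OMOMO's actual design:
```python
import torch
import torch.nn as nn

class CondDenoiser(nn.Module):
    """Predicts the noise on a human pose sequence, conditioned on the
    object's motion. Purely illustrative; the real architecture differs."""
    def __init__(self, pose_dim=69, obj_dim=9, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pose_dim + obj_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, pose_dim),
        )

    def forward(self, noisy_pose, obj_motion, t):
        # concatenate noisy pose, object condition, and timestep per frame
        return self.net(torch.cat([noisy_pose, obj_motion, t], dim=-1))

denoiser = CondDenoiser()
frames = 60
x = torch.randn(frames, 69)          # start the sample from pure noise
obj = torch.randn(frames, 9)         # object motion acts as the condition
for step in reversed(range(1, 11)):  # toy 10-step reverse process
    t = torch.full((frames, 1), step / 10.0)
    with torch.no_grad():
        eps = denoiser(x, obj, t)
    x = x - 0.1 * eps                # crude update; a real DDPM sampler differs
```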
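And for the NIFTY entry, a minimal sketch of distance-field guidance, assuming a small MLP stands in for the learned interaction field; the step simply descends the predicted distance, which captures the general guidance idea rather than NIFTY's exact sampler:
```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a neural interaction field: it maps a human
# pose to its distance from the valid interaction manifold of one object.
interaction_field = nn.Sequential(
    nn.Linear(69, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

def guide(pose, step_size=0.1):
    """One guidance step: descend the predicted distance so the pose moves
    toward the interaction manifold. In a diffusion sampler this nudge is
    applied to each intermediate denoised estimate."""
    pose = pose.detach().requires_grad_(True)
    dist = interaction_field(pose).sum()
    (grad,) = torch.autograd.grad(dist, pose)
    return (pose - step_size * grad).detach()

pose = torch.randn(1, 69)
guided_pose = guide(pose)  # slightly closer to a valid interaction
```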
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.