COUCH: Towards Controllable Human-Chair Interactions
- URL: http://arxiv.org/abs/2205.00541v1
- Date: Sun, 1 May 2022 19:14:22 GMT
- Title: COUCH: Towards Controllable Human-Chair Interactions
- Authors: Xiaohan Zhang, Bharat Lal Bhatnagar, Vladimir Guzov, Sebastian Starke,
Gerard Pons-Moll
- Abstract summary: We study the problem of synthesizing scene interactions conditioned on different contact positions on the object.
We propose COUCH, a novel synthesis framework that plans the motion ahead by predicting contact-aware control signals for the hands.
Our method shows significant quantitative and qualitative improvements over existing methods for human-object interactions.
- Score: 44.66450508317131
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans interact with an object in many different ways by making contact at
different locations, creating a highly complex motion space that can be
difficult to learn, particularly when synthesizing such human interactions in a
controllable manner. Existing works on synthesizing human scene interaction
focus on the high-level control of action but do not consider the fine-grained
control of motion. In this work, we study the problem of synthesizing scene
interactions conditioned on different contact positions on the object. As a
testbed to investigate this new problem, we focus on human-chair interaction as
one of the most common actions, one that exhibits large variability in terms of
contacts. We propose COUCH, a novel synthesis framework that plans the motion
ahead by predicting contact-aware control signals for the hands, which are then
used to synthesize contact-conditioned interactions. Furthermore, we contribute
a large human-chair interaction dataset with clean annotations, the COUCH
Dataset. Our method shows significant quantitative and qualitative improvements
over existing methods for human-object interactions. More importantly, our
method enables control of the motion through user-specified or automatically
predicted contacts.
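The abstract describes a two-stage design: a control module first plans contact-aware hand control signals toward a chosen contact point on the object, and a synthesis module then generates motion conditioned on those signals. Below is a minimal PyTorch sketch of that idea; the module names (ContactControlNet, PoseNet), the feature sizes, the velocity-style control parameterization, and the residual pose update are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of a contact-conditioned, two-stage motion pipeline:
# stage 1 predicts hand control signals toward a target contact point,
# stage 2 autoregressively synthesizes poses conditioned on those signals.
# All names and dimensions here are assumptions for illustration only.
import torch
import torch.nn as nn

class ContactControlNet(nn.Module):
    """Predicts per-frame hand control signals toward a target contact."""
    def __init__(self, pose_dim=69, ctrl_dim=6, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pose_dim + 3, hidden), nn.ReLU(),  # pose + 3D contact point
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, ctrl_dim),                 # e.g. a 3D velocity per hand
        )

    def forward(self, pose, contact_xyz):
        return self.net(torch.cat([pose, contact_xyz], dim=-1))

class PoseNet(nn.Module):
    """Predicts the next body pose given the current pose and control signal."""
    def __init__(self, pose_dim=69, ctrl_dim=6, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pose_dim + ctrl_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, pose_dim),
        )

    def forward(self, pose, ctrl):
        return pose + self.net(torch.cat([pose, ctrl], dim=-1))  # residual update

# Rollout: plan hand signals toward the chosen contact, then synthesize poses.
control_net, pose_net = ContactControlNet(), PoseNet()
pose = torch.zeros(1, 69)                  # current body pose (flattened joints)
contact = torch.tensor([[0.4, 0.6, 0.3]])  # user-specified contact on the chair
with torch.no_grad():
    for _ in range(60):                    # 60-frame rollout
        ctrl = control_net(pose, contact)
        pose = pose_net(pose, ctrl)
```

Because the contact point is an explicit input, the same trained networks can be driven either by user-specified contacts or by automatically predicted ones, which is the controllability the abstract emphasizes.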
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms representative models in both objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- Two-Person Interaction Augmentation with Skeleton Priors [16.65884142618145]
We propose a new deep learning method for two-body skeletal interaction motion augmentation.
Our system can learn effectively from a relatively small amount of data and generalize to drastically different skeleton sizes.
arXiv Detail & Related papers (2024-04-08T13:11:57Z)
- Controllable Human-Object Interaction Synthesis [77.56877961681462]
We propose Controllable Human-Object Interaction Synthesis (CHOIS) to generate synchronized object motion and human motion in 3D scenes.
Here, language descriptions inform style and intent, and waypoints, which can be effectively extracted from high-level planning, ground the motion in the scene.
Our module seamlessly integrates with a path planning module, enabling the generation of long-term interactions in 3D environments.
arXiv Detail & Related papers (2023-12-06T21:14:20Z)
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions [66.87211993793807]
We present ReMoS, a denoising diffusion-based model that synthesizes the full-body motion of a person in a two-person interaction scenario.
We demonstrate ReMoS across challenging two-person scenarios such as pair dancing, Ninjutsu, kickboxing, and acrobatics.
We also contribute the ReMoCap dataset for two-person interactions, containing full-body and finger motions.
arXiv Detail & Related papers (2023-11-28T18:59:52Z)
- InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint [67.6297384588837]
We introduce a novel controllable motion generation method, InterControl, to encourage synthesized motions to maintain the desired distance between joint pairs (a minimal sketch of this distance objective follows the list below).
We demonstrate that the distance between joint pairs for human interactions can be generated using an off-the-shelf Large Language Model.
arXiv Detail & Related papers (2023-11-27T14:32:33Z)
- HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features contribute more to interactions, we propose a Human-Guide Linking method to ensure that the interaction decoder focuses on human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and HICO-DET datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z)
- NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis [21.650091018774972]
We create a neural interaction field attached to a specific object, which outputs the distance to the valid interaction manifold given a human pose as input.
This interaction field guides the sampling of an object-conditioned human motion diffusion model (see the guidance sketch after this list).
We synthesize realistic motions for sitting and lifting with several objects, outperforming alternative approaches in terms of motion quality and successful action completion.
arXiv Detail & Related papers (2023-07-14T17:59:38Z)
- Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments [11.87902527509297]
We present LAMA, Locomotion-Action-MAnipulation, to synthesize natural and plausible long-term human movements in complex indoor environments.
Unlike existing methods that require motion data "paired" with scanned 3D scenes for supervision, we formulate the problem as test-time optimization, using only human motion capture data for synthesis.
arXiv Detail & Related papers (2023-01-09T18:59:16Z)
- Compositional Human-Scene Interaction Synthesis with Semantic Control [16.93177243590465]
We aim to synthesize humans interacting with a given 3D scene controlled by high-level semantic specifications.
We design a novel transformer-based generative model, in which the articulated 3D human body surface points and 3D objects are jointly encoded.
Inspired by the compositional nature of interactions, in which humans can simultaneously interact with multiple objects, we define interaction semantics as the composition of varying numbers of atomic action-object pairs.
arXiv Detail & Related papers (2022-07-26T11:37:44Z)
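As a concrete illustration of the joint-pair distance control idea in the InterControl summary above, here is a minimal PyTorch sketch of such an objective. The function name, joint indexing, and the 0.05 m target are illustrative assumptions, not the paper's actual loss.

```python
# Hedged sketch: penalize deviation between the current distance of two
# joints (e.g., one joint per person) and a desired target distance.
import torch

def joint_pair_distance_loss(joints_a, joints_b, pairs, target_dist):
    """joints_*: (T, J, 3) joint positions over T frames; pairs: list of
    (i, j) joint indices; target_dist: desired distances in meters."""
    losses = []
    for k, (i, j) in enumerate(pairs):
        d = (joints_a[:, i] - joints_b[:, j]).norm(dim=-1)  # per-frame distance
        losses.append(((d - target_dist[k]) ** 2).mean())
    return torch.stack(losses).mean()

# Example: keep person A's joint 21 about 0.05 m from person B's joint 20
# (hypothetical indices, e.g. right and left hands in a 22-joint skeleton).
T, J = 30, 22
ja, jb = torch.randn(T, J, 3), torch.randn(T, J, 3)
loss = joint_pair_distance_loss(ja, jb, pairs=[(21, 20)],
                                target_dist=torch.tensor([0.05]))
```

A loss of this form can be minimized over the synthesized motion, or used as guidance during sampling, so that the generated joints track the desired distances.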
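Similarly, the NIFTY summary describes a learned field that scores how far a pose is from the valid interaction manifold and uses it to guide diffusion sampling. The sketch below shows one plausible form of such gradient-based guidance; the network shape, the Softplus distance head, and the single-step update are assumptions, not NIFTY's implementation.

```python
# Hedged sketch: a field d(pose) estimates distance to the valid interaction
# manifold; its gradient nudges a pose sample toward valid interactions, as
# one might do inside each denoising step of a diffusion sampler.
import torch
import torch.nn as nn

class InteractionField(nn.Module):
    """Maps a body pose to an estimated distance (0 = valid interaction)."""
    def __init__(self, pose_dim=69, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Softplus(),  # non-negative distance
        )

    def forward(self, pose):
        return self.net(pose).squeeze(-1)

def guide_step(pose, field, step_size=0.1):
    """One guidance step down the field's distance gradient."""
    pose = pose.detach().requires_grad_(True)
    grad, = torch.autograd.grad(field(pose).sum(), pose)
    return (pose - step_size * grad).detach()

field = InteractionField()
pose = torch.randn(1, 69)       # a noisy pose sample
pose = guide_step(pose, field)  # nudged toward the interaction manifold
```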
This list is automatically generated from the titles and abstracts of the papers on this site.