Related papers: Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion

URL: http://arxiv.org/abs/2205.02830v4
Date: Mon, 18 Mar 2024 07:11:53 GMT
Title: Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion
Authors: Vladimir Guzov, Julian Chibane, Riccardo Marin, Yannan He, Yunus Saracoglu, Torsten Sattler, Gerard Pons-Moll
Abstract summary: Modeling changes caused by humans is essential for building digital twins. Our method combines visual localization of humans in the scene with contact-based reasoning about human-scene interactions from IMU data. Our code, data and model are available on our project page at http://virtualhumans.mpi-inf.mpg.de/ireplica/.
Score: 48.982957332374866
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Our world is not static and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (metaverses) and robotics. In order for widespread adoption of such emerging applications, the sensor setup used to capture the interactions needs to be inexpensive and easy-to-use for non-expert users. I.e., interactions should be captured and modeled by simple ego-centric sensors such as a combination of cameras and IMU sensors, not relying on any external cameras or object trackers. Yet, to the best of our knowledge, no work tackling the challenging problem of modeling human-scene interactions via such an ego-centric sensor setup exists. This paper closes this gap in the literature by developing a novel approach that combines visual localization of humans in the scene with contact-based reasoning about human-scene interactions from IMU data. Interestingly, we can show that even without visual observations of the interactions, human-scene contacts and interactions can be realistically predicted from human pose sequences. Our method, iReplica (Interaction Replica), is an essential first step towards the egocentric capture of human interactions and modeling of dynamic scenes, which is required for future AR/VR applications in immersive virtual universes and for training machines to behave like humans. Our code, data and model are available on our project page at http://virtualhumans.mpi-inf.mpg.de/ireplica/

Related papers

INTENTION: Inferring Tendencies of Humanoid Robot Motion Through Interactive Intuition and Grounded VLM [9.217332197059001]
In this work, we propose INTENTION, a novel framework enabling robots with learned interactive intuition and autonomous manipulation in diverse scenarios. We introduce Memory Graph to record scenes from previous task interactions which embodies human-like understanding and decision-making about different tasks in real world. Meanwhile, we design an Intuitive Perceptor that extracts physical relations and affordances from visual scenes.
arXiv Detail & Related papers (2025-08-06T23:27:22Z)
Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis [51.95817740348585]
Human-X is a novel framework designed to enable immersive and physically plausible human interactions across diverse entities. Our method jointly predicts actions and reactions in real-time using an auto-regressive reaction diffusion planner. Our framework is validated in real-world applications, including virtual reality interface for human-robot interaction.
arXiv Detail & Related papers (2025-08-04T06:35:48Z)
ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation [17.438484695828276]
We present ZeroHSI, a novel approach that enables zero-shot 4D human-scene interaction synthesis. Our key insight is to distill human-scene interactions from state-of-the-art video generation models. ZeroHSI can synthesize realistic human motions in both static scenes and environments with dynamic objects.
arXiv Detail & Related papers (2024-12-24T18:55:38Z)
Robot Interaction Behavior Generation based on Social Motion Forecasting for Human-Robot Interaction [9.806227900768926]
We propose to model social motion forecasting in a shared human-robot representation space. ECHO operates in the aforementioned shared space to predict the future motions of the agents encountered in social scenarios. We evaluate our model in multi-person and human-robot motion forecasting tasks and obtain state-of-the-art performance by a large margin.
arXiv Detail & Related papers (2024-02-07T11:37:14Z)
Revisit Human-Scene Interaction via Space Occupancy [55.67657438543008]
Human-scene Interaction (HSI) generation is a challenging task and crucial for various downstream tasks. In this work, we argue that interaction with a scene is essentially interacting with the space occupancy of the scene from an abstract physical perspective. By treating pure motion sequences as records of humans interacting with invisible scene occupancy, we can aggregate motion-only data into a large-scale paired human-occupancy interaction database.
arXiv Detail & Related papers (2023-12-05T12:03:00Z)
InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint [67.6297384588837]
We introduce a novel controllable motion generation method, InterControl, to encourage the synthesized motions to maintain the desired distance between joint pairs. We demonstrate that the distance between joint pairs for human-wise interactions can be generated using an off-the-shelf Large Language Model.
arXiv Detail & Related papers (2023-11-27T14:32:33Z)
NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis [21.650091018774972]
We create a neural interaction field attached to a specific object, which outputs the distance to the valid interaction manifold given a human pose as input. This interaction field guides the sampling of an object-conditioned human motion diffusion model. We synthesize realistic motions for sitting and lifting with several objects, outperforming alternative approaches in terms of motion quality and successful action completion.
arXiv Detail & Related papers (2023-07-14T17:59:38Z)
Compositional 3D Human-Object Neural Animation [93.38239238988719]
Human-object interactions (HOIs) are crucial for human-centric scene understanding applications such as human-centric visual generation, AR/VR, and robotics. In this paper, we address this challenge in HOI animation from a compositional perspective. We adopt neural human-object deformation to model and render HOI dynamics based on implicit neural representations.
arXiv Detail & Related papers (2023-04-27T10:04:56Z)
BEHAVE: Dataset and Method for Tracking Human Object Interactions [105.77368488612704]
We present the first full-body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits along with the annotated contacts between them. We use this data to learn a model that can jointly track humans and objects in natural environments with an easy-to-use portable multi-camera setup.
arXiv Detail & Related papers (2022-04-14T13:21:19Z)
Stochastic Scene-Aware Motion Prediction [41.6104600038666]
We present a novel data-driven motion synthesis method that models different styles of performing a given action with a target object. Our method, called SAMP, for Scene-Aware Motion Prediction, generalizes to target objects of various geometries while enabling the character to navigate in cluttered scenes.
arXiv Detail & Related papers (2021-08-18T17:56:17Z)
Visual Navigation Among Humans with Optimal Control as a Supervisor [72.5188978268463]
We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans. Our approach is enabled by our novel data-generation tool, HumANav. We demonstrate that the learned navigation policies can anticipate and react to humans without explicitly predicting future human motion.
arXiv Detail & Related papers (2020-03-20T16:13:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.