EgoBody: Human Body Shape, Motion and Social Interactions from
Head-Mounted Devices
- URL: http://arxiv.org/abs/2112.07642v1
- Date: Tue, 14 Dec 2021 18:41:28 GMT
- Title: EgoBody: Human Body Shape, Motion and Social Interactions from
Head-Mounted Devices
- Authors: Siwei Zhang, Qianli Ma, Yan Zhang, Zhiyin Qian, Marc Pollefeys,
Federica Bogo, Siyu Tang
- Abstract summary: EgoBody is a novel large-scale dataset for social interactions in complex 3D scenes.
We employ Microsoft HoloLens2 headsets to record rich egocentric data streams including RGB, depth, eye gaze, head and hand tracking.
To obtain accurate 3D ground-truth, we calibrate the headset with a multi-Kinect rig and fit expressive SMPL-X body meshes to multi-view RGB-D frames.
- Score: 76.50816193153098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding social interactions from first-person views is crucial for many
applications, ranging from assistive robotics to AR/VR. A first step for
reasoning about interactions is to understand human pose and shape. However,
research in this area is currently hindered by the lack of data. Existing
datasets are limited in terms of either size, annotations, ground-truth capture
modalities or the diversity of interactions. We address this shortcoming by
proposing EgoBody, a novel large-scale dataset for social interactions in
complex 3D scenes. We employ Microsoft HoloLens2 headsets to record rich
egocentric data streams (including RGB, depth, eye gaze, head and hand
tracking). To obtain accurate 3D ground-truth, we calibrate the headset with a
multi-Kinect rig and fit expressive SMPL-X body meshes to multi-view RGB-D
frames, reconstructing 3D human poses and shapes relative to the scene. We
collect 68 sequences, spanning diverse sociological interaction categories, and
propose the first benchmark for 3D full-body pose and shape estimation from
egocentric views. Our dataset and code will be available for research at
https://sanweiliti.github.io/egobody/egobody.html.
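To make the ground-truth pipeline above concrete, the following is a minimal sketch (not the authors' released code) of fitting an SMPL-X body model to 3D joints triangulated from the calibrated multi-Kinect views. It assumes the open-source smplx Python package, a local SMPL-X model directory ("models"), and a hypothetical target_joints_3d tensor standing in for the triangulated multi-view joints.

    import torch
    import smplx

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # SMPL-X body model (the "models" path is a placeholder for a local model folder).
    model = smplx.create("models", model_type="smplx", gender="neutral",
                         use_pca=False, batch_size=1).to(device)

    # Hypothetical target: 3D body joints in the rig/world frame, triangulated
    # from the calibrated multi-view RGB-D frames; shape (1, 22, 3).
    target_joints_3d = torch.zeros(1, 22, 3, device=device)

    # Free variables: global rotation/translation, body pose (21 joints x 3), shape.
    global_orient = torch.zeros(1, 3, device=device, requires_grad=True)
    transl = torch.zeros(1, 3, device=device, requires_grad=True)
    body_pose = torch.zeros(1, 63, device=device, requires_grad=True)
    betas = torch.zeros(1, 10, device=device, requires_grad=True)

    optimizer = torch.optim.Adam([global_orient, transl, body_pose, betas], lr=0.02)

    for step in range(500):
        optimizer.zero_grad()
        output = model(betas=betas, body_pose=body_pose,
                       global_orient=global_orient, transl=transl)
        pred_joints = output.joints[:, :22]  # main SMPL-X body joints
        data_term = ((pred_joints - target_joints_3d) ** 2).sum()
        # Simple regularizers keep pose and shape plausible; a real pipeline
        # would use learned priors and per-view reprojection terms instead.
        reg_term = 1e-3 * (body_pose ** 2).sum() + 1e-3 * (betas ** 2).sum()
        loss = data_term + reg_term
        loss.backward()
        optimizer.step()

A full fitting pipeline would additionally use 2D keypoint reprojection terms per Kinect view, depth and scene constraints, and temporal smoothing, but the core per-frame optimization has this structure.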
Related papers
- 3D Human Pose Perception from Egocentric Stereo Videos [67.9563319914377]
We propose a new transformer-based framework to improve egocentric stereo 3D human pose estimation.
Our method is able to accurately estimate human poses even in challenging scenarios, such as crouching and sitting.
We will release UnrealEgo2, UnrealEgo-RW, and trained models on our project page.
arXiv Detail & Related papers (2023-12-30T21:21:54Z)
- DECO: Dense Estimation of 3D Human-Scene Contact In The Wild [54.44345845842109]
We train a novel 3D contact detector that uses both body-part-driven and scene-context-driven attention to estimate contact on the SMPL body.
We significantly outperform existing SOTA methods across all benchmarks.
We also show qualitatively that DECO generalizes well to diverse and challenging real-world human interactions in natural images.
arXiv Detail & Related papers (2023-09-26T21:21:07Z)
- Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured f-AHOI dataset consisting of 16.2 hours of versatile interactions.
CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process.
By learning the geometrical relationships in HOI, we devise the very first model that leverages human pose estimation to tackle articulated object pose and shape estimation during whole-body interactions.
arXiv Detail & Related papers (2022-12-20T19:50:54Z)
- Ego-Body Pose Estimation via Ego-Head Pose Estimation [22.08240141115053]
Estimating 3D human motion from an egocentric video sequence plays a critical role in human behavior understanding and has various applications in VR/AR.
We propose a new method, Ego-Body Pose Estimation via Ego-Head Pose Estimation (EgoEgo), which decomposes the problem into two stages, connected by the head motion as an intermediate representation.
This disentanglement of head and body pose eliminates the need for training datasets with paired egocentric videos and 3D human motion.
arXiv Detail & Related papers (2022-12-09T02:25:20Z)
- FLEX: Full-Body Grasping Without Full-Body Grasps [24.10724524386518]
We address the task of generating a virtual human -- hands and full body -- grasping everyday objects.
Existing methods approach this problem by collecting a 3D dataset of humans interacting with objects and training on this data.
We leverage the existence of both full-body pose and hand grasping priors, composing them using 3D geometrical constraints to obtain full-body grasps.
arXiv Detail & Related papers (2022-11-21T23:12:54Z)
- BEHAVE: Dataset and Method for Tracking Human Object Interactions [105.77368488612704]
We present the first full-body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits, along with annotated contacts between them.
We use this data to learn a model that can jointly track humans and objects in natural environments with an easy-to-use portable multi-camera setup.
arXiv Detail & Related papers (2022-04-14T13:21:19Z)
- D3D-HOI: Dynamic 3D Human-Object Interactions from Videos [49.38319295373466]
We introduce D3D-HOI: a dataset of monocular videos with ground truth annotations of 3D object pose, shape and part motion during human-object interactions.
Our dataset consists of several common articulated objects captured from diverse real-world scenes and camera viewpoints.
We leverage the estimated 3D human pose for more accurate inference of the object spatial layout and dynamics.
arXiv Detail & Related papers (2021-08-19T00:49:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.