Social EgoMesh Estimation
- URL: http://arxiv.org/abs/2411.04598v1
- Date: Thu, 07 Nov 2024 10:28:49 GMT
- Title: Social EgoMesh Estimation
- Authors: Luca Scofano, Alessio Sampieri, Edoardo De Matteis, Indro Spinelli, Fabio Galasso
- Abstract summary: We propose a novel framework for Social Egocentric Estimation of body MEshes (SEE-ME).
Our approach is the first to estimate the wearer's mesh using only a latent probabilistic diffusion model.
Overall, SEE-ME surpasses the current best technique, reducing the pose estimation error (MPJPE) by 53%.
- Score: 7.021561988248192
- Abstract: Accurately estimating the 3D pose of the camera wearer in egocentric video sequences is crucial to modeling human behavior in virtual and augmented reality applications. The task presents unique challenges due to the limited visibility of the user's body caused by the front-facing camera mounted on their head. Recent research has explored the utilization of the scene and ego-motion, but it has overlooked humans' interactive nature. We propose a novel framework for Social Egocentric Estimation of body MEshes (SEE-ME). Our approach is the first to estimate the wearer's mesh using only a latent probabilistic diffusion model, which we condition on the scene and, for the first time, on the social wearer-interactee interactions. Our in-depth study sheds light on when social interaction matters most for ego-mesh estimation; it quantifies the impact of interpersonal distance and gaze direction. Overall, SEE-ME surpasses the current best technique, reducing the pose estimation error (MPJPE) by 53%. The code is available at https://github.com/L-Scofano/SEEME.
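The reported MPJPE (Mean Per-Joint Position Error) is the average Euclidean distance between predicted and ground-truth 3D joint positions. A minimal NumPy sketch of the metric (the array shapes and millimeter units are assumptions for illustration; SEE-ME's exact evaluation protocol may differ):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: the mean Euclidean distance
    between predicted and ground-truth 3D joints (commonly in mm)."""
    # pred, gt: arrays of shape (num_frames, num_joints, 3)
    return np.linalg.norm(pred - gt, axis=-1).mean()

# Toy example: two frames, three joints, every joint off by 30 mm along x.
gt = np.zeros((2, 3, 3))
pred = gt + np.array([30.0, 0.0, 0.0])
print(mpjpe(pred, gt))  # 30.0
```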
Related papers
- Estimating Body and Hand Motion in an Ego-sensed World [64.08911275906544]
We present EgoAllo, a system for human motion estimation from a head-mounted device.
Using only egocentric SLAM poses and images, EgoAllo guides sampling from a conditional diffusion model to estimate 3D body pose, height, and hand parameters.
arXiv Detail & Related papers (2024-10-04T17:59:57Z)
- EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views [51.53089073920215]
Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-centric perception.
Existing methods primarily leverage observations of HOI to capture interaction regions from an exocentric view.
We present EgoChoir, which links object structures with interaction contexts inherent in appearance and head motion to reveal object affordance.
arXiv Detail & Related papers (2024-05-22T14:03:48Z)
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- 3D Human Pose Perception from Egocentric Stereo Videos [67.9563319914377]
We propose a new transformer-based framework to improve egocentric stereo 3D human pose estimation.
Our method is able to accurately estimate human poses even in challenging scenarios, such as crouching and sitting.
We will release UnrealEgo2, UnrealEgo-RW, and trained models on our project page.
arXiv Detail & Related papers (2023-12-30T21:21:54Z)
- Generative Proxemics: A Prior for 3D Social Interaction from Images [32.547187575678464]
Social interaction is a fundamental aspect of human behavior and communication.
We present a novel approach that learns a prior over the 3D proxemics of two people in close social interaction.
Our approach recovers accurate and plausible 3D social interactions from noisy initial estimates, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2023-06-15T17:59:20Z)
- Ego-Body Pose Estimation via Ego-Head Pose Estimation [22.08240141115053]
Estimating 3D human motion from an egocentric video sequence plays a critical role in human behavior understanding and has various applications in VR/AR.
We propose a new method, Ego-Body Pose Estimation via Ego-Head Pose Estimation (EgoEgo), which decomposes the problem into two stages, connected by the head motion as an intermediate representation.
This disentanglement of head and body pose eliminates the need for training datasets with paired egocentric videos and 3D human motion.
arXiv Detail & Related papers (2022-12-09T02:25:20Z)
- EgoBody: Human Body Shape, Motion and Social Interactions from Head-Mounted Devices [76.50816193153098]
EgoBody is a novel large-scale dataset for social interactions in complex 3D scenes.
We employ Microsoft HoloLens2 headsets to record rich egocentric data streams including RGB, depth, eye gaze, head and hand tracking.
To obtain accurate 3D ground-truth, we calibrate the headset with a multi-Kinect rig and fit expressive SMPL-X body meshes to multi-view RGB-D frames.
arXiv Detail & Related papers (2021-12-14T18:41:28Z)
- Perceiving Humans: from Monocular 3D Localization to Social Distancing [93.03056743850141]
We present a new cost-effective vision-based method that perceives humans' locations in 3D and their body orientation from a single image.
We show that "social distancing" can be rethought as a form of social interaction rather than a simple location-based rule.
arXiv Detail & Related papers (2020-09-01T10:12:30Z)
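Several of the papers above (SEE-ME, Generative Proxemics, Perceiving Humans) reason about social cues such as interpersonal distance and gaze direction. A minimal sketch of how these two quantities can be computed from 3D positions; the function names and conventions are illustrative assumptions, not any paper's implementation:

```python
import numpy as np

def interpersonal_distance(pos_a, pos_b):
    """Euclidean distance between two people's root (e.g. pelvis) positions, in meters."""
    return float(np.linalg.norm(np.asarray(pos_a, dtype=float) - np.asarray(pos_b, dtype=float)))

def gaze_alignment(head_pos, gaze_dir, target_pos):
    """Cosine between the wearer's gaze direction and the direction to the
    interactee: 1.0 = looking straight at them, -1.0 = facing directly away."""
    to_target = np.asarray(target_pos, dtype=float) - np.asarray(head_pos, dtype=float)
    to_target /= np.linalg.norm(to_target)
    gaze = np.asarray(gaze_dir, dtype=float)
    gaze /= np.linalg.norm(gaze)
    return float(np.dot(gaze, to_target))

# Toy example: interactee standing 1.5 m ahead of the wearer along x.
print(interpersonal_distance([0, 0, 0], [1.5, 0, 0]))         # 1.5
print(gaze_alignment([0, 0, 1.7], [1, 0, 0], [1.5, 0, 1.7]))  # 1.0
```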