Scene-Aware 3D Multi-Human Motion Capture from a Single Camera
- URL: http://arxiv.org/abs/2301.05175v3
- Date: Mon, 27 Mar 2023 06:59:55 GMT
- Title: Scene-Aware 3D Multi-Human Motion Capture from a Single Camera
- Authors: Diogo Luvizon, Marc Habermann, Vladislav Golyanik, Adam Kortylewski,
Christian Theobalt
- Abstract summary: We consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera.
We leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks.
In particular, we estimate the scene depth and unique person scale from normalized disparity predictions using the 2D body joints and joint angles.
- Score: 83.06768487435818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we consider the problem of estimating the 3D position of
multiple humans in a scene as well as their body shape and articulation from a
single RGB video recorded with a static camera. In contrast to expensive
marker-based or multi-view systems, our lightweight setup is ideal for private
users as it enables an affordable 3D motion capture that is easy to install and
does not require expert knowledge. To deal with this challenging setting, we
leverage recent advances in computer vision using large-scale pre-trained
models for a variety of modalities, including 2D body joints, joint angles,
normalized disparity maps, and human segmentation masks. Thus, we introduce the
first non-linear optimization-based approach that jointly solves for the
absolute 3D position of each human, their articulated pose, their individual
shapes as well as the scale of the scene. In particular, we estimate the scene
depth and unique person scale from normalized disparity predictions using the
2D body joints and joint angles. Given the per-frame scene depth, we
reconstruct a point-cloud of the static scene in 3D space. Finally, given the
per-frame 3D estimates of the humans and scene point-cloud, we perform a
space-time coherent optimization over the video to ensure temporal, spatial and
physical plausibility. We evaluate our method on established multi-person 3D
human pose benchmarks where we consistently outperform previous methods and we
qualitatively demonstrate that our method is robust to in-the-wild conditions
including challenging scenes with people of different sizes.
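The abstract mentions two geometric steps: recovering scene depth and scale from normalized disparity, and reconstructing a scene point cloud from per-frame depth. A minimal sketch of both follows. It is not the authors' method, which solves a joint non-linear optimization. It assumes (a) that inverse depth is an affine function of normalized disparity, 1/z = a * d + b, a common assumption for scale-and-shift-invariant disparity predictors, (b) that approximate metric depths are available at a few body-joint pixels (for example from a fitted body model), and (c) a pinhole camera with known intrinsics. All function names and parameters are hypothetical.

```python
from typing import Sequence, Tuple


def fit_disparity_scale_shift(
    disparity: Sequence[float], metric_depth: Sequence[float]
) -> Tuple[float, float]:
    """Closed-form least-squares fit of 1/z = a * d + b.

    disparity:    normalized disparity sampled at joint pixels (assumed input)
    metric_depth: metric depth z > 0 of the same joints (assumed input)
    """
    if len(disparity) != len(metric_depth) or len(disparity) < 2:
        raise ValueError("need at least two paired samples")
    inv_depth = [1.0 / z for z in metric_depth]
    n = len(disparity)
    mean_d = sum(disparity) / n
    mean_i = sum(inv_depth) / n
    var_d = sum((d - mean_d) ** 2 for d in disparity)
    if var_d == 0.0:
        raise ValueError("disparity samples must not all be equal")
    cov = sum((d - mean_d) * (i - mean_i) for d, i in zip(disparity, inv_depth))
    a = cov / var_d            # scale of the affine map
    b = mean_i - a * mean_d    # shift of the affine map
    return a, b


def disparity_to_depth(d: float, a: float, b: float) -> float:
    """Convert one normalized disparity value to metric depth."""
    return 1.0 / (a * d + b)


def backproject(
    u: float, v: float, z: float, fx: float, fy: float, cx: float, cy: float
) -> Tuple[float, float, float]:
    """Lift pixel (u, v) with depth z to a 3D point in camera coordinates
    using the pinhole model."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


if __name__ == "__main__":
    # Hypothetical joint samples: disparities and their metric depths.
    joint_disp = [0.2, 0.5, 0.8]
    joint_depth = [1.0 / 0.18, 1.0 / 0.30, 1.0 / 0.42]
    a, b = fit_disparity_scale_shift(joint_disp, joint_depth)
    # Lift one scene pixel to 3D with hypothetical intrinsics.
    z = disparity_to_depth(0.35, a, b)
    print(backproject(400.0, 260.0, z, 500.0, 500.0, 320.0, 240.0))
```

Applying `disparity_to_depth` and `backproject` to every non-human pixel of a frame would give one per-frame point cloud of the static scene under these assumptions.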
Related papers
- Self-learning Canonical Space for Multi-view 3D Human Pose Estimation [57.969696744428475]
Multi-view 3D human pose estimation is naturally superior to its single-view counterpart.
Accurate annotation of this information is hard to obtain.
We propose a fully self-supervised framework, named cascaded multi-view aggregating network (CMANet).
CMANet is superior to state-of-the-art methods in extensive quantitative and qualitative analysis.
arXiv Detail & Related papers (2024-03-19T04:54:59Z)
- Weakly Supervised 3D Multi-person Pose Estimation for Large-scale Scenes based on Monocular Camera and Single LiDAR [41.39277657279448]
We propose a monocular camera and single LiDAR-based method for 3D multi-person pose estimation in large-scale scenes.
Specifically, we design an effective fusion strategy to take advantage of multi-modal input data, including images and point clouds.
Our method exploits the inherent geometry constraints of point cloud for self-supervision and utilizes 2D keypoints on images for weak supervision.
arXiv Detail & Related papers (2022-11-30T12:50:40Z)
- Embodied Scene-aware Human Pose Estimation [25.094152307452]
We propose embodied scene-aware human pose estimation.
Our method is one stage, causal, and recovers global 3D human poses in a simulated environment.
arXiv Detail & Related papers (2022-06-18T03:50:19Z)
- MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision [72.5863451123577]
We show how to train a neural model that can perform accurate 3D pose and camera estimation.
Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z)
- 3DCrowdNet: 2D Human Pose-Guided 3D Crowd Human Pose and Shape Estimation in the Wild [61.92656990496212]
3DCrowdNet is a 2D human pose-guided 3D crowd pose and shape estimation system for in-the-wild scenes.
We show that our 3DCrowdNet outperforms previous methods on in-the-wild crowd scenes.
arXiv Detail & Related papers (2021-04-15T08:21:28Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
- Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors [71.29186299435423]
We introduce the Human POSEitioning System (HPS), a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment.
We show that our optimization-based integration exploits the benefits of the two, resulting in pose accuracy free of drift.
HPS could be used for VR/AR applications where humans interact with the scene without requiring direct line of sight with an external camera.
arXiv Detail & Related papers (2021-03-31T17:58:31Z)
- SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation [46.85865451812981]
We propose a novel system that first regresses a set of 2.5D representations of body parts and then reconstructs the 3D absolute poses based on these 2.5D representations with a depth-aware part association algorithm.
Such a single-shot bottom-up scheme allows the system to better learn and reason about the inter-person depth relationship, improving both 3D and 2D pose estimation.
arXiv Detail & Related papers (2020-08-26T09:56:07Z)