PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time
- URL: http://arxiv.org/abs/2008.08880v2
- Date: Wed, 9 Dec 2020 14:18:55 GMT
- Title: PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time
- Authors: Soshi Shimada, Vladislav Golyanik, Weipeng Xu, Christian Theobalt
- Abstract summary: Marker-less 3D motion capture from a single colour camera has seen significant progress.
However, it is a very challenging and severely ill-posed problem.
We present PhysCap, the first algorithm for physically plausible, real-time and marker-less human 3D motion capture.
- Score: 89.68248627276955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Marker-less 3D human motion capture from a single colour camera has seen
significant progress. However, it is a very challenging and severely ill-posed
problem. In consequence, even the most accurate state-of-the-art approaches
have significant limitations. Purely kinematic formulations on the basis of
individual joints or skeletons, and the frequent frame-wise reconstruction in
state-of-the-art methods greatly limit 3D accuracy and temporal stability
compared to multi-view or marker-based motion capture. Further, captured 3D
poses are often physically incorrect and biomechanically implausible, or
exhibit implausible environment interactions (floor penetration, foot skating,
unnatural body leaning and strong shifting in depth), which is problematic for
any use case in computer graphics. We, therefore, present PhysCap, the first
algorithm for physically plausible, real-time and marker-less human 3D motion
capture with a single colour camera at 25 fps. Our algorithm first captures 3D
human poses purely kinematically. To this end, a CNN infers 2D and 3D joint
positions, and subsequently, an inverse kinematics step finds space-time
coherent joint angles and global 3D pose. Next, these kinematic reconstructions
are used as constraints in a real-time physics-based pose optimiser that
accounts for environment constraints (e.g., collision handling and floor
placement), gravity, and biophysical plausibility of human postures. Our
approach employs a combination of ground reaction force and residual force for
plausible root control, and uses a trained neural network to detect foot
contact events in images. Our method captures physically plausible and
temporally stable global 3D human motion, without physically implausible
postures, floor penetrations or foot skating, from video in real time and in
general scenes. The video is available at
http://gvv.mpi-inf.mpg.de/projects/PhysCap
Related papers
- ExFMan: Rendering 3D Dynamic Humans with Hybrid Monocular Blurry Frames and Events [7.820081911598502]
We propose ExFMan, the first neural rendering framework that renders high-quality humans in rapid motion with a hybrid frame-based RGB and bio-inspired event camera.
We first formulate a velocity field of the 3D body in the canonical space and render it to image space to identify the body parts with motion blur.
We then propose two novel losses, i.e., velocity-aware photometric loss and velocity-relative event loss, to optimize the neural human for both modalities.
arXiv Detail & Related papers (2024-09-21T10:58:01Z) - D&D: Learning Human Dynamics from Dynamic Camera [55.60512353465175]
We present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from the in-the-wild videos with a moving camera.
Our approach is entirely neural-based and runs without offline optimization or simulation in physics engines.
arXiv Detail & Related papers (2022-09-19T06:51:02Z) - Trajectory Optimization for Physics-Based Reconstruction of 3d Human
Pose from Monocular Video [31.96672354594643]
We focus on the task of estimating a physically plausible articulated human motion from monocular video.
Existing approaches that do not consider physics often produce temporally inconsistent output with motion artifacts.
We show that our approach achieves competitive results with respect to existing physics-based methods on the Human3.6M benchmark.
arXiv Detail & Related papers (2022-05-24T18:02:49Z) - Learning Motion Priors for 4D Human Body Capture in 3D Scenes [81.54377747405812]
We propose LEMO: LEarning human MOtion priors for 4D human body capture.
We introduce a novel motion prior, which reduces the jitters exhibited by poses recovered over a sequence.
We also design a contact friction term and a contact-aware motion infiller obtained via per-instance self-supervised training.
With our pipeline, we demonstrate high-quality 4D human body capture, reconstructing smooth motions and physically plausible body-scene interactions.
arXiv Detail & Related papers (2021-08-23T20:47:09Z) - Neural Monocular 3D Human Motion Capture with Physical Awareness [76.55971509794598]
We present a new trainable system for physically plausible markerless 3D human motion capture.
Unlike most neural methods for human motion capture, our approach is aware of physical and environmental constraints.
It produces smooth and physically principled 3D motions in an interactive frame rate in a wide variety of challenging scenes.
arXiv Detail & Related papers (2021-05-03T17:57:07Z) - Human POSEitioning System (HPS): 3D Human Pose Estimation and
Self-localization in Large Scenes from Body-Mounted Sensors [71.29186299435423]
We introduce (HPS) Human POSEitioning System, a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment.
We show that our optimization-based integration exploits the benefits of the two, resulting in pose accuracy free of drift.
HPS could be used for VR/AR applications where humans interact with the scene without requiring direct line of sight with an external camera.
arXiv Detail & Related papers (2021-03-31T17:58:31Z) - Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.