Neural Monocular 3D Human Motion Capture with Physical Awareness
- URL: http://arxiv.org/abs/2105.01057v1
- Date: Mon, 3 May 2021 17:57:07 GMT
- Title: Neural Monocular 3D Human Motion Capture with Physical Awareness
- Authors: Soshi Shimada and Vladislav Golyanik and Weipeng Xu and Patrick Pérez and Christian Theobalt
- Abstract summary: We present a new trainable system for physically plausible markerless 3D human motion capture.
Unlike most neural methods for human motion capture, our approach is aware of physical and environmental constraints.
It produces smooth and physically principled 3D motions at interactive frame rates in a wide variety of challenging scenes.
- Score: 76.55971509794598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new trainable system for physically plausible markerless 3D
human motion capture, which achieves state-of-the-art results in a broad range
of challenging scenarios. Unlike most neural methods for human motion capture,
our approach, which we dub physionical, is aware of physical and environmental
constraints. It combines in a fully differentiable way several key innovations,
i.e., 1. a proportional-derivative controller, with gains predicted by a neural
network, that reduces delays even in the presence of fast motions, 2. an
explicit rigid body dynamics model and 3. a novel optimisation layer that
prevents physically implausible foot-floor penetration as a hard constraint.
The inputs to our system are 2D joint keypoints, which are canonicalised in a
novel way so as to reduce the dependency on intrinsic camera parameters -- both
at train and test time. This enables more accurate global translation
estimation without generalisability loss. Our model can be fine-tuned with
2D annotations alone when 3D annotations are not available. It produces smooth
and physically principled 3D motions at interactive frame rates in a wide
variety of challenging scenes, including newly recorded ones. Its advantages
are especially noticeable on in-the-wild sequences that significantly differ
from common 3D pose estimation benchmarks such as Human3.6M and MPI-INF-3DHP.
Qualitative results are available at
http://gvv.mpi-inf.mpg.de/projects/PhysAware/
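
The abstract's first innovation, a PD controller whose gains are predicted by a neural network and which drives an explicit dynamics model subject to a hard floor constraint, can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the paper's implementation: `predict_gains` stands in for the gain network, the rigid body dynamics are reduced to a unit-mass Euler step, and the paper's optimisation layer is replaced by a simple clamp.

```python
# Minimal sketch (NumPy) of a PD control loop with predicted gains.
# All names and constants here are illustrative, not from the paper's code.
import numpy as np

FLOOR_HEIGHT = 0.0   # assumed world floor plane at height 0 (illustrative)
DT = 1.0 / 25.0      # assumed frame interval (illustrative)

def predict_gains(q, q_ref):
    """Stand-in for the paper's gain-predicting network: returns per-DoF
    proportional (kp) and derivative (kd) gains. Here they are constants;
    in the paper a neural network predicts them per frame."""
    kp = np.full_like(q, 300.0)
    kd = 2.0 * np.sqrt(kp)          # critically-damped heuristic
    return kp, kd

def pd_step(q, qdot, q_ref, mass=1.0):
    """One control/simulation step: PD torques -> accelerations -> explicit
    Euler integration. The paper uses a full rigid body dynamics model;
    a unit-mass point model keeps the sketch short."""
    kp, kd = predict_gains(q, q_ref)
    tau = kp * (q_ref - q) - kd * qdot   # PD control law toward the target
    qddot = tau / mass
    qdot = qdot + DT * qddot
    q = q + DT * qdot
    return q, qdot

def enforce_floor(joint_heights):
    """Hard non-penetration constraint. The paper formulates this as an
    optimisation layer; clamping is the simplest projection with the same
    effect on isolated height values."""
    return np.maximum(joint_heights, FLOOR_HEIGHT)
```

High gains track fast motions with little delay but amplify noise; predicting kp and kd per frame, as the abstract describes, lets the controller trade these off dynamically.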
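
The abstract also mentions canonicalising the 2D keypoints so that the model depends less on the intrinsic camera parameters. The paper describes its scheme as novel and does not spell it out here; the sketch below shows only the standard pinhole normalisation that such a step typically builds on (`fx`, `fy`, `cx`, `cy` are assumed intrinsics, not taken from the paper).

```python
# Standard intrinsics normalisation of 2D keypoints (illustrative only).
import numpy as np

def canonicalise_keypoints(kp_px, fx, fy, cx, cy):
    """Map pixel-space keypoints of shape (N, 2) into normalised camera
    coordinates by removing the focal length and principal point."""
    kp = np.asarray(kp_px, dtype=np.float64)
    out = np.empty_like(kp)
    out[:, 0] = (kp[:, 0] - cx) / fx   # x: shift by principal point, scale by focal
    out[:, 1] = (kp[:, 1] - cy) / fy   # y: same for the vertical axis
    return out
```

Because the focal length and principal point are divided out, a model trained with one camera sees inputs in the same canonical space at test time with a different camera, which matches the motivation the abstract gives.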
Related papers
- Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs [15.017274891943162]
Temporal 3D human pose estimation from monocular videos is a challenging task in human-centered computer vision.
Inertial sensors have been introduced to provide a complementary source of information.
It remains challenging to integrate such heterogeneous sensor data to produce physically rational 3D human poses.
arXiv Detail & Related papers (2024-04-27T09:02:42Z) - NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D
Human Pose and Shape Estimation [53.25973084799954]
We present NIKI (Neural Inverse Kinematics with Invertible Neural Networks), which models bi-directional errors.
NIKI can learn from both the forward and inverse processes with invertible networks.
arXiv Detail & Related papers (2023-05-15T12:13:24Z) - MotionBERT: A Unified Perspective on Learning Human Motion
Representations [46.67364057245364]
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.
We propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations.
We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network.
arXiv Detail & Related papers (2022-10-12T19:46:25Z) - MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks [77.56526918859345]
We present a novel framework that brings the 3D motion retargeting task from controlled environments to in-the-wild scenarios.
It is capable of retargeting body motion from a character in a 2D monocular video to a 3D character without using any motion capture system or 3D reconstruction procedure.
arXiv Detail & Related papers (2021-12-19T07:52:05Z) - PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time [89.68248627276955]
Marker-less 3D motion capture from a single colour camera has seen significant progress.
However, it is a very challenging and severely ill-posed problem.
We present PhysCap, the first algorithm for physically plausible, real-time and marker-less human 3D motion capture.
arXiv Detail & Related papers (2020-08-20T10:46:32Z) - Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video; these estimates are approximately accurate but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z) - Motion Guided 3D Pose Estimation from Videos [81.14443206968444]
We propose a new loss function, called motion loss, for monocular 3D human pose estimation from 2D poses.
In computing motion loss, a simple yet effective representation of keypoint motion, called pairwise motion encoding, is introduced (see the sketch after this list).
We design a new graph convolutional network architecture, U-shaped GCN (UGCN), which captures both short-term and long-term motion information.
arXiv Detail & Related papers (2020-04-29T06:59:30Z) - 3D Human Pose Estimation using Spatio-Temporal Networks with Explicit
Occlusion Training [40.933783830017035]
Estimating 3D poses from monocular video is still a challenging task, despite the significant progress made in recent years.
We introduce a spatio-temporal video network for robust 3D human pose estimation.
We apply multi-scale spatial features for 2D joint or keypoint prediction in each individual frame, and multi-stride temporal convolutional networks (TCNs) to estimate 3D joints or keypoints.
arXiv Detail & Related papers (2020-04-07T09:12:12Z)
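
As referenced above, the pairwise motion encoding behind the motion loss in "Motion Guided 3D Pose Estimation from Videos" can be sketched briefly. The exact encoding is defined in that paper; the version below is only a plausible reading, taking temporal differences of pairwise joint offsets and comparing prediction and ground truth with an L1 distance.

```python
# Hedged sketch of a pairwise motion encoding and motion loss; the precise
# definition is in the cited paper, and this is an illustrative reading.
import numpy as np

def pairwise_motion_encoding(poses):
    """poses: (T, J, 3) sequence of 3D joint positions. Returns the
    temporal differences of all pairwise joint offsets."""
    offsets = poses[:, :, None, :] - poses[:, None, :, :]   # (T, J, J, 3)
    return offsets[1:] - offsets[:-1]                       # (T-1, J, J, 3)

def motion_loss(pred, gt):
    """L1 distance between the motion encodings of two pose sequences."""
    return np.abs(pairwise_motion_encoding(pred)
                  - pairwise_motion_encoding(gt)).mean()
```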