Optimal-state Dynamics Estimation for Physics-based Human Motion Capture from Videos
- URL: http://arxiv.org/abs/2410.07795v2
- Date: Mon, 28 Oct 2024 09:36:25 GMT
- Title: Optimal-state Dynamics Estimation for Physics-based Human Motion Capture from Videos
- Authors: Cuong Le, Viktor Johansson, Manon Kok, Bastian Wandt
- Abstract summary: We propose a novel method to selectively incorporate the physics models with the kinematics observations in an online setting.
A recurrent neural network is introduced to realize a Kalman filter that attentively balances the kinematics input and simulated motion.
The proposed approach excels in the physics-based human pose estimation task and demonstrates the physical plausibility of the predictive dynamics.
- Score: 6.093379844890164
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human motion capture from monocular videos has made significant progress in recent years. However, modern approaches often produce temporal artifacts, e.g. in the form of jittery motion, and struggle to achieve smooth and physically plausible motions. Explicitly integrating physics, in the form of internal forces and exterior torques, helps alleviate these artifacts. Current state-of-the-art approaches make use of an automatic PD controller to predict torques and reaction forces in order to re-simulate the input kinematics, i.e. the joint angles of a predefined skeleton. However, due to imperfect physical models, these methods often require simplifying assumptions and extensive preprocessing of the input kinematics to achieve good performance. To this end, we propose a novel method to selectively incorporate the physics models with the kinematics observations in an online setting, inspired by a neural Kalman-filtering approach. We develop a control loop as a meta-PD controller to predict internal joint torques and external reaction forces, followed by a physics-based motion simulation. A recurrent neural network is introduced to realize a Kalman filter that attentively balances the kinematics input and simulated motion, resulting in an optimal-state dynamics prediction. We show that this filtering step is crucial to provide online supervision that helps balance the shortcomings of the respective input motions, and is thus important not only for capturing accurate global motion trajectories but also for producing physically plausible human poses. The proposed approach excels in the physics-based human pose estimation task and demonstrates the physical plausibility of the predictive dynamics, compared to the state of the art. The code is available at https://github.com/cuongle1206/OSDCap
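The loop described in the abstract (PD control, physics simulation, then a Kalman-style blend of the simulated and kinematic states) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single 1-DoF joint, hand-picked gains `kp` and `kd`, a unit inertia, and a fixed scalar `gain`, whereas the paper predicts these quantities with a recurrent neural network. All function names here are hypothetical.

```python
import numpy as np

def pd_torque(q_target, q, qd, kp=60.0, kd=8.0):
    """PD control law: torque pulling joint angle q toward q_target (gains are assumed)."""
    return kp * (q_target - q) - kd * qd

def simulate_step(q, qd, tau, inertia=1.0, dt=1.0 / 50.0):
    """One semi-implicit Euler step of a 1-DoF joint driven by torque tau."""
    qd_next = qd + dt * tau / inertia
    q_next = q + dt * qd_next
    return q_next, qd_next

def kalman_blend(q_sim, q_kin, gain):
    """Kalman-style update: correct the simulated state with the kinematic observation."""
    return q_sim + gain * (q_kin - q_sim)

def track(q_kin_seq, gain=0.3):
    """Online loop: control, simulate, then blend, one frame at a time."""
    q, qd = float(q_kin_seq[0]), 0.0
    out = [q]
    for q_kin in q_kin_seq[1:]:
        tau = pd_torque(q_kin, q, qd)          # control toward the kinematic input
        q_sim, qd = simulate_step(q, qd, tau)  # physics prediction
        q = kalman_blend(q_sim, q_kin, gain)   # fixed gain; only the angle is corrected
        out.append(q)
    return np.array(out)

# Jittery 1-DoF joint-angle track standing in for per-frame kinematics estimates.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 101)
noisy = np.sin(np.pi * t) + 0.05 * rng.standard_normal(t.shape)
filtered = track(noisy)
```

With `gain=0` the output follows the simulation alone, and with `gain=1` it reproduces the kinematic input exactly; intermediate values trade off the two sources, which is the balance the paper learns per frame rather than fixing by hand.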
Related papers
- Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems [49.11170948406405]
State-of-the-art in automatic parameter estimation from video is addressed by training supervised deep networks on large datasets.
We propose a method to estimate the physical parameters of any known, continuous governing equation from single videos.
arXiv Detail & Related papers (2024-10-02T09:44:54Z)
- Physics-Guided Human Motion Capture with Pose Probability Modeling [35.159506668475565]
Existing solutions always adopt kinematic results as reference motions, and the physics is treated as a post-processing module.
We employ physics as denoising guidance in the reverse diffusion process to reconstruct human motion from a modeled pose probability distribution.
With several iterations, the physics-based tracking and kinematic denoising promote each other to generate a physically plausible human motion.
arXiv Detail & Related papers (2023-08-19T05:28:03Z)
- Skeleton2Humanoid: Animating Simulated Characters for Physically-plausible Motion In-betweening [59.88594294676711]
Modern deep learning based motion synthesis approaches barely consider the physical plausibility of synthesized motions.
We propose a system, "Skeleton2Humanoid", which performs physics-oriented motion correction at test time.
Experiments on the challenging LaFAN1 dataset show our system can outperform prior methods significantly in terms of both physical plausibility and accuracy.
arXiv Detail & Related papers (2022-10-09T16:15:34Z)
- D&D: Learning Human Dynamics from Dynamic Camera [55.60512353465175]
We present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from in-the-wild videos with a moving camera.
Our approach is entirely neural-based and runs without offline optimization or simulation in physics engines.
arXiv Detail & Related papers (2022-09-19T06:51:02Z)
- Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video [31.96672354594643]
We focus on the task of estimating a physically plausible articulated human motion from monocular video.
Existing approaches that do not consider physics often produce temporally inconsistent output with motion artifacts.
We show that our approach achieves competitive results with respect to existing physics-based methods on the Human3.6M benchmark.
arXiv Detail & Related papers (2022-05-24T18:02:49Z)
- Differentiable Dynamics for Articulated 3d Human Motion Reconstruction [29.683633237503116]
We introduce DiffPhy, a differentiable physics-based model for articulated 3d human motion reconstruction from video.
We validate the model by demonstrating that it can accurately reconstruct physically plausible 3d human motion from monocular video.
arXiv Detail & Related papers (2022-05-24T17:58:37Z)
- Neural MoCon: Neural Motion Control for Physically Plausible Human Motion Capture [12.631678059354593]
We exploit a high-precision, non-differentiable physics simulator to incorporate dynamical constraints in motion capture.
Our key idea is to use real physical supervision to train a target pose distribution prior for sampling-based motion control.
Results show that we can obtain physically plausible human motion with complex terrain interactions, human shape variations, and diverse behaviors.
arXiv Detail & Related papers (2022-03-26T12:48:41Z)
- Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study on various pose representations with a focus on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms the state-of-the-art methods in short-term prediction and achieves much enhanced long-term prediction proficiency.
arXiv Detail & Related papers (2021-12-30T10:45:22Z)
- Generating Smooth Pose Sequences for Diverse Human Motion Prediction [90.45823619796674]
We introduce a unified deep generative network for both diverse and controllable motion prediction.
Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy.
arXiv Detail & Related papers (2021-08-19T00:58:00Z)
- Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.