Differentiable Dynamics for Articulated 3d Human Motion Reconstruction
- URL: http://arxiv.org/abs/2205.12256v1
- Date: Tue, 24 May 2022 17:58:37 GMT
- Title: Differentiable Dynamics for Articulated 3d Human Motion Reconstruction
- Authors: Erik Gärtner, Mykhaylo Andriluka, Erwin Coumans, Cristian Sminchisescu
- Abstract summary: We introduce DiffPhy, a differentiable physics-based model for articulated 3d human motion reconstruction from video.
We validate the model by demonstrating that it can accurately reconstruct physically plausible 3d human motion from monocular video.
- Score: 29.683633237503116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce DiffPhy, a differentiable physics-based model for articulated 3d
human motion reconstruction from video. Applications of physics-based reasoning
in human motion analysis have so far been limited, both by the complexity of
constructing adequate physical models of articulated human motion, and by the
formidable challenges of performing stable and efficient inference with physics
in the loop. We jointly address such modeling and inference challenges by
proposing an approach that combines a physically plausible body representation
with anatomical joint limits, a differentiable physics simulator, and
optimization techniques that ensure good performance and robustness to
suboptimal local optima. In contrast to several recent methods, our approach
readily supports full-body contact including interactions with objects in the
scene. Most importantly, our model connects end-to-end with images, thus
supporting direct gradient-based physics optimization by means of image-based
loss functions. We validate the model by demonstrating that it can accurately
reconstruct physically plausible 3d human motion from monocular video, both on
public benchmarks with available 3d ground-truth, and on videos from the
internet.
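To make the idea of "direct gradient-based physics optimization by means of image-based loss functions" concrete, here is a minimal sketch, not the DiffPhy implementation: a toy single-pendulum "body" is simulated with a differentiable rollout in JAX, and its control torques are optimized by descending the gradient of a 2D keypoint (image-space) loss computed through the simulator. The dynamics model, the projection, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (assumptions throughout): a differentiable dynamics rollout of a
# toy single-pendulum "body", with control torques optimized by gradient descent
# on an image-space (2D keypoint) loss computed through the simulator.
import jax
import jax.numpy as jnp

DT, STEPS, GRAVITY, LENGTH = 0.01, 100, 9.81, 1.0

def step(state, torque):
    """One explicit-Euler step of pendulum dynamics; state = (angle, angular velocity)."""
    theta, omega = state
    alpha = -(GRAVITY / LENGTH) * jnp.sin(theta) + torque
    omega = omega + DT * alpha
    theta = theta + DT * omega
    return jnp.array([theta, omega])

def rollout(torques, state0):
    """Simulate the trajectory and project each frame to a 2D 'keypoint'."""
    def body(state, torque):
        new_state = step(state, torque)
        keypoint = jnp.array([jnp.sin(new_state[0]), -jnp.cos(new_state[0])])
        return new_state, keypoint
    _, keypoints = jax.lax.scan(body, state0, torques)
    return keypoints

def image_loss(torques, state0, observed_2d):
    """Reprojection-style loss between simulated and observed 2D keypoints."""
    return jnp.mean(jnp.sum((rollout(torques, state0) - observed_2d) ** 2, axis=-1))

# Synthetic "observations" generated from known torques (stand-in for video keypoints).
state0 = jnp.array([0.1, 0.0])
observed_2d = rollout(0.5 * jnp.ones(STEPS), state0)

# Gradient-based physics optimization: differentiate the loss through the simulator.
torques = jnp.zeros(STEPS)
grad_fn = jax.jit(jax.grad(image_loss))
for _ in range(200):
    torques = torques - 0.5 * grad_fn(torques, state0, observed_2d)

print("final image loss:", image_loss(torques, state0, observed_2d))
```

In a full system the pendulum would be replaced by an articulated body with joint limits and contact, and the keypoint projection by a camera model, but the gradient path from image loss back to physical controls is the same.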
Related papers
- Learned Neural Physics Simulation for Articulated 3D Human Pose Reconstruction [30.51621591645056]
We propose a novel neural network approach, LARP, to model the dynamics of articulated human motion with contact.
Our neural architecture supports features typically found in traditional physics simulators.
To demonstrate the value of LARP we use it as a drop-in replacement for a state-of-the-art classical non-differentiable simulator in an existing video-based reconstruction framework.
arXiv Detail & Related papers (2024-10-15T19:42:45Z) - MultiPhys: Multi-Person Physics-aware 3D Motion Estimation [28.91813849219037]
- MultiPhys: Multi-Person Physics-aware 3D Motion Estimation [28.91813849219037]
We introduce MultiPhys, a method designed for recovering multi-person motion from monocular videos.
Our focus lies in capturing coherent spatial placement between pairs of individuals across varying degrees of engagement.
We devise a pipeline in which the motion estimated by a kinematic-based method is fed into a physics simulator in an autoregressive manner.
arXiv Detail & Related papers (2024-04-18T08:29:29Z) - Skeleton2Humanoid: Animating Simulated Characters for
- Skeleton2Humanoid: Animating Simulated Characters for Physically-plausible Motion In-betweening [59.88594294676711]
Modern deep learning based motion synthesis approaches barely consider the physical plausibility of synthesized motions.
We propose a system, "Skeleton2Humanoid", which performs physics-oriented motion correction at test time.
Experiments on the challenging LaFAN1 dataset show our system can outperform prior methods significantly in terms of both physical plausibility and accuracy.
arXiv Detail & Related papers (2022-10-09T16:15:34Z) - D&D: Learning Human Dynamics from Dynamic Camera [55.60512353465175]
We present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from the in-the-wild videos with a moving camera.
Our approach is entirely neural-based and runs without offline optimization or simulation in physics engines.
arXiv Detail & Related papers (2022-09-19T06:51:02Z) - Trajectory Optimization for Physics-Based Reconstruction of 3d Human
Pose from Monocular Video [31.96672354594643]
We focus on the task of estimating a physically plausible articulated human motion from monocular video.
Existing approaches that do not consider physics often produce temporally inconsistent output with motion artifacts.
We show that our approach achieves competitive results with respect to existing physics-based methods on the Human3.6M benchmark.
arXiv Detail & Related papers (2022-05-24T18:02:49Z) - Dynamic Visual Reasoning by Learning Differentiable Physics Models from
Video and Language [92.7638697243969]
We propose a unified framework that can jointly learn visual concepts and infer physics models of objects from videos and language.
This is achieved by seamlessly integrating three components: a visual perception module, a concept learner, and a differentiable physics engine.
arXiv Detail & Related papers (2021-10-28T17:59:13Z) - Physics-based Human Motion Estimation and Synthesis from Videos [0.0]
- Physics-based Human Motion Estimation and Synthesis from Videos [0.0]
We propose a framework for training generative models of physically plausible human motion directly from monocular RGB videos.
At the core of our method is a novel optimization formulation that corrects imperfect image-based pose estimations.
Results show that our physically-corrected motions significantly outperform prior work on pose estimation.
arXiv Detail & Related papers (2021-09-21T01:57:54Z) - Learning Local Recurrent Models for Human Mesh Recovery [50.85467243778406]
We present a new method for video mesh recovery that divides the human mesh into several local parts following the standard skeletal model.
We then model the dynamics of each local part with separate recurrent models, with each model conditioned appropriately based on the known kinematic structure of the human body.
This results in a structure-informed local recurrent learning architecture that can be trained in an end-to-end fashion with available annotations.
arXiv Detail & Related papers (2021-07-27T14:30:33Z) - Contact and Human Dynamics from Monocular Video [73.47466545178396]
- Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z) - Occlusion resistant learning of intuitive physics from videos [52.25308231683798]
A key ability for artificial systems is to understand physical interactions between objects and to predict the future outcomes of a situation.
This ability, often referred to as intuitive physics, has recently received attention and several methods were proposed to learn these physical rules from video sequences.
arXiv Detail & Related papers (2020-04-30T19:35:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.