D&D: Learning Human Dynamics from Dynamic Camera
- URL: http://arxiv.org/abs/2209.08790v1
- Date: Mon, 19 Sep 2022 06:51:02 GMT
- Title: D&D: Learning Human Dynamics from Dynamic Camera
- Authors: Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu
- Abstract summary: We present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from in-the-wild videos with a moving camera.
Our approach is entirely neural-based and runs without offline optimization or simulation in physics engines.
- Score: 55.60512353465175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D human pose estimation from a monocular video has recently seen significant
improvements. However, most state-of-the-art methods are kinematics-based
and prone to physically implausible motions with pronounced artifacts.
Current dynamics-based methods can predict physically plausible motion but are
restricted to simple scenarios with a static camera view. In this work, we
present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the
laws of physics to reconstruct 3D human motion from in-the-wild videos with
a moving camera. D&D introduces inertial force control (IFC) to explain the 3D
human motion in the non-inertial local frame by considering the inertial forces
of the dynamic camera. To learn the ground contact with limited annotations, we
develop probabilistic contact torque (PCT), which is computed by differentiable
sampling from contact probabilities and used to generate motions. The contact
state can be weakly supervised by encouraging the model to generate correct
motions. Furthermore, we propose an attentive PD controller that adjusts target
pose states using temporal information to obtain smooth and accurate pose
control. Our approach is entirely neural-based and runs without offline
optimization or simulation in physics engines. Experiments on large-scale 3D
human motion benchmarks demonstrate the effectiveness of D&D, where we exhibit
superior performance against both state-of-the-art kinematics-based and
dynamics-based methods. Code is available at https://github.com/Jeffsjtu/DnD
Related papers
- EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
We introduce EgoGaussian, the first method capable of simultaneously reconstructing 3D scenes and tracking 3D object motion from RGB egocentric input alone.
Our approach employs a clip-level online learning pipeline that leverages the dynamic nature of human activities.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
- DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z)
- Physics-Guided Human Motion Capture with Pose Probability Modeling [35.159506668475565]
Existing solutions always adopt kinematic results as reference motions, and the physics is treated as a post-processing module.
We employ physics as denoising guidance in the reverse diffusion process to reconstruct human motion from a modeled pose probability distribution.
With several iterations, the physics-based tracking and kinematic denoising promote each other to generate a physically plausible human motion.
arXiv Detail & Related papers (2023-08-19T05:28:03Z)
- Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video [31.96672354594643]
We focus on the task of estimating a physically plausible articulated human motion from monocular video.
Existing approaches that do not consider physics often produce temporally inconsistent output with motion artifacts.
We show that our approach achieves competitive results with respect to existing physics-based methods on the Human3.6M benchmark.
arXiv Detail & Related papers (2022-05-24T18:02:49Z)
- Gravity-Aware Monocular 3D Human-Object Reconstruction [73.25185274561139]
This paper proposes a new approach for joint markerless 3D human motion capture and object trajectory estimation from monocular RGB videos.
We focus on scenes with objects partially observed during a free flight.
In the experiments, our approach achieves state-of-the-art accuracy in 3D human motion capture on various metrics.
arXiv Detail & Related papers (2021-08-19T17:59:57Z)
- PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time [89.68248627276955]
Marker-less 3D motion capture from a single colour camera has seen significant progress.
However, it is a very challenging and severely ill-posed problem.
We present PhysCap, the first algorithm for physically plausible, real-time and marker-less human 3D motion capture.
arXiv Detail & Related papers (2020-08-20T10:46:32Z)
- Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z)