Towards Single Camera Human 3D-Kinematics
- URL: http://arxiv.org/abs/2301.05435v1
- Date: Fri, 13 Jan 2023 08:44:09 GMT
- Title: Towards Single Camera Human 3D-Kinematics
- Authors: Marian Bittner, Wei-Tse Yang, Xucong Zhang, Ajay Seth, Jan van Gemert
and Frans C. T. van der Helm
- Abstract summary: We propose a novel approach for direct 3D human kinematic estimation D3KE from videos using deep neural networks.
Our experiments demonstrate that the proposed end-to-end training is robust and outperforms 2D and 3D markerless, motion-capture-based kinematic estimation pipelines.
- Score: 15.559206592078425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Markerless estimation of 3D kinematics has great potential to clinically
diagnose and monitor movement disorders without referrals to expensive motion
capture labs; however, current approaches are limited by performing multiple
de-coupled steps to estimate the kinematics of a person from videos. Most
current techniques work in a multi-step approach by first detecting the pose of
the body and then fitting a musculoskeletal model to the data for accurate
kinematic estimation. Errors in the training data of the pose detection algorithms,
model scaling, as well as the requirement of multiple cameras limit the use of
these techniques in a clinical setting. Our goal is to pave the way toward
fast, easily applicable, and accurate 3D kinematic estimation. To this end, we
propose a novel approach for direct 3D human
kinematic estimation (D3KE) from videos using deep neural networks. Our
experiments demonstrate that the proposed end-to-end training is robust and
outperforms 2D and 3D markerless, motion-capture-based kinematic estimation
pipelines in terms of joint angle error by a large margin (35%, from 5.44 to
3.54 degrees). We show that D3KE is superior to the multi-step approach and can
run at video framerate speeds. This technology shows the potential for clinical
analysis from mobile devices in the future.
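The reported improvement can be checked with a short sketch. Only the 5.44° and 3.54° mean joint-angle errors come from the abstract; the per-joint helper function and its names are illustrative assumptions, not the paper's evaluation code.

```python
def mean_joint_angle_error(pred, gt):
    """Mean absolute joint-angle error in degrees (illustrative helper)."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)

# Mean errors reported in the abstract, in degrees.
baseline_error = 5.44  # multi-step (pose detection + model fitting) pipeline
d3ke_error = 3.54      # direct end-to-end estimation (D3KE)

# Relative improvement of D3KE over the multi-step baseline.
improvement = (baseline_error - d3ke_error) / baseline_error
print(f"{improvement:.0%}")  # → 35%
```

The relative reduction (5.44 − 3.54) / 5.44 ≈ 0.349 rounds to the 35% figure quoted in the abstract.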
Related papers
- 3D Kinematics Estimation from Video with a Biomechanical Model and
Synthetic Training Data [4.130944152992895]
We propose a novel biomechanics-aware network that directly outputs 3D kinematics from two input views.
Our experiments demonstrate that the proposed approach, only trained on synthetic data, outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2024-02-20T17:33:40Z) - HMP: Hand Motion Priors for Pose and Shape Estimation from Video [52.39020275278984]
We develop a generative motion prior specific for hands, trained on the AMASS dataset which features diverse and high-quality hand motions.
Our integration of a robust motion prior significantly enhances performance, especially in occluded scenarios.
We demonstrate our method's efficacy via qualitative and quantitative evaluations on the HO3D and DexYCB datasets.
arXiv Detail & Related papers (2023-12-27T22:35:33Z) - Scene-Aware 3D Multi-Human Motion Capture from a Single Camera [83.06768487435818]
We consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera.
We leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks.
In particular, we estimate the scene depth and unique person scale from normalized disparity predictions using the 2D body joints and joint angles.
arXiv Detail & Related papers (2023-01-12T18:01:28Z) - MotionBERT: A Unified Perspective on Learning Human Motion
Representations [46.67364057245364]
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.
We propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations.
We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network.
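The pretraining idea above can be sketched at the data level: corrupt clean 2D keypoints with noise and random masking to simulate "noisy partial 2D observations", then train an encoder to recover the 3D motion. The corruption step below is a minimal illustrative sketch; the function name and parameters are assumptions, not MotionBERT's actual pipeline or its DSTformer architecture.

```python
import random

def corrupt_2d(keypoints_2d, mask_prob=0.2, noise=0.05):
    """Simulate noisy, partial 2D observations from clean keypoints.

    keypoints_2d: list of (x, y) joint coordinates for one frame.
    mask_prob:    probability a joint is dropped (simulating occlusion).
    noise:        max uniform perturbation per coordinate (detector noise).
    """
    corrupted = []
    for (x, y) in keypoints_2d:
        if random.random() < mask_prob:
            corrupted.append((0.0, 0.0))  # masked-out joint
        else:
            corrupted.append((x + random.uniform(-noise, noise),
                              y + random.uniform(-noise, noise)))
    return corrupted
```

During pretraining, an encoder would take such corrupted sequences as input and be supervised to regress the underlying 3D joint positions, analogous to denoising-style masked pretraining.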
arXiv Detail & Related papers (2022-10-12T19:46:25Z) - Supervised learning for improving the accuracy of robot-mounted 3D
camera applied to human gait analysis [0.31171750528972203]
The use of 3D cameras for gait analysis has been highly questioned due to the low accuracy they have demonstrated in the past.
The 3D camera was mounted in a mobile robot to obtain a longer walking distance.
This study shows an improvement in detection of kinematic gait signals and gait descriptors by post-processing the raw estimations of the camera.
arXiv Detail & Related papers (2022-07-03T10:35:18Z) - AcinoSet: A 3D Pose Estimation Dataset and Baseline Models for Cheetahs
in the Wild [51.35013619649463]
We present an extensive dataset of free-running cheetahs in the wild, called AcinoSet.
The dataset contains 119,490 frames of multi-view synchronized high-speed video footage, camera calibration files and 7,588 human-annotated frames.
The resulting 3D trajectories, human-checked 3D ground truth, and an interactive tool to inspect the data are also provided.
arXiv Detail & Related papers (2021-03-24T15:54:11Z) - Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z) - Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z) - Kinematic 3D Object Detection in Monocular Video [123.7119180923524]
We propose a novel method for monocular video-based 3D object detection which carefully leverages kinematic motion to improve precision of 3D localization.
We achieve state-of-the-art performance on monocular 3D object detection and the Bird's Eye View tasks within the KITTI self-driving dataset.
arXiv Detail & Related papers (2020-07-19T01:15:12Z) - Synergetic Reconstruction from 2D Pose and 3D Motion for Wide-Space
Multi-Person Video Motion Capture in the Wild [3.0015034534260665]
We propose a markerless motion capture method with accuracy and smoothness from multiple cameras.
The proposed method predicts each person's 3D pose and determines the bounding boxes in the multi-camera images.
We evaluated the proposed method using various datasets and a real sports field.
arXiv Detail & Related papers (2020-01-16T02:14:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.