Back to the Future: Joint Aware Temporal Deep Learning 3D Human Pose
Estimation
- URL: http://arxiv.org/abs/2002.11251v1
- Date: Sat, 22 Feb 2020 10:11:13 GMT
- Title: Back to the Future: Joint Aware Temporal Deep Learning 3D Human Pose
Estimation
- Authors: Vikas Gupta
- Abstract summary: We propose a new deep learning network that introduces a deeper CNN channel filter and constraints as losses to reduce joint position and motion errors for 3D video human body pose estimation.
Our model outperforms the previous best result from the literature on mean per-joint position error, velocity error, and acceleration error.
Our contribution of increasing positional accuracy and motion smoothness in video can be integrated with future end-to-end networks without increasing network complexity.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new deep learning network that introduces a deeper CNN channel
filter and constraints as losses to reduce joint position and motion errors for
3D video human body pose estimation. Our model outperforms the previous best
result from the literature based on mean per-joint position error, velocity
error, and acceleration error on the Human3.6M benchmark, corresponding to a
new state-of-the-art mean error reduction in all protocols and motion metrics.
Mean per-joint error is reduced by 1%, velocity error by 7%, and acceleration error by
13% compared to the best results from the literature. Our contribution of
increasing positional accuracy and motion smoothness in video can be integrated
with future end-to-end networks without increasing network complexity. Our
model and code are available at https://vnmr.github.io/
Keywords: 3D, human, image, pose, action, detection, object, video, visual,
supervised, joint, kinematic
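The abstract reports three metrics: mean per-joint position error, velocity error, and acceleration error. The sketch below shows one common way such metrics are computed, assuming the velocity and acceleration errors are the same mean joint distance applied to first and second finite differences of the pose sequence. This is an illustration, not the paper's evaluation code: the function names (`mean_joint_error`, `motion_errors`), the per-frame units, and the toy data are all assumptions.

```python
import math

# A pose sequence is a list of frames; each frame is a list of (x, y, z) joints.
# All names here are hypothetical and chosen for illustration only.

def _diff(seq):
    """Finite difference along time: frame[t + 1] - frame[t] for every joint."""
    return [
        [tuple(b - a for a, b in zip(j0, j1)) for j0, j1 in zip(f0, f1)]
        for f0, f1 in zip(seq, seq[1:])
    ]

def mean_joint_error(pred, target):
    """Mean Euclidean distance over all frames and joints."""
    dists = [
        math.dist(p, t)
        for frame_p, frame_t in zip(pred, target)
        for p, t in zip(frame_p, frame_t)
    ]
    return sum(dists) / len(dists)

def motion_errors(pred, target):
    """Position, velocity, and acceleration errors (needs at least 3 frames)."""
    vel_p, vel_t = _diff(pred), _diff(target)
    return {
        "position": mean_joint_error(pred, target),
        "velocity": mean_joint_error(vel_p, vel_t),
        "acceleration": mean_joint_error(_diff(vel_p), _diff(vel_t)),
    }

# Toy data: one joint moving along x; the prediction jitters in z at frame 2.
target = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)]]
pred = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 1.0)], [(3.0, 0.0, 0.0)]]

errs = motion_errors(pred, target)
print(errs)
```

In this toy case a single jittered frame gives a position error of 0.25, while the velocity and acceleration errors come out larger (about 0.67 and 1.5 per frame), which suggests why motion metrics are sensitive to temporal smoothness.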
Related papers
- NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D
Human Pose and Shape Estimation [53.25973084799954]
We present NIKI (Neural Inverse Kinematics with Invertible Neural Network), which models bi-directional errors.
NIKI can learn from both the forward and inverse processes with invertible networks.
arXiv Detail & Related papers (2023-05-15T12:13:24Z)
- MotionBERT: A Unified Perspective on Learning Human Motion
Representations [46.67364057245364]
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.
We propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations.
We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network.
arXiv Detail & Related papers (2022-10-12T19:46:25Z)
- Learning Dynamical Human-Joint Affinity for 3D Pose Estimation in Videos [47.601288796052714]
Graph Convolution Networks (GCNs) have been successfully used for 3D human pose estimation in videos.
A new Dynamical Graph Network (DGNet) can estimate 3D pose by adaptively learning spatial/temporal joint relations from videos.
arXiv Detail & Related papers (2021-09-15T15:06:19Z)
- 3D Human Pose Regression using Graph Convolutional Network [68.8204255655161]
We propose a graph convolutional network named PoseGraphNet for 3D human pose regression from 2D poses.
Our model's performance is close to the state of the art, but with far fewer parameters.
arXiv Detail & Related papers (2021-05-21T14:41:31Z)
- We are More than Our Joints: Predicting how 3D Bodies Move [63.34072043909123]
We train a novel variational autoencoder that generates motions from latent frequencies.
Experiments show that our method produces state-of-the-art results and realistic 3D body animations.
arXiv Detail & Related papers (2020-12-01T16:41:04Z)
- Multi-Scale Networks for 3D Human Pose Estimation with Inference Stage
Optimization [33.02708860641971]
Estimating 3D human poses from a monocular video is still a challenging task.
The performance of many existing methods drops when the target person is occluded by other objects, or when the motion is too fast or too slow relative to the scale and speed of the training data.
We introduce a spatio-temporal network for robust 3D human pose estimation.
arXiv Detail & Related papers (2020-10-13T15:24:28Z)
- Coherent Reconstruction of Multiple Humans from a Single Image [68.3319089392548]
In this work, we address the problem of multi-person 3D pose estimation from a single image.
A typical regression approach in the top-down setting of this problem would first detect all humans and then reconstruct each one of them independently.
Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.
arXiv Detail & Related papers (2020-06-15T17:51:45Z)
- PoseNet3D: Learning Temporally Consistent 3D Human Pose via Knowledge
Distillation [6.023152721616894]
PoseNet3D takes 2D joints as input and outputs 3D skeletons and SMPL body model parameters.
We first train a teacher network that outputs 3D skeletons, using only 2D poses for training. The teacher network distills its knowledge to a student network that predicts 3D pose in SMPL representation.
Results on the Human3.6M dataset for 3D human pose estimation demonstrate that our approach reduces the 3D joint prediction error by 18% compared to previous unsupervised methods.
arXiv Detail & Related papers (2020-03-07T00:10:59Z)
- Anatomy-aware 3D Human Pose Estimation with Bone-based Pose
Decomposition [92.99291528676021]
Instead of directly regressing the 3D joint locations, we decompose the task into bone direction prediction and bone length prediction.
Our motivation is the fact that the bone lengths of a human skeleton remain consistent across time.
Our full model outperforms the previous best results on Human3.6M and MPI-INF-3DHP datasets.
arXiv Detail & Related papers (2020-02-24T15:49:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.