Decomposed Human Motion Prior for Video Pose Estimation via Adversarial
Training
- URL: http://arxiv.org/abs/2305.18743v3
- Date: Sun, 24 Sep 2023 08:21:55 GMT
- Authors: Wenshuo Chen, Xiang Zhou, Zhengdi Yu, Weixi Gu and Kai Zhang
- Abstract summary: We propose to decompose the holistic motion prior into joint motion priors, making it easier for neural networks to learn from prior knowledge.
We also utilize a novel regularization loss to balance accuracy and smoothness introduced by the motion prior.
Our method achieves 9% lower PA-MPJPE and 29% lower acceleration error than previous methods tested on 3DPW.
- Score: 7.861513525154702
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating human pose from video is a task that receives considerable
attention due to its applicability in numerous 3D fields. The complexity of
prior knowledge of human body movements poses a challenge to neural network
models in the task of regressing keypoints. In this paper, we address this
problem by incorporating a motion prior in an adversarial way. Unlike
previous methods, we propose to decompose the holistic motion prior into joint
motion priors, making it easier for neural networks to learn from prior
knowledge and thereby boosting performance on the task. We also utilize a novel
regularization loss to balance accuracy and smoothness introduced by the motion
prior. Our method achieves 9% lower PA-MPJPE and 29% lower acceleration error
than previous methods tested on 3DPW. The estimator proves its robustness by
achieving impressive performance on an in-the-wild dataset.
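The acceleration error quoted in the abstract is commonly computed as the mean distance between predicted and ground-truth joint accelerations, where acceleration is the second finite difference of joint positions over consecutive frames. The following is a minimal sketch of that common definition, not the authors' evaluation code; the function names and the nested-list input layout (`seq[t][j]` is an `(x, y, z)` tuple) are assumptions made for illustration.

```python
import math


def acceleration(seq):
    """Second finite difference of joint positions.

    seq[t][j] is an (x, y, z) tuple for joint j at frame t (assumed layout).
    Returns one entry per interior frame, so the result has len(seq) - 2 frames.
    """
    return [
        [tuple(a - 2 * b + c for a, b, c in zip(p0, p1, p2))
         for p0, p1, p2 in zip(f0, f1, f2)]
        for f0, f1, f2 in zip(seq, seq[1:], seq[2:])
    ]


def acceleration_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth accelerations."""
    if len(pred) < 3 or len(gt) < 3:
        raise ValueError("need at least 3 frames to estimate acceleration")
    norms = [
        math.dist(ap, ag)
        for fp, fg in zip(acceleration(pred), acceleration(gt))
        for ap, ag in zip(fp, fg)
    ]
    return sum(norms) / len(norms)


# Toy example: one joint, four frames.
gt = [[(float(t), 0.0, 0.0)] for t in range(4)]        # constant velocity
pred = [[(float(t * t), 0.0, 0.0)] for t in range(4)]  # constant acceleration
print(acceleration_error(pred, gt))
```

The result is in position units per frame squared; reported benchmark numbers usually apply a fixed unit and frame-rate convention on top of this.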
Related papers
- COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation [98.05046790227561]
COIN is a control-inpainting motion diffusion prior that enables fine-grained control to disentangle human and camera motions.
COIN outperforms the state-of-the-art methods in terms of global human motion estimation and camera motion estimation.
arXiv Detail & Related papers (2024-08-29T10:36:29Z)
- Past Movements-Guided Motion Representation Learning for Human Motion Prediction [0.0]
We propose a self-supervised learning framework designed to enhance motion representation.
The framework consists of two stages: first, the network is pretrained through the self-reconstruction of past sequences, and the guided reconstruction of future sequences based on past movements.
Our method reduces the average prediction errors by 8.8% across Human3.6, 3DPW, and AMASS datasets.
arXiv Detail & Related papers (2024-08-04T17:00:37Z)
- HMP: Hand Motion Priors for Pose and Shape Estimation from Video [52.39020275278984]
We develop a generative motion prior specific for hands, trained on the AMASS dataset which features diverse and high-quality hand motions.
Our integration of a robust motion prior significantly enhances performance, especially in occluded scenarios.
We demonstrate our method's efficacy via qualitative and quantitative evaluations on the HO3D and DexYCB datasets.
arXiv Detail & Related papers (2023-12-27T22:35:33Z)
- Deep learning-based approaches for human motion decoding in smart walkers for rehabilitation [3.8791511769387634]
Smart walkers should be able to decode human motion and needs as early as possible.
Current walkers decode motion intention using information from wearable or embedded sensors.
A contactless approach is proposed, addressing human motion decoding as an early action recognition/detection problem.
arXiv Detail & Related papers (2023-01-13T14:29:44Z)
- Koopman pose predictions for temporally consistent human walking estimations [11.016730029019522]
We introduce a new factor graph factor based on Koopman theory that embeds the nonlinear dynamics of lower-limb movement activities.
We show that our approach reduces outliers on the skeleton form by almost 1 m, while preserving natural walking trajectories at depths up to more than 10 m.
arXiv Detail & Related papers (2022-05-05T16:16:06Z)
- Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study on various pose representations with a focus on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms the state-of-the-art methods in short-term prediction and achieves much enhanced long-term prediction proficiency.
arXiv Detail & Related papers (2021-12-30T10:45:22Z)
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space.
We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
arXiv Detail & Related papers (2021-04-16T15:16:09Z)
- Adversarial Refinement Network for Human Motion Prediction [61.50462663314644]
Two popular methods, recurrent neural networks and feed-forward deep networks, are able to predict rough motion trends.
We propose an Adversarial Refinement Network (ARNet) following a simple yet effective coarse-to-fine mechanism with novel adversarial error augmentation.
arXiv Detail & Related papers (2020-11-23T05:42:20Z)
- Human Motion Transfer from Poses in the Wild [61.6016458288803]
We tackle the problem of human motion transfer, where we synthesize novel motion video for a target person that imitates the movement from a reference video.
It is a video-to-video translation task in which the estimated poses are used to bridge two domains.
We introduce a novel pose-to-video translation framework for generating high-quality videos that are temporally coherent even for in-the-wild pose sequences unseen during training.
arXiv Detail & Related papers (2020-04-07T05:59:53Z)
- Back to the Future: Joint Aware Temporal Deep Learning 3D Human Pose Estimation [0.0]
We propose a new deep learning network that introduces a deeper CNN channel filter and constraints as losses to reduce joint position and motion errors for 3D video human body pose estimation.
Our model outperforms the previous best result from the literature based on mean per-joint position error, velocity error, and acceleration errors.
Our contribution, which increases positional accuracy and motion smoothness in video, can be integrated with future end-to-end networks without increasing network complexity.
arXiv Detail & Related papers (2020-02-22T10:11:13Z)