Optimising 2D Pose Representation: Improve Accuracy, Stability and
Generalisability Within Unsupervised 2D-3D Human Pose Estimation
- URL: http://arxiv.org/abs/2209.00618v1
- Date: Thu, 1 Sep 2022 17:32:52 GMT
- Title: Optimising 2D Pose Representation: Improve Accuracy, Stability and
Generalisability Within Unsupervised 2D-3D Human Pose Estimation
- Authors: Peter Hardy, Srinandan Dasmahapatra, Hansung Kim
- Abstract summary: We show that the optimal representation of a 2D pose is two independent segments, the torso and legs, with no features shared between the two lifting networks.
- Score: 7.294965109944706
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the problem of 2D pose representation during
unsupervised 2D to 3D pose lifting to improve the accuracy, stability and
generalisability of 3D human pose estimation (HPE) models. All unsupervised
2D-3D HPE approaches provide the entire 2D kinematic skeleton to a model during
training. We argue that this is sub-optimal and disruptive as long-range
correlations are induced between independent 2D key points and predicted 3D
ordinates during training. To this end, we conduct the following study. With a
maximum architecture capacity of 6 residual blocks, we evaluate the performance
of 5 models which each represent a 2D pose differently during the adversarial
unsupervised 2D-3D HPE process. Additionally, we show the correlations between
2D key points which are learned during the training process, highlighting the
unintuitive correlations induced when an entire 2D pose is provided to a
lifting model. Our results show that the optimal representation of a 2D
pose is two independent segments, the torso and legs, with no shared
features between each lifting network. This approach decreased the average
error by 20% on the Human3.6M dataset when compared to a model with a
near-identical parameter count trained on the entire 2D kinematic skeleton.
Furthermore, due to the complex nature of adversarial learning, we show how
this representation can also improve convergence during training, allowing an
optimal result to be obtained more often.
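To make the proposed representation concrete, below is a minimal sketch of two independent lifting networks, one for the torso keypoints and one for the leg keypoints, with no shared features between them. This is an illustration, not the authors' released code: the PyTorch layers, the 16-joint torso/leg index split, the network width, and the choice of three residual blocks per lifter are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): two independent 2D-3D lifting
# networks, one for torso keypoints and one for leg keypoints, with no
# shared features or parameters between them. The 16-joint index split and
# layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

# Hypothetical split of a 16-joint 2D skeleton into torso and leg keypoints
# (joint 0 is the root hip, kept in both segments for anchoring).
TORSO_IDX = [0, 7, 8, 9, 10, 11, 12, 13, 14, 15]   # root, spine, head, arms
LEG_IDX   = [0, 1, 2, 3, 4, 5, 6]                  # root, hips, knees, ankles


class ResidualBlock(nn.Module):
    """Fully connected residual block, as commonly used in 2D-3D lifting models."""
    def __init__(self, width: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(width, width), nn.BatchNorm1d(width), nn.ReLU(),
            nn.Linear(width, width), nn.BatchNorm1d(width), nn.ReLU(),
        )

    def forward(self, x):
        return x + self.net(x)


class SegmentLifter(nn.Module):
    """Lifts one 2D segment (torso or legs) to per-joint depth estimates."""
    def __init__(self, n_joints: int, width: int = 1024, n_blocks: int = 3):
        super().__init__()
        self.inp = nn.Linear(2 * n_joints, width)
        self.blocks = nn.Sequential(*[ResidualBlock(width) for _ in range(n_blocks)])
        self.out = nn.Linear(width, n_joints)  # one depth ("3D ordinate") per joint

    def forward(self, kp_2d):                  # kp_2d: (batch, n_joints, 2)
        x = self.inp(kp_2d.flatten(1))
        return self.out(self.blocks(x))        # (batch, n_joints) depths


# Two independent lifters: they share no parameters, so gradients computed
# for one segment never update the other.
torso_lifter = SegmentLifter(n_joints=len(TORSO_IDX))
leg_lifter = SegmentLifter(n_joints=len(LEG_IDX))

pose_2d = torch.randn(8, 16, 2)                # dummy batch of 2D poses
torso_depth = torso_lifter(pose_2d[:, TORSO_IDX])
leg_depth = leg_lifter(pose_2d[:, LEG_IDX])
```

Because the two lifters receive disjoint inputs (apart from the root joint) and share no parameters, no features are learned across the torso-leg boundary, which reflects the paper's argument that long-range correlations between independent 2D keypoints should not be induced during training.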
Related papers
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z) - LInKs "Lifting Independent Keypoints" -- Partial Pose Lifting for
Occlusion Handling with Improved Accuracy in 2D-3D Human Pose Estimation [4.648549457266638]
We present LInKs, a novel unsupervised learning method to recover 3D human poses from 2D kinematic skeletons.
Our approach follows a unique two-step process, first lifting the occluded 2D pose to the 3D domain and then filling in the missing keypoints in 3D.
This lift-then-fill approach leads to significantly more accurate results compared to models that complete the pose in 2D space alone.
arXiv Detail & Related papers (2023-09-13T18:28:04Z) - "Teaching Independent Parts Separately"(TIPSy-GAN) : Improving Accuracy
and Stability in Unsupervised Adversarial 2D to 3D Human Pose Estimation [7.294965109944706]
We present TIPSy-GAN, a new approach to improve the accuracy and stability in unsupervised adversarial 2D to 3D human pose estimation.
In our work we demonstrate that the human kinematic skeleton should not be treated as a single spatially codependent structure.
arXiv Detail & Related papers (2022-05-12T09:40:25Z) - PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and
Hallucination under Self-supervision [102.48681650013698]
Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervision to guide the learning.
We propose a novel self-supervised approach that allows us to explicitly generate 2D-3D pose pairs for augmenting supervision.
This is made possible via introducing a reinforcement-learning-based imitator, which is learned jointly with a pose estimator alongside a pose hallucinator.
arXiv Detail & Related papers (2022-03-29T14:45:53Z) - PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientation of the limbs by taking advantage of local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z) - Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z) - View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose [36.384824115033304]
We propose an approach to learning a compact view-invariant embedding space from 2D body joint keypoints, without explicitly predicting 3D poses.
Experimental results show that our embedding model achieves higher accuracy when retrieving similar poses across different camera views.
arXiv Detail & Related papers (2020-10-23T17:58:35Z) - Multi-Scale Networks for 3D Human Pose Estimation with Inference Stage
Optimization [33.02708860641971]
Estimating 3D human poses from a monocular video is still a challenging task.
Many existing methods degrade when the target person is occluded by other objects, or when the motion is too fast or slow relative to the scale and speed of the training data.
We introduce a spatio-temporal network for robust 3D human pose estimation.
arXiv Detail & Related papers (2020-10-13T15:24:28Z) - Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image
Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z) - Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point
Problem [98.92148855291363]
This paper proposes a deep CNN model which simultaneously solves for both the 6-DoF absolute camera pose and 2D-3D correspondences.
Tests on both real and simulated data have shown that our method substantially outperforms existing approaches.
arXiv Detail & Related papers (2020-03-15T04:17:30Z)