Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild
- URL: http://arxiv.org/abs/2003.07581v1
- Date: Tue, 17 Mar 2020 08:47:16 GMT
- Title: Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild
- Authors: Umar Iqbal and Pavlo Molchanov and Jan Kautz
- Abstract summary: We propose a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data.
We evaluate our proposed approach on two large-scale datasets.
- Score: 101.70320427145388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One major challenge for monocular 3D human pose estimation in the wild is the acquisition of training data that contains unconstrained images annotated with accurate 3D poses. In this paper, we address this challenge by proposing a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data, which can be acquired easily in in-the-wild environments. We propose a novel end-to-end learning framework that enables weakly-supervised training using multi-view consistency. Since multi-view consistency is prone to degenerate solutions, we adopt a 2.5D pose representation and propose a novel objective function that can only be minimized when the predictions of the trained model are consistent and plausible across all camera views. We evaluate our proposed approach on two large-scale datasets (Human3.6M and MPI-INF-3DHP), where it achieves state-of-the-art performance among semi- and weakly-supervised methods.
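As a rough illustration of how such an objective can be structured, the sketch below lifts a 2.5D prediction (2D keypoints plus root-relative depths) to camera-space 3D in each view and penalizes cross-view disagreement after a rigid (Procrustes) alignment. This is a minimal sketch under simplifying assumptions, not the paper's implementation: it assumes pinhole cameras with known intrinsics, all function names are hypothetical, and the paper's full objective additionally enforces 2D accuracy and plausibility terms.

```python
# Minimal sketch (not the authors' code) of a multi-view consistency loss on a
# 2.5D pose representation. Assumptions: pinhole cameras with known intrinsics,
# per-view predictions of 2D keypoints and root-relative depths, and a
# Procrustes alignment standing in for the unknown relative camera pose.
import torch

def lift_25d(kpts_2d, rel_depth, root_depth, K):
    """Back-project 2.5D predictions to camera-space 3D joints.

    kpts_2d: (J, 2) pixel coordinates, rel_depth: (J,) root-relative depths,
    root_depth: scalar depth of the root joint, K: (3, 3) camera intrinsics.
    """
    z = rel_depth + root_depth                        # absolute depth per joint
    x = (kpts_2d[:, 0] - K[0, 2]) * z / K[0, 0]
    y = (kpts_2d[:, 1] - K[1, 2]) * z / K[1, 1]
    return torch.stack([x, y, z], dim=-1)             # (J, 3)

def rigid_align(src, dst):
    """Rotate/translate src onto dst with a differentiable Procrustes fit."""
    src_c = src - src.mean(0, keepdim=True)
    dst_c = dst - dst.mean(0, keepdim=True)
    U, _, Vh = torch.linalg.svd(dst_c.T @ src_c)
    d = torch.sign(torch.det(U @ Vh))                 # avoid reflections
    D = torch.diag(torch.stack([torch.ones_like(d), torch.ones_like(d), d]))
    R = U @ D @ Vh
    return src_c @ R.T + dst.mean(0, keepdim=True)

def multiview_consistency_loss(pose_a, pose_b):
    """Penalize disagreement between 3D poses lifted from two views."""
    aligned = rigid_align(pose_a, pose_b)
    return torch.linalg.norm(aligned - pose_b, dim=-1).mean()
```

In practice such a loss would be summed over all view pairs and combined with a 2D keypoint loss; note that a consistency term alone can be minimized by collapsed poses, which is exactly the kind of degeneracy the paper's 2.5D formulation and objective are designed to rule out.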
Related papers
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- Weakly Supervised 3D Multi-person Pose Estimation for Large-scale Scenes based on Monocular Camera and Single LiDAR [41.39277657279448]
We propose a method based on a monocular camera and a single LiDAR for 3D multi-person pose estimation in large-scale scenes.
Specifically, we design an effective fusion strategy to take advantage of multi-modal input data, including images and point clouds.
Our method exploits the inherent geometric constraints of the point cloud for self-supervision and uses 2D keypoints on images for weak supervision.
arXiv Detail & Related papers (2022-11-30T12:50:40Z)
- On Triangulation as a Form of Self-Supervision for 3D Human Pose Estimation [57.766049538913926]
Supervised approaches to 3D pose estimation from single images are remarkably effective when labeled data is abundant.
Much of the recent attention has shifted towards semi- and weakly-supervised learning.
We propose to impose multi-view geometric constraints by means of a differentiable triangulation and to use it as a form of self-supervision during training when no labels are available (a minimal triangulation sketch follows this list).
arXiv Detail & Related papers (2022-03-29T19:11:54Z)
- CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild [31.334715988245748]
We propose a self-supervised approach that learns a single-image 3D pose estimator from unlabeled multi-view data.
In contrast to most existing methods, we do not require calibrated cameras and can therefore learn from moving cameras.
Key to the success are new, unbiased reconstruction objectives that mix information across views and training samples.
arXiv Detail & Related papers (2020-11-30T10:42:27Z)
- Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation [52.94078950641959]
We present a deployment-friendly, fast bottom-up framework for multi-person 3D human pose estimation.
We adopt a novel neural representation of multi-person 3D pose which unifies the position of person instances with their corresponding 3D pose representation.
We propose a practical deployment paradigm where paired 2D or 3D pose annotations are unavailable.
arXiv Detail & Related papers (2020-08-04T07:54:25Z)
- Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
The generalizability of human pose estimation models developed with supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervision.
Our proposed model employs three consecutive differentiable transformations: forward kinematics, camera projection, and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)
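For the triangulation-based self-supervision mentioned in the list above ("On Triangulation as a Form of Self-Supervision for 3D Human Pose Estimation"), a common building block is a differentiable direct linear transform (DLT). The sketch below is an illustrative rendition under assumed calibrated cameras, not that paper's code; function names and tensor shapes are assumptions.

```python
# Minimal sketch (not the authors' code) of differentiable DLT triangulation
# used as self-supervision: 2D keypoints predicted in several calibrated views
# are triangulated to a 3D point, which is re-projected and compared with the
# per-view predictions.
import torch

def triangulate_dlt(points_2d, proj_mats):
    """points_2d: (V, 2) pixel coords in V views; proj_mats: (V, 3, 4)."""
    rows = []
    for (u, v), P in zip(points_2d, proj_mats):
        rows.append(u * P[2] - P[0])   # each view contributes two linear
        rows.append(v * P[2] - P[1])   # constraints on the homogeneous point
    A = torch.stack(rows)              # (2V, 4)
    _, _, Vh = torch.linalg.svd(A)
    X = Vh[-1]                         # null-space vector = homogeneous point
    return X[:3] / X[3]

def reprojection_self_supervision(points_2d, proj_mats):
    """Consistency loss: predicted 2D joints vs. re-projected triangulation."""
    X = triangulate_dlt(points_2d, proj_mats)
    X_h = torch.cat([X, torch.ones(1)])
    loss = 0.0
    for (u, v), P in zip(points_2d, proj_mats):
        proj = P @ X_h
        loss = loss + torch.sum((proj[:2] / proj[2] - torch.stack([u, v])) ** 2)
    return loss / len(points_2d)
```

Whether gradients flow through the triangulated point or it is detached and treated as a pseudo-label is a design choice the sketch leaves open.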
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including any information above) and is not responsible for any consequences of its use.