WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation
- URL: http://arxiv.org/abs/2501.02771v2
- Date: Mon, 20 Jan 2025 06:41:09 GMT
- Title: WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation
- Authors: Tianjian Jiang, Johsan Billingham, Sebastian Müksch, Juan Zarate, Nicolas Evans, Martin R. Oswald, Marc Pollefeys, Otmar Hilliges, Manuel Kaufmann, Jie Song
- Abstract summary: WorldPose is a novel dataset for advancing research in multi-person global pose estimation in the wild.
We exploit the static multi-view setup of HD cameras to recover the 3D player poses and motions with unprecedented accuracy.
The resulting dataset comprises more than 80 sequences with approx 2.5 million 3D poses and a total traveling distance of over 120 km.
- Score: 67.28831601491447
- License:
- Abstract: We present WorldPose, a novel dataset for advancing research in multi-person global pose estimation in the wild, featuring footage from the 2022 FIFA World Cup. While previous datasets have primarily focused on local poses, often limited to a single person or in constrained, indoor settings, the infrastructure deployed for this sporting event allows access to multiple fixed and moving cameras in different stadiums. We exploit the static multi-view setup of HD cameras to recover the 3D player poses and motions with unprecedented accuracy given capture areas of more than 1.75 acres. We then leverage the captured players' motions and field markings to calibrate a moving broadcasting camera. The resulting dataset comprises more than 80 sequences with approx 2.5 million 3D poses and a total traveling distance of over 120 km. Subsequently, we conduct an in-depth analysis of the SOTA methods for global pose estimation. Our experiments demonstrate that WorldPose challenges existing multi-person techniques, supporting the potential for new research in this area and others, such as sports analysis. All pose annotations (in SMPL format), broadcasting camera parameters and footage will be released for academic research purposes.
Related papers
- SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap [102.5232204867158]
We formalize the task of Game State Reconstruction and introduce SoccerNet-GSR, a novel Game State Reconstruction dataset focusing on football videos.
SoccerNet-GSR is composed of 200 video sequences of 30 seconds, annotated with 9.37 million line points for pitch localization and camera calibration.
Our experiments show that GSR is a challenging novel task, which opens the field for future research.
arXiv Detail & Related papers (2024-04-17T12:53:45Z)
- EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild [31.787149079366877]
We present EMDB, the Electromagnetic Database of Global 3D Human Pose and Shape in the Wild.
EMDB contains high-quality 3D SMPL pose and shape parameters with global body and camera trajectories for in-the-wild videos.
We use body-worn, wireless electromagnetic (EM) sensors and a hand-held iPhone to record 58 minutes of motion data.
arXiv Detail & Related papers (2023-08-31T17:56:19Z)
- Monocular 3D Human Pose Estimation for Sports Broadcasts using Partial Sports Field Registration [0.0]
We combine advances in 2D human pose estimation and camera calibration via partial sports field registration to demonstrate an avenue for collecting valid large-scale kinematic datasets.
We generate a synthetic dataset of more than 10k images in Unreal Engine 5 with different viewpoints, running styles, and body types.
arXiv Detail & Related papers (2023-04-10T07:41:44Z)
- SLOPER4D: A Scene-Aware Dataset for Global 4D Human Pose Estimation in Urban Environments [0.0]
We present SLOPER4D, a novel scene-aware dataset collected in large urban environments.
We record 12 human subjects' activities over 10 diverse urban scenes from an egocentric view.
SLOPER4D consists of 15 sequences of human motions, each of which has a trajectory length of more than 200 meters.
arXiv Detail & Related papers (2023-03-16T05:54:15Z)
- Scene-Aware 3D Multi-Human Motion Capture from a Single Camera [83.06768487435818]
We consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera.
We leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks.
In particular, we estimate the scene depth and unique person scale from normalized disparity predictions using the 2D body joints and joint angles.
arXiv Detail & Related papers (2023-01-12T18:01:28Z)
- SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos [62.686484228479095]
We propose a novel dataset for multiple object tracking composed of 200 sequences of 30s each.
The dataset is fully annotated with bounding boxes and tracklet IDs.
Our analysis shows that multiple player, referee and ball tracking in soccer videos is far from being solved.
arXiv Detail & Related papers (2022-04-14T12:22:12Z)
- MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision [72.5863451123577]
We show how to train a neural model that can perform accurate 3D pose and camera estimation.
Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z)
- CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild [31.334715988245748]
We propose a self-supervised approach that learns a single image 3D pose estimator from unlabeled multi-view data.
In contrast to most existing methods, we do not require calibrated cameras and can therefore learn from moving cameras.
Key to the success are new, unbiased reconstruction objectives that mix information across views and training samples.
arXiv Detail & Related papers (2020-11-30T10:42:27Z)
- Towards Generalization of 3D Human Pose Estimation In The Wild [73.19542580408971]
3DBodyTex.Pose is a dataset that addresses the task of 3D human pose estimation in-the-wild.
3DBodyTex.Pose offers high quality and rich data containing 405 different real subjects in various clothing and poses, and 81k image samples with ground-truth 2D and 3D pose annotations.
arXiv Detail & Related papers (2020-04-21T13:31:58Z)
- Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS [13.191601826570786]
We present a novel solution for multi-human 3D pose estimation from multiple calibrated camera views.
It takes 2D poses in different camera coordinates as inputs and aims for the accurate 3D poses in the global coordinate.
We propose a new large-scale multi-human dataset with 12 to 28 camera views.
arXiv Detail & Related papers (2020-03-09T08:54:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.