EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in
the Wild
- URL: http://arxiv.org/abs/2308.16894v1
- Date: Thu, 31 Aug 2023 17:56:19 GMT
- Title: EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in
the Wild
- Authors: Manuel Kaufmann, Jie Song, Chen Guo, Kaiyue Shen, Tianjian Jiang,
Chengcheng Tang, Juan Zarate, Otmar Hilliges
- Abstract summary: We present EMDB, the Electromagnetic Database of Global 3D Human Pose and Shape in the Wild.
EMDB contains high-quality 3D SMPL pose and shape parameters with global body and camera trajectories for in-the-wild videos.
We use body-worn, wireless electromagnetic (EM) sensors and a hand-held iPhone to record 58 minutes of motion data.
- Score: 31.787149079366877
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present EMDB, the Electromagnetic Database of Global 3D Human Pose and
Shape in the Wild. EMDB is a novel dataset that contains high-quality 3D SMPL
pose and shape parameters with global body and camera trajectories for
in-the-wild videos. We use body-worn, wireless electromagnetic (EM) sensors and
a hand-held iPhone to record a total of 58 minutes of motion data, distributed
over 81 indoor and outdoor sequences and 10 participants. Together with
accurate body poses and shapes, we also provide global camera poses and body
root trajectories. To construct EMDB, we propose a multi-stage optimization
procedure, which first fits SMPL to the 6-DoF EM measurements and then refines
the poses via image observations. To achieve high-quality results, we leverage
a neural implicit avatar model to reconstruct detailed human surface geometry
and appearance, which allows for improved alignment and smoothness via a dense
pixel-level objective. Our evaluations, conducted with a multi-view volumetric
capture system, indicate that EMDB has an expected accuracy of 2.3 cm
positional and 10.6 degrees angular error, surpassing the accuracy of previous
in-the-wild datasets. We evaluate existing state-of-the-art monocular RGB
methods for camera-relative and global pose estimation on EMDB. EMDB is
publicly available under https://ait.ethz.ch/emdb
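The multi-stage optimization described in the abstract can be summarized as: first fit SMPL pose, shape, and root trajectory to the 6-DoF EM sensor measurements, then refine the poses against image observations using a dense, avatar-based objective. Below is a minimal sketch of that two-stage structure in PyTorch; the functions `smpl_to_sensors` and `project_joints`, the sensor count, and the sparse 2D-keypoint term (standing in for the paper's dense pixel-level objective) are illustrative placeholders, not the authors' implementation.

```python
# Minimal, illustrative sketch of a two-stage fitting loop in the spirit of
# the pipeline described above: (1) fit SMPL parameters to 6-DoF EM sensor
# measurements, (2) refine the result against image evidence. This is NOT the
# authors' code; all mappings below are placeholders.
import torch

T, J, S = 100, 24, 12          # frames, SMPL joints, EM sensors (assumed counts)

def smpl_to_sensors(poses, betas, trans):
    """Placeholder: map SMPL parameters to S predicted 6-DoF sensor readings.
    A real pipeline would run the SMPL body model and read off the poses of
    the body segments that carry the EM sensors."""
    feat = torch.cat([poses, trans, betas.expand(T, -1)], dim=-1)  # (T, 3J+3+10)
    W = torch.full((feat.shape[-1], S * 6), 0.01)                  # dummy linear map
    return (feat @ W).reshape(T, S, 6)

def project_joints(poses, trans):
    """Placeholder pinhole-style projection of per-frame 'joints' to 2D."""
    pts = trans.unsqueeze(1) + poses.reshape(T, J, 3)
    return pts[..., :2] / (pts[..., 2:3].abs() + 100.0)

em_meas = torch.randn(T, S, 6)       # observed EM measurements (synthetic here)
kpts_2d = torch.randn(T, J, 2)       # image-based 2D evidence (synthetic here)

poses = torch.zeros(T, 3 * J, requires_grad=True)   # per-joint axis-angle
betas = torch.zeros(10, requires_grad=True)          # body shape
trans = torch.zeros(T, 3, requires_grad=True)        # global root translation

# Stage 1: fit SMPL to the 6-DoF EM measurements, with a temporal smoothness term.
opt = torch.optim.Adam([poses, betas, trans], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = ((smpl_to_sensors(poses, betas, trans) - em_meas) ** 2).mean()
    loss = loss + ((poses[1:] - poses[:-1]) ** 2).mean()
    loss.backward()
    opt.step()

# Stage 2: refine against image observations. The paper uses a dense,
# pixel-level objective driven by a neural implicit avatar; a sparse 2D
# reprojection term is used here purely to show the structure of this stage.
opt = torch.optim.Adam([poses, trans], lr=5e-3)
for _ in range(200):
    opt.zero_grad()
    loss = ((project_joints(poses, trans) - kpts_2d) ** 2).mean()
    loss.backward()
    opt.step()
```

In the actual pipeline, the placeholders would be replaced by the SMPL body model, the calibrated sensor-to-segment mapping, and the dense rendering loss provided by the neural implicit avatar.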
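The reported accuracy (2.3 cm positional, 10.6 degrees angular) refers to a positional error in centimeters and a rotational error in degrees. The snippet below shows one common way such quantities are computed, as a mean per-joint position error and a mean geodesic distance between rotation matrices; it is a generic illustration of these metric types, not the evaluation protocol used for EMDB.

```python
# Generic metric sketch: mean per-joint position error (cm) and mean geodesic
# angular error (degrees) between predicted and reference rotations. These are
# common metric definitions, not necessarily the exact EMDB protocol.
import numpy as np

def positional_error_cm(pred_joints_m, gt_joints_m):
    """Mean Euclidean distance over all joints/frames, meters -> centimeters."""
    return 100.0 * np.linalg.norm(pred_joints_m - gt_joints_m, axis=-1).mean()

def angular_error_deg(pred_R, gt_R):
    """Mean geodesic distance between rotation matrices, in degrees.
    pred_R, gt_R: arrays of shape (..., 3, 3)."""
    rel = np.einsum("...ij,...kj->...ik", pred_R, gt_R)   # pred @ gt^T
    trace = np.trace(rel, axis1=-2, axis2=-1)
    cos = np.clip((trace - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

# Tiny usage example with synthetic data (shapes: frames x joints x ...).
pred_j = np.random.randn(10, 24, 3)
gt_j = pred_j + 0.01 * np.random.randn(10, 24, 3)
print(positional_error_cm(pred_j, gt_j))

eye = np.tile(np.eye(3), (10, 24, 1, 1))
print(angular_error_deg(eye, eye))   # 0.0 for identical rotations
```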
Related papers
- Reconstructing People, Places, and Cameras [57.81696692335401]
"Humans and Structure from Motion" (HSfM) is a method for jointly reconstructing multiple human meshes, scene point clouds, and camera parameters in a metric world coordinate system.
Our results show that incorporating human data into the SfM pipeline improves camera pose estimation.
arXiv Detail & Related papers (2024-12-23T18:58:34Z)
- CameraHMR: Aligning People with Perspective [54.05758012879385]
We address the challenge of accurate 3D human pose and shape estimation from monocular images.
Existing training datasets containing real images with pseudo ground truth (pGT) use SMPLify to fit SMPL to sparse 2D joint locations (a generic sketch of this fitting style follows the related-papers list).
We make two contributions that improve pGT accuracy.
arXiv Detail & Related papers (2024-11-12T19:12:12Z)
- DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z)
- Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot [22.848563931757962]
We present Multi-HMR, a strong single-shot model for multi-person 3D human mesh recovery from a single RGB image.
Predictions encompass the whole body, including hands and facial expressions, using the SMPL-X parametric model.
We show that incorporating it into the training data further enhances predictions, particularly for hands.
arXiv Detail & Related papers (2024-02-22T16:05:13Z)
- PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction [77.89935657608926]
We propose a Pose-Free Large Reconstruction Model (PF-LRM) for reconstructing a 3D object from a few unposed images.
PF-LRM reconstructs the object and estimates the relative camera poses simultaneously, in about 1.3 seconds on a single A100 GPU.
arXiv Detail & Related papers (2023-11-20T18:57:55Z)
- Human Pose Estimation in Monocular Omnidirectional Top-View Images [3.07869141026886]
We propose a new dataset for training and evaluation of CNNs for the task of keypoint detection in omnidirectional images.
The training dataset, THEODORE+, consists of 50,000 images and is created by a 3D rendering engine.
For evaluation purposes, the real-world PoseFES dataset with two scenarios and 701 frames with up to eight persons per scene was captured and annotated.
arXiv Detail & Related papers (2023-04-17T11:52:04Z)
- Scene-Aware 3D Multi-Human Motion Capture from a Single Camera [83.06768487435818]
We consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera.
We leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks.
In particular, we estimate the scene depth and unique person scale from normalized disparity predictions using the 2D body joints and joint angles.
arXiv Detail & Related papers (2023-01-12T18:01:28Z)
- Camera Motion Agnostic 3D Human Pose Estimation [8.090223360924004]
This paper presents a camera motion agnostic approach for predicting 3D human pose and mesh defined in the world coordinate system.
We propose a network based on bidirectional gated recurrent units (GRUs) that predicts the global motion sequence from the local pose sequence.
We use 3DPW and synthetic datasets, which are constructed in a moving-camera environment, for evaluation.
arXiv Detail & Related papers (2021-12-01T08:22:50Z)
- Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution [34.301501457959056]
We propose a temporal regression network with a gated convolution module to transform 2D joints to 3D.
A simple yet effective localization approach is also conducted to transform the normalized pose to the global trajectory.
Our proposed method outperforms most state-of-the-art 2D-to-3D pose estimation methods.
arXiv Detail & Related papers (2020-10-31T04:35:24Z)
- Towards Generalization of 3D Human Pose Estimation In The Wild [73.19542580408971]
3DBodyTex.Pose is a dataset that addresses the task of 3D human pose estimation in-the-wild.
3DBodyTex.Pose offers high quality and rich data containing 405 different real subjects in various clothing and poses, and 81k image samples with ground-truth 2D and 3D pose annotations.
arXiv Detail & Related papers (2020-04-21T13:31:58Z)
- Synergetic Reconstruction from 2D Pose and 3D Motion for Wide-Space Multi-Person Video Motion Capture in the Wild [3.0015034534260665]
We propose a markerless motion capture method that achieves accurate and smooth results from multiple cameras.
The proposed method predicts each person's 3D pose and determines the bounding boxes in the multi-camera images.
We evaluated the proposed method using various datasets and a real sports field.
arXiv Detail & Related papers (2020-01-16T02:14:59Z)
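As noted in the CameraHMR entry above, pseudo ground truth (pGT) for real images is commonly produced by SMPLify-style fitting: SMPL pose and shape are optimized so that the projected model joints match detected 2D keypoints, subject to regularization priors. The following sketch shows the structure of such a fit; `smpl_joints`, the weak-perspective projection, and the prior weights are simplified placeholders, not the actual SMPLify or CameraHMR implementation.

```python
# SMPLify-style pseudo-ground-truth fitting sketch: optimize SMPL pose/shape so
# that projected model joints match detected 2D keypoints, with simple priors.
# `smpl_joints` and the weak-perspective projection are illustrative stand-ins.
import torch

J = 24  # SMPL joint count

def smpl_joints(pose, betas):
    """Placeholder for the SMPL joint regressor: returns (J, 3) joints."""
    W = torch.full((pose.numel() + betas.numel(), J * 3), 0.02)
    return (torch.cat([pose, betas]) @ W).reshape(J, 3)

def weak_perspective(joints_3d, scale, trans_2d):
    """Scaled orthographic projection of 3D joints to the image plane."""
    return scale * joints_3d[:, :2] + trans_2d

# Detected 2D keypoints with per-joint confidences (synthetic for this sketch).
kpts_2d = torch.randn(J, 2)
conf = torch.rand(J)

pose = torch.zeros(3 * J, requires_grad=True)      # axis-angle per joint
betas = torch.zeros(10, requires_grad=True)        # shape coefficients
scale = torch.ones(1, requires_grad=True)
trans_2d = torch.zeros(2, requires_grad=True)

opt = torch.optim.Adam([pose, betas, scale, trans_2d], lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    proj = weak_perspective(smpl_joints(pose, betas), scale, trans_2d)
    reproj = (conf.unsqueeze(-1) * (proj - kpts_2d) ** 2).mean()   # data term
    prior = 1e-3 * (pose ** 2).mean() + 1e-3 * (betas ** 2).mean() # simple priors
    (reproj + prior).backward()
    opt.step()
```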
This list is automatically generated from the titles and abstracts of the papers in this site.