MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation
- URL: http://arxiv.org/abs/2007.07227v2
- Date: Sat, 14 Nov 2020 19:32:45 GMT
- Title: MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation
- Authors: István Sárándi and Timm Linder and Kai O. Arras and Bastian Leibe
- Abstract summary: We propose metric-scale truncation-robust (MeTRo) volumetric heatmaps, whose dimensions are all defined in metric 3D space, instead of being aligned with image space.
This reinterpretation of heatmap dimensions allows us to directly estimate complete, metric-scale poses without test-time knowledge of distance or relying on anthropometrics, such as bone lengths.
We find that supervision via absolute pose loss is crucial for accurate non-root-relative localization.
- Score: 16.463390330757132
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Heatmap representations have formed the basis of human pose estimation
systems for many years, and their extension to 3D has been a fruitful line of
recent research. This includes 2.5D volumetric heatmaps, whose X and Y axes
correspond to image space and Z to metric depth around the subject. To obtain
metric-scale predictions, 2.5D methods need a separate post-processing step to
resolve scale ambiguity. Further, they cannot localize body joints outside the
image boundaries, leading to incomplete estimates for truncated images. To
address these limitations, we propose metric-scale truncation-robust (MeTRo)
volumetric heatmaps, whose dimensions are all defined in metric 3D space,
instead of being aligned with image space. This reinterpretation of heatmap
dimensions allows us to directly estimate complete, metric-scale poses without
test-time knowledge of distance or relying on anthropometric heuristics, such
as bone lengths. To further demonstrate the utility of our representation, we
present a differentiable combination of our 3D metric-scale heatmaps with 2D
image-space ones to estimate absolute 3D pose (our MeTRAbs architecture). We
find that supervision via absolute pose loss is crucial for accurate
non-root-relative localization. Using a ResNet-50 backbone without further
learned layers, we obtain state-of-the-art results on Human3.6M, MPI-INF-3DHP
and MuPoTS-3D. Our code will be made publicly available to facilitate further
research.
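The absolute-pose step described in the abstract can be made concrete. A minimal sketch, not the authors' released code: given a root-relative metric-scale 3D pose and 2D image-space joint estimates, the absolute translation can be recovered as the least-squares solution that best aligns the perspective projection of the metric pose with the 2D predictions. The function name, joint count, and intrinsics below are illustrative assumptions.

```python
import numpy as np

def absolute_translation(pose3d_rel, pose2d_px, K):
    """Recover the absolute metric pose from a root-relative metric pose
    and 2D image-space joint estimates (illustrative sketch).

    pose3d_rel: (J, 3) root-relative joint positions in meters
    pose2d_px:  (J, 2) joint positions in pixels
    K:          (3, 3) camera intrinsic matrix
    """
    # Normalized image coordinates: u = K^-1 [x, y, 1]^T
    ones = np.ones((len(pose2d_px), 1))
    uv = (np.linalg.inv(K) @ np.c_[pose2d_px, ones].T).T[:, :2]

    # Projection constraint per joint: (X + tx) = u * (Z + tz), so
    # tx - u*tz = u*Z - X (and analogously for y). Linear in t.
    A, b = [], []
    for (X, Y, Z), (u, v) in zip(pose3d_rel, uv):
        A.append([1.0, 0.0, -u]); b.append(u * Z - X)
        A.append([0.0, 1.0, -v]); b.append(v * Z - Y)
    t, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    return pose3d_rel + t  # absolute metric pose in camera space
```

With exact, noise-free inputs the linear system has an exact solution, so the reconstruction is exact; with noisy heatmap decodings it returns the least-squares best fit.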
Related papers
- Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences [21.057940424318314]
Given two images, we can estimate the relative camera pose between them by establishing image-to-image correspondences.
We present MicKey, a keypoint matching pipeline that is able to predict metric correspondences in 3D camera space.
arXiv Detail & Related papers (2024-04-09T14:22:50Z)
- Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction [66.10717041384625]
Zolly is the first 3DHMR method focusing on perspective-distorted images.
We propose a new camera model and a novel 2D representation, termed distortion image, which describes the 2D dense distortion scale of the human body.
We extend two real-world datasets tailored for this task, both containing perspective-distorted human images.
arXiv Detail & Related papers (2023-03-24T04:22:41Z) - Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from
Depth Maps [66.24554680709417]
Knowing the exact 3D location of workers and robots in a collaborative environment enables several real-world applications.
We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
arXiv Detail & Related papers (2022-07-06T08:52:12Z) - SPGNet: Spatial Projection Guided 3D Human Pose Estimation in Low
Dimensional Space [14.81199315166042]
We propose a method for 3D human pose estimation that mixes multi-dimensional re-projection into supervised learning.
Based on the estimation results for the dataset Human3.6M, our approach outperforms many state-of-the-art methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-06-04T00:51:00Z) - Category-Level Metric Scale Object Shape and Pose Estimation [73.92460712829188]
We propose a framework that jointly estimates a metric scale shape and pose from a single RGB image.
We validated our method on both synthetic and real-world datasets to evaluate category-level object pose and shape.
arXiv Detail & Related papers (2021-09-01T12:16:46Z) - VoxelTrack: Multi-Person 3D Human Pose Estimation and Tracking in the
Wild [98.69191256693703]
We present VoxelTrack for multi-person 3D pose estimation and tracking from a few cameras which are separated by wide baselines.
It employs a multi-branch network to jointly estimate 3D poses and re-identification (Re-ID) features for all people in the environment.
It outperforms the state-of-the-art methods by a large margin on three public datasets including Shelf, Campus and CMU Panoptic.
arXiv Detail & Related papers (2021-08-05T08:35:44Z) - Weakly-supervised Cross-view 3D Human Pose Estimation [16.045255544594625]
We propose a simple yet effective pipeline for weakly-supervised cross-view 3D human pose estimation.
Our method can achieve state-of-the-art performance in a weakly-supervised manner.
We evaluate our method on the standard benchmark dataset, Human3.6M.
arXiv Detail & Related papers (2021-05-23T08:16:25Z) - SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation [46.85865451812981]
We propose a novel system that first regresses a set of 2.5D representations of body parts and then reconstructs the 3D absolute poses based on these 2.5D representations with a depth-aware part association algorithm.
Such a single-shot bottom-up scheme allows the system to better learn and reason about the inter-person depth relationship, improving both 3D and 2D pose estimation.
arXiv Detail & Related papers (2020-08-26T09:56:07Z) - HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization [83.57863764231655]
We propose the Human Depth Estimation Network (HDNet), an end-to-end framework for absolute root joint localization.
A skeleton-based Graph Neural Network (GNN) is utilized to propagate features among joints.
We evaluate our HDNet on the root joint localization and root-relative 3D pose estimation tasks with two benchmark datasets.
arXiv Detail & Related papers (2020-07-17T12:44:23Z) - Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A
Geometric Approach [76.10879433430466]
We propose to estimate 3D human pose from multi-view images and a few IMUs attached to the person's limbs.
It operates by first detecting 2D poses from the two signals and then lifting them to 3D space.
The simple two-step approach reduces the error of the state-of-the-art by a large margin on a public dataset.
arXiv Detail & Related papers (2020-03-25T00:26:54Z)
- Metric-Scale Truncation-Robust Heatmaps for 3D Human Pose Estimation [16.463390330757132]
We propose metric-scale truncation-robust volumetric heatmaps, whose dimensions are defined in metric 3D space near the subject.
We train a fully-convolutional network to estimate such heatmaps from monocular RGB in an end-to-end manner.
As our method is simple and fast, it can become a useful component for real-time top-down multi-person pose estimation systems.
arXiv Detail & Related papers (2020-03-05T22:38:13Z)
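The metric-scale volumetric heatmaps in the MeTRo entry above are typically decoded with a differentiable soft-argmax: a softmax over the volume followed by an expectation over voxel-center coordinates, which here span a fixed metric box around the subject rather than image pixels. The function name and the 2.0 m box extent below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def soft_argmax_metric(heatmap, extent=2.0):
    """Differentiable decoding of one metric-scale volumetric heatmap.

    heatmap: (D, H, W) raw scores for a single joint; all three axes
    span a metric cube of side `extent` meters centered on the subject.
    Returns the expected 3D joint position in meters.
    """
    # Softmax over the entire volume (shifted for numerical stability)
    p = np.exp(heatmap - heatmap.max())
    p /= p.sum()

    D, H, W = heatmap.shape
    # Metric coordinates of the voxel centers along each axis
    zs = (np.arange(D) + 0.5) / D * extent - extent / 2
    ys = (np.arange(H) + 0.5) / H * extent - extent / 2
    xs = (np.arange(W) + 0.5) / W * extent - extent / 2

    # Expectation of each coordinate under the marginal distributions
    x = (p.sum(axis=(0, 1)) * xs).sum()
    y = (p.sum(axis=(0, 2)) * ys).sum()
    z = (p.sum(axis=(1, 2)) * zs).sum()
    return np.array([x, y, z])
```

Because the expectation is a weighted sum, gradients flow through the decoding, which is what allows such heatmaps to be trained end-to-end with a pose loss.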
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.