HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization
- URL: http://arxiv.org/abs/2007.08943v1
- Date: Fri, 17 Jul 2020 12:44:23 GMT
- Title: HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization
- Authors: Jiahao Lin, Gim Hee Lee
- Abstract summary: We propose the Human Depth Estimation Network (HDNet), an end-to-end framework for absolute root joint localization.
A skeleton-based Graph Neural Network (GNN) is utilized to propagate features among joints.
We evaluate our HDNet on the root joint localization and root-relative 3D pose estimation tasks with two benchmark datasets.
- Score: 83.57863764231655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current works on multi-person 3D pose estimation mainly focus on the
estimation of the 3D joint locations relative to the root joint and ignore the
absolute locations of each pose. In this paper, we propose the Human Depth
Estimation Network (HDNet), an end-to-end framework for absolute root joint
localization in the camera coordinate space. Our HDNet first estimates the 2D
human pose with heatmaps of the joints. These estimated heatmaps serve as
attention masks for pooling features from image regions corresponding to the
target person. A skeleton-based Graph Neural Network (GNN) is utilized to
propagate features among joints. We formulate the target depth regression as a
bin index estimation problem, which can be transformed with a soft-argmax
operation from the classification output of our HDNet. We evaluate our HDNet on
the root joint localization and root-relative 3D pose estimation tasks with two
benchmark datasets, i.e., Human3.6M and MuPoTS-3D. The experimental results
show that we outperform the previous state-of-the-art consistently under
multiple evaluation metrics. Our source code is available at:
https://github.com/jiahaoLjh/HumanDepth.
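The core ideas in the abstract (joint heatmaps as attention masks for feature pooling, feature propagation among joints over a skeleton graph, and depth regression cast as bin-index classification decoded with a soft-argmax) can be illustrated with a minimal PyTorch-style sketch. This is not the authors' implementation (see the linked repository for that): the layer sizes, the number and spacing of the depth bins, and the fully connected joint graph below are illustrative assumptions, and the module name DepthBinHead is made up.

```python
# Minimal sketch (not the authors' implementation): heatmap-attention pooling,
# one skeleton-graph propagation step, and depth-bin classification decoded
# with a soft-argmax. Sizes, bin range, and adjacency are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DepthBinHead(nn.Module):
    def __init__(self, num_joints=17, feat_dim=256, num_bins=64,
                 depth_min=1.0, depth_max=8.0):
        super().__init__()
        # Illustrative joint graph: fully connected with self-loops;
        # the paper uses the human-skeleton topology instead.
        adj = torch.ones(num_joints, num_joints)
        self.register_buffer("adj_norm", adj / adj.sum(dim=1, keepdim=True))
        self.gnn = nn.Linear(feat_dim, feat_dim)  # one propagation step
        self.classifier = nn.Linear(num_joints * feat_dim, num_bins)
        # Centers of the discretized depth bins (uniform here; could be log-spaced).
        self.register_buffer(
            "bin_centers", torch.linspace(depth_min, depth_max, num_bins))

    def forward(self, feat_map, heatmaps):
        """
        feat_map: (B, C, H, W) backbone features
        heatmaps: (B, J, H, W) per-joint 2D heatmaps of the target person
        returns:  (B,) estimated absolute root depth in metres
        """
        # Heatmaps as attention masks: per-joint weighted average of features.
        attn = heatmaps.flatten(2).softmax(dim=-1)                 # (B, J, H*W)
        joint_feats = attn @ feat_map.flatten(2).transpose(1, 2)   # (B, J, C)
        # Propagate features among joints over the normalized graph.
        joint_feats = F.relu(self.gnn(self.adj_norm @ joint_feats))
        # Depth as bin-index classification, decoded with a soft-argmax.
        logits = self.classifier(joint_feats.flatten(1))           # (B, num_bins)
        probs = logits.softmax(dim=-1)
        return (probs * self.bin_centers).sum(dim=-1)              # expected depth
```

Decoding the classification output with a soft-argmax (the probability-weighted sum over bin centers) keeps the depth estimate differentiable end-to-end while retaining the robustness of a discretized, classification-style target.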
Related papers
- Dual networks based 3D Multi-Person Pose Estimation from Monocular Video [42.01876518017639]
Multi-person 3D pose estimation is more challenging than single-person pose estimation.
Existing top-down and bottom-up approaches to pose estimation suffer from detection errors.
We propose the integration of top-down and bottom-up approaches to exploit their strengths.
arXiv Detail & Related papers (2022-05-02T08:53:38Z) - GCNDepth: Self-supervised Monocular Depth Estimation based on Graph Convolutional Network [11.332580333969302]
This work brings a new solution with a set of improvements, which increase the quantitative and qualitative understanding of depth maps.
A graph convolutional network (GCN) can handle the convolution on non-Euclidean data and it can be applied to irregular image regions within a topological structure.
Our method provided comparable and promising results with a high prediction accuracy of 89% on the public KITTI and Make3D datasets.
arXiv Detail & Related papers (2021-12-13T16:46:25Z) - Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z) - Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z) - On the role of depth predictions for 3D human pose estimation [0.04199844472131921]
We build a system that takes 2D joint locations and their estimated depth values as input and predicts their 3D positions in camera coordinates (a minimal back-projection sketch is given after this list).
Results are produced with a neural network that accepts a low-dimensional input and can be integrated into a real-time system.
Our system can be combined with an off-the-shelf 2D pose detector and a depth map predictor to perform 3D pose estimation in the wild.
arXiv Detail & Related papers (2021-03-03T16:51:38Z) - Beyond Weak Perspective for Monocular 3D Human Pose Estimation [6.883305568568084]
We consider the task of predicting 3D joint locations and orientations from a monocular video.
We first infer 2D joint locations with an off-the-shelf pose estimation algorithm.
We then apply the SMPLify algorithm, which takes these initial estimates as input.
arXiv Detail & Related papers (2020-09-14T16:23:14Z) - Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose [70.23652933572647]
We propose a novel graph convolutional neural network (GraphCNN)-based system that estimates the 3D coordinates of human mesh vertices directly from the 2D human pose.
We show that our Pose2Mesh outperforms the previous 3D human pose and mesh estimation methods on various benchmark datasets.
arXiv Detail & Related papers (2020-08-20T16:01:56Z) - Coherent Reconstruction of Multiple Humans from a Single Image [68.3319089392548]
In this work, we address the problem of multi-person 3D pose estimation from a single image.
A typical regression approach in the top-down setting of this problem would first detect all humans and then reconstruct each one of them independently.
Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.
arXiv Detail & Related papers (2020-06-15T17:51:45Z) - VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment [80.77351380961264]
We present an approach to estimate 3D poses of multiple people from multiple camera views.
We present an end-to-end solution that operates in 3D space and therefore avoids making incorrect decisions in 2D space.
We propose Pose Regression Network (PRN) to estimate a detailed 3D pose for each proposal.
arXiv Detail & Related papers (2020-04-13T23:50:01Z) - Multi-Person Absolute 3D Human Pose Estimation with Weak Depth Supervision [0.0]
We introduce a network that can be trained with additional RGB-D images in a weakly supervised fashion.
Our algorithm is a monocular, multi-person, absolute pose estimator.
We evaluate the algorithm on several benchmarks, showing a consistent improvement in error rates.
arXiv Detail & Related papers (2020-04-08T13:29:22Z)
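As a companion to the "On the role of depth predictions for 3D human pose estimation" entry above, here is a minimal sketch of how 2D joint locations plus per-joint depth estimates can be lifted into camera coordinates with a standard pinhole back-projection. The intrinsics and the 17-joint pose in the usage example are made up for illustration and are not taken from any of the papers listed.

```python
# Minimal pinhole back-projection sketch: lift 2D joints with estimated depths
# into camera coordinates. Intrinsics and the example pose are illustrative only.
import numpy as np


def backproject_joints(joints_2d, depths, fx, fy, cx, cy):
    """
    joints_2d: (J, 2) pixel coordinates (u, v)
    depths:    (J,)  estimated depth of each joint in metres (Z in camera frame)
    returns:   (J, 3) joint positions (X, Y, Z) in camera coordinates
    """
    u, v = joints_2d[:, 0], joints_2d[:, 1]
    z = depths
    x = (u - cx) / fx * z  # invert the pinhole projection u = fx * X / Z + cx
    y = (v - cy) / fy * z  # invert v = fy * Y / Z + cy
    return np.stack([x, y, z], axis=1)


# Hypothetical usage with made-up intrinsics and a random 17-joint pose:
joints_2d = np.random.uniform(0, 1000, size=(17, 2))
depths = np.random.uniform(2.0, 6.0, size=17)
pose_3d = backproject_joints(joints_2d, depths, fx=1145.0, fy=1145.0, cx=512.0, cy=515.0)
print(pose_3d.shape)  # (17, 3)
```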
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.