Learnable human mesh triangulation for 3D human pose and shape
estimation
- URL: http://arxiv.org/abs/2208.11251v1
- Date: Wed, 24 Aug 2022 01:11:57 GMT
- Title: Learnable human mesh triangulation for 3D human pose and shape
estimation
- Authors: Sungho Chun, Sungbum Park, Ju Yong Chang
- Abstract summary: The accuracy of joint rotation and shape estimation has received relatively little attention in the skinned multi-person linear model (SMPL)-based human mesh reconstruction from multi-view images.
We propose a two-stage method to resolve the ambiguity of joint rotation and shape reconstruction and the difficulty of network learning.
The proposed method significantly outperforms the previous works in terms of joint rotation and shape estimation, and achieves competitive performance in terms of joint location estimation.
- Score: 6.699132260402631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compared to joint position, the accuracy of joint rotation and shape
estimation has received relatively little attention in the skinned multi-person
linear model (SMPL)-based human mesh reconstruction from multi-view images. The
work in this field is broadly classified into two categories. The first
approach performs joint estimation and then produces SMPL parameters by fitting
SMPL to resultant joints. The second approach regresses SMPL parameters
directly from the input images through a convolutional neural network
(CNN)-based model. However, these approaches suffer from the lack of
information for resolving the ambiguity of joint rotation and shape
reconstruction and the difficulty of network learning. To solve the
aforementioned problems, we propose a two-stage method. The proposed method
first estimates the coordinates of mesh vertices through a CNN-based model from
input images, and acquires SMPL parameters by fitting the SMPL model to the
estimated vertices. Estimated mesh vertices provide sufficient information for
determining joint rotation and shape, and are easier to learn than SMPL
parameters. According to experiments using Human3.6M and MPI-INF-3DHP datasets,
the proposed method significantly outperforms the previous works in terms of
joint rotation and shape estimation, and achieves competitive performance in
terms of joint location estimation.
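The second stage described in the abstract is a standard model-fitting optimization. As a rough, hypothetical sketch only (not the authors' implementation), a vertex-to-SMPL fit could look like the following, assuming the `smplx` package, PyTorch, and a local SMPL model file; the function and tensor names are illustrative.
```python
# Hypothetical sketch of stage 2: fitting SMPL parameters to the mesh
# vertices predicted by a CNN in stage 1. Assumes `smplx` and a local
# SMPL model file; `pred_vertices` is an illustrative name.
import torch
import smplx

def fit_smpl_to_vertices(pred_vertices, model_path, steps=200, lr=0.01):
    """Optimize SMPL pose/shape so the model's vertices match predictions."""
    model = smplx.create(model_path, model_type="smpl")    # neutral SMPL body
    betas = torch.zeros(1, 10, requires_grad=True)         # shape coefficients
    body_pose = torch.zeros(1, 69, requires_grad=True)     # 23 joints x 3 (axis-angle)
    global_orient = torch.zeros(1, 3, requires_grad=True)  # root rotation
    optim = torch.optim.Adam([betas, body_pose, global_orient], lr=lr)
    for _ in range(steps):
        optim.zero_grad()
        out = model(betas=betas, body_pose=body_pose, global_orient=global_orient)
        # Per-vertex squared distance between the SMPL surface (6890 vertices)
        # and the CNN-predicted vertices of shape (1, 6890, 3).
        loss = ((out.vertices - pred_vertices) ** 2).sum(dim=-1).mean()
        loss.backward()
        optim.step()
    return betas.detach(), body_pose.detach(), global_orient.detach()
```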
Related papers
- SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation [74.07836010698801]
We propose an SMPL-based Transformer framework (SMPLer) for monocular 3D human shape and pose estimation.
SMPLer incorporates two key ingredients: a decoupled attention operation and an SMPL-based target representation.
Extensive experiments demonstrate the effectiveness of SMPLer against existing 3D human shape and pose estimation methods.
arXiv Detail & Related papers (2024-04-23T17:59:59Z)
- Iterative Graph Filtering Network for 3D Human Pose Estimation [5.177947445379688]
Graph convolutional networks (GCNs) have proven to be an effective approach for 3D human pose estimation.
In this paper, we introduce an iterative graph filtering framework for 3D human pose estimation.
Our approach builds upon the idea of iteratively solving graph filtering with Laplacian regularization.
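The summary names a classical operation, Laplacian-regularized graph filtering. A minimal, hedged sketch of that generic technique (not the paper's network) follows; the adjacency matrix, iteration count, and regularization weight are illustrative assumptions.
```python
# Generic Laplacian-regularized graph filtering over skeleton-joint features:
# iteratively solves (I + lam*L) Y = X with L = D - A via Jacobi updates.
import numpy as np

def laplacian_filter(X, A, lam=0.5, iters=50):
    """X: (num_joints, feat_dim) features; A: (num_joints, num_joints) adjacency."""
    d = A.sum(axis=1, keepdims=True)      # node degrees (diagonal of D)
    Y = X.copy()                          # initial estimate
    for _ in range(iters):
        # Jacobi update for (I + lam*D) Y = X + lam * A @ Y
        Y = (X + lam * (A @ Y)) / (1.0 + lam * d)
    return Y                              # smoothed joint features
```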
arXiv Detail & Related papers (2023-07-29T20:46:44Z)
- Fusionformer: Exploiting the Joint Motion Synergy with Fusion Network Based On Transformer for 3D Human Pose Estimation [1.52292571922932]
Many previous methods lack an understanding of local joint information; a cited prior work considers only the temporal relationship of a single joint.
Our proposed Fusionformer method introduces a global-temporal self-trajectory module and a cross-temporal self-trajectory module.
The results show an improvement of 2.4% MPJPE and 4.3% P-MPJPE on the Human3.6M dataset.
arXiv Detail & Related papers (2022-10-08T12:22:10Z)
- A Model for Multi-View Residual Covariances based on Perspective Deformation [88.21738020902411]
We derive a model for the covariance of the visual residuals in multi-view SfM, odometry and SLAM setups.
We validate our model with synthetic and real data and integrate it into photometric and feature-based Bundle Adjustment.
arXiv Detail & Related papers (2022-02-01T21:21:56Z)
- Adversarial Parametric Pose Prior [106.12437086990853]
We learn a prior that restricts the SMPL parameters to values that produce realistic poses via adversarial training.
We show that our learned prior covers the diversity of the real-data distribution, facilitates optimization for 3D reconstruction from 2D keypoints, and yields better pose estimates when used for regression from images.
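As a hypothetical sketch of the generic idea (a discriminator trained on real SMPL pose parameters whose score is reused as a prior term during fitting), the following is an assumption-laden illustration, not the paper's architecture.
```python
# Illustrative adversarial pose prior: a discriminator over SMPL body-pose
# parameters; its logit penalizes poses it deems unrealistic.
import torch
import torch.nn as nn

class PoseDiscriminator(nn.Module):
    def __init__(self, pose_dim=69):  # 23 joints x 3 axis-angle values
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pose_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),  # real-pose logit
        )

    def forward(self, pose):
        return self.net(pose)

def prior_loss(disc, pose):
    # Non-saturating GAN-style loss: -log D(pose), low when the pose
    # looks like a sample from the real-pose distribution.
    return nn.functional.softplus(-disc(pose)).mean()
```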
arXiv Detail & Related papers (2021-12-08T10:05:32Z) - Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages: person localization and pose estimation.
We then propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z) - 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment
Feedback Loop [128.07841893637337]
Regression-based methods have recently shown promising results in reconstructing human meshes from monocular images.
Minor deviations in parameters may lead to noticeable misalignment between the estimated meshes and image evidence.
We propose a Pyramidal Mesh Alignment Feedback (PyMAF) loop to leverage a feature pyramid and rectify the predicted parameters.
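As a loose, hypothetical illustration of one such feedback step (sampling pyramid features at the current mesh's projected points, then regressing a parameter correction), the module names and shapes below are assumptions, not PyMAF's actual architecture.
```python
# Sketch of a mesh-alignment feedback step: sample image features at
# projected mesh points and regress a residual parameter update.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeedbackStep(nn.Module):
    def __init__(self, feat_channels, n_points, param_dim):
        super().__init__()
        self.regressor = nn.Linear(feat_channels * n_points, param_dim)

    def forward(self, feat_map, points_2d, params):
        # feat_map: (B, C, H, W); points_2d: (B, N, 2) mesh points
        # projected into normalized [-1, 1] image coordinates.
        grid = points_2d.unsqueeze(2)                          # (B, N, 1, 2)
        sampled = F.grid_sample(feat_map, grid, align_corners=False)
        sampled = sampled.squeeze(-1).flatten(1)               # (B, C*N)
        return params + self.regressor(sampled)                # rectified params
```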
arXiv Detail & Related papers (2021-03-30T17:07:49Z) - Beyond Weak Perspective for Monocular 3D Human Pose Estimation [6.883305568568084]
We consider the task of predicting 3D joint locations and orientations from a monocular video.
We first infer 2D joint locations with an off-the-shelf pose estimation algorithm.
We then apply the SMPLify algorithm, which takes those initial estimates as input.
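A minimal sketch of the generic SMPLify-style 2D reprojection objective under a full perspective camera, which this paper advocates over the weak-perspective approximation; the intrinsics and variable names are illustrative assumptions.
```python
# Confidence-weighted 2D reprojection loss with a pinhole (full perspective)
# camera model; joints_3d are assumed to be in the camera frame.
import torch

def perspective_project(joints_3d, focal, center):
    """joints_3d: (N, 3); focal: (fx, fy); center: (cx, cy) in pixels."""
    x = joints_3d[:, 0] / joints_3d[:, 2]
    y = joints_3d[:, 1] / joints_3d[:, 2]
    return torch.stack([focal[0] * x + center[0],
                        focal[1] * y + center[1]], dim=-1)

def reprojection_loss(joints_3d, joints_2d, conf, focal, center):
    # Weight each joint's squared pixel error by its 2D detection confidence.
    proj = perspective_project(joints_3d, focal, center)
    return (conf * ((proj - joints_2d) ** 2).sum(dim=-1)).mean()
```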
arXiv Detail & Related papers (2020-09-14T16:23:14Z) - HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose
and Shape Estimation [60.35776484235304]
This work attempts to address the uncertainty of lifting the detected 2D joints to the 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets).
The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part.
A Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression.
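A hypothetical sketch of building one such heatmap triplet for a single body part, in the spirit of the summary: three channels (behind / same depth / in front) with a Gaussian placed in the channel matching the end-joints' relative depth. The thresholds and layout are assumptions, not the paper's exact specification.
```python
# Illustrative HEMlets-style target: encode the child joint's depth relative
# to its parent as a 3-channel polarity heatmap.
import numpy as np

def gaussian_2d(h, w, cx, cy, sigma=2.0):
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def hemlet_triplet(parent_xyz, child_xyz, h=64, w=64, z_tol=0.05):
    """parent_xyz, child_xyz: (x_px, y_px, z); returns (3, h, w) heatmaps."""
    triplet = np.zeros((3, h, w), dtype=np.float32)
    dz = child_xyz[2] - parent_xyz[2]
    # Channel 0: child behind parent; 1: roughly equal depth; 2: in front.
    polarity = 1 if abs(dz) < z_tol else (0 if dz < 0 else 2)
    triplet[polarity] = gaussian_2d(h, w, child_xyz[0], child_xyz[1])
    return triplet
```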
arXiv Detail & Related papers (2020-03-10T04:03:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.