I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human
Pose and Mesh Estimation from a Single RGB Image
- URL: http://arxiv.org/abs/2008.03713v2
- Date: Sun, 1 Nov 2020 11:39:58 GMT
- Title: I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human
Pose and Mesh Estimation from a Single RGB Image
- Authors: Gyeongsik Moon and Kyoung Mu Lee
- Abstract summary: We propose I2L-MeshNet, an image-to-lixel (line+pixel) prediction network.
The proposed I2L-MeshNet predicts the per-lixel likelihood on 1D heatmaps for each mesh vertex coordinate instead of directly regressing the parameters.
Our lixel-based 1D heatmap preserves the spatial relationship in the input image and models the prediction uncertainty.
- Score: 79.040930290399
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most of the previous image-based 3D human pose and mesh estimation methods
estimate parameters of the human mesh model from an input image. However,
directly regressing the parameters from the input image is a highly non-linear
mapping because it breaks the spatial relationship between pixels in the input
image. In addition, it cannot model the prediction uncertainty, which can make
training harder. To resolve the above issues, we propose I2L-MeshNet, an
image-to-lixel (line+pixel) prediction network. The proposed I2L-MeshNet
predicts the per-lixel likelihood on 1D heatmaps for each mesh vertex
coordinate instead of directly regressing the parameters. Our lixel-based 1D
heatmap preserves the spatial relationship in the input image and models the
prediction uncertainty. We demonstrate the benefit of the image-to-lixel
prediction and show that the proposed I2L-MeshNet outperforms previous methods.
The code is publicly available at https://github.com/mks0601/I2L-MeshNet_RELEASE.
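As a rough illustration of the image-to-lixel idea described in the abstract, the sketch below converts a per-lixel likelihood on a 1D heatmap into a continuous vertex coordinate with a soft-argmax. The PyTorch-style code, tensor shapes, and the 64-lixel resolution are illustrative assumptions rather than the authors' exact implementation (see the linked repository for that).

```python
# Minimal sketch (not the authors' code): turning 1D "lixel" heatmaps into
# continuous per-vertex coordinates via a differentiable soft-argmax.
import torch
import torch.nn.functional as F

def lixel_soft_argmax(heatmap_1d):
    """heatmap_1d: (batch, num_vertices, num_lixels) raw scores along one axis.
    Returns continuous coordinates in lixel units, shape (batch, num_vertices)."""
    prob = F.softmax(heatmap_1d, dim=2)                       # per-lixel likelihood
    positions = torch.arange(heatmap_1d.shape[2],
                             dtype=prob.dtype,
                             device=prob.device)              # 0 .. num_lixels-1
    return (prob * positions).sum(dim=2)                      # expected lixel position

# Toy usage: 2 samples, 6890 SMPL vertices, 64 lixels along the x axis
# (all sizes are assumptions for illustration only).
scores_x = torch.randn(2, 6890, 64)
coords_x = lixel_soft_argmax(scores_x)   # differentiable, keeps the uncertainty in prob
print(coords_x.shape)                    # torch.Size([2, 6890])
```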
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z) - Personalized 3D Human Pose and Shape Refinement [19.082329060985455]
Regression-based methods have dominated the field of 3D human pose and shape estimation.
We propose to construct dense correspondences between initial human model estimates and the corresponding images.
We show that our approach not only consistently leads to better image-model alignment, but also to improved 3D accuracy.
arXiv Detail & Related papers (2024-03-18T10:13:53Z) - VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space [43.368963897752664]
This work introduces a novel paradigm to address the Human Pose and Shape Estimation problem.
Instead of predicting body model parameters, we focus on predicting the proposed discrete latent representation.
The proposed model, VQ-HPS, predicts the discrete latent representation of the mesh.
arXiv Detail & Related papers (2023-12-13T17:08:38Z) - Adversarial Parametric Pose Prior [106.12437086990853]
We learn a prior that restricts the SMPL parameters to values that produce realistic poses via adversarial training.
We show that our learned prior covers the diversity of the real-data distribution, facilitates optimization for 3D reconstruction from 2D keypoints, and yields better pose estimates when used for regression from images.
arXiv Detail & Related papers (2021-12-08T10:05:32Z) - 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment
Feedback Loop [128.07841893637337]
Regression-based methods have recently shown promising results in reconstructing human meshes from monocular images.
Minor deviations in parameters may lead to noticeable misalignment between the estimated meshes and the image evidence.
We propose a Pyramidal Mesh Alignment Feedback (PyMAF) loop to leverage a feature pyramid and rectify the predicted parameters.
arXiv Detail & Related papers (2021-03-30T17:07:49Z) - LoopReg: Self-supervised Learning of Implicit Surface Correspondences,
Pose and Shape for 3D Human Mesh Registration [123.62341095156611]
LoopReg is an end-to-end learning framework to register a corpus of scans to a common 3D human model.
A backward map, parameterized by a Neural Network, predicts the correspondence from every scan point to the surface of the human model.
A forward map, parameterized by a human model, transforms the corresponding points back to the scan based on the model parameters (pose and shape).
arXiv Detail & Related papers (2020-10-23T14:39:50Z) - Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh
Recovery from a 2D Human Pose [70.23652933572647]
We propose a novel graph convolutional neural network (GraphCNN)-based system that estimates the 3D coordinates of human mesh vertices directly from the 2D human pose (see the brief graph-convolution sketch after this list).
We show that our Pose2Mesh outperforms the previous 3D human pose and mesh estimation methods on various benchmark datasets.
arXiv Detail & Related papers (2020-08-20T16:01:56Z) - JGR-P2O: Joint Graph Reasoning based Pixel-to-Offset Prediction Network
for 3D Hand Pose Estimation from a Single Depth Image [28.753759115780515]
State-of-the-art methods for 3D hand pose estimation from a single depth image are based on dense predictions.
A novel pixel-wise prediction-based method is proposed to address shortcomings of these dense-prediction approaches.
The proposed model is implemented with an efficient 2D fully convolutional network backbone and has only about 1.4M parameters.
arXiv Detail & Related papers (2020-07-09T08:57:19Z)
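The Pose2Mesh entry above mentions a GraphCNN that regresses 3D mesh vertex coordinates from a 2D human pose. Below is a minimal, self-contained sketch of a single graph-convolution layer over mesh vertices; the normalized adjacency matrix, feature sizes, and the toy 4-vertex ring are illustrative assumptions and do not reproduce Pose2Mesh's actual architecture.

```python
# Minimal sketch (not Pose2Mesh's implementation): one graph-convolution layer
# over mesh vertices, assuming a precomputed, normalized vertex adjacency matrix.
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)        # (V, V) normalized adjacency
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):                        # x: (batch, V, in_dim)
        x = torch.matmul(self.adj, x)            # aggregate features of neighboring vertices
        return torch.relu(self.linear(x))        # per-vertex feature transform

# Toy usage: 4 mesh vertices connected in a ring, pose-derived features in,
# 3D coordinates out (all sizes are illustrative).
adj = torch.tensor([[0.50, 0.25, 0.00, 0.25],
                    [0.25, 0.50, 0.25, 0.00],
                    [0.00, 0.25, 0.50, 0.25],
                    [0.25, 0.00, 0.25, 0.50]])
layer = GraphConv(in_dim=8, out_dim=3, adj=adj)
coords = layer(torch.randn(1, 4, 8))             # shape (1, 4, 3)
```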
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.