THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers
- URL: http://arxiv.org/abs/2106.09336v1
- Date: Thu, 17 Jun 2021 09:09:24 GMT
- Title: THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers
- Authors: Mihai Zanfir, Andrei Zanfir, Eduard Gabriel Bazavan, William T.
Freeman, Rahul Sukthankar and Cristian Sminchisescu
- Abstract summary: THUNDR is a transformer-based deep neural network methodology to reconstruct the 3d pose and shape of people.
We show state-of-the-art results on Human3.6M and 3DPW, for both the fully-supervised and the self-supervised models.
We observe very solid 3d reconstruction performance for difficult human poses collected in the wild.
- Score: 67.8628917474705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present THUNDR, a transformer-based deep neural network methodology to
reconstruct the 3d pose and shape of people, given monocular RGB images. Key to
our methodology is an intermediate 3d marker representation, where we aim to
combine the predictive power of model-free output architectures and the
regularizing, anthropometrically-preserving properties of a statistical human
surface model like GHUM -- a recently introduced, expressive full body
statistical 3d human model, trained end-to-end. Our novel transformer-based
prediction pipeline can focus on image regions relevant to the task, supports
self-supervised regimes, and ensures that solutions are consistent with human
anthropometry. We show state-of-the-art results on Human3.6M and 3DPW, for both
the fully-supervised and the self-supervised models, for the task of inferring
3d human shape, joint positions, and global translation. Moreover, we observe
very solid 3d reconstruction performance for difficult human poses collected in
the wild.
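The interplay the abstract describes — free-form marker predictions regularized by a statistical body model — can be illustrated with a minimal numpy sketch. Everything below is a stand-in, not THUNDR's actual components: a random linear blend-shape basis plays the role of GHUM, noisy markers play the role of the transformer's output, and a least-squares projection plays the role of the anthropometry-preserving step.

```python
import numpy as np

rng = np.random.default_rng(0)

N_MARKERS, N_SHAPE = 50, 10          # illustrative sizes, not GHUM's
mean_markers = rng.normal(size=(N_MARKERS, 3))          # template marker layout
blend_shapes = rng.normal(size=(N_SHAPE, N_MARKERS, 3)) # linear shape basis

def model_markers(beta):
    """Markers produced by the statistical model for shape coefficients beta."""
    return mean_markers + np.tensordot(beta, blend_shapes, axes=1)

def fit_shape(predicted_markers):
    """Least-squares projection of free-form marker predictions onto the
    span of the statistical model (the regularizing step)."""
    A = blend_shapes.reshape(N_SHAPE, -1).T          # (3 * N_MARKERS, N_SHAPE)
    b = (predicted_markers - mean_markers).ravel()
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return beta

# A network-free stand-in for the transformer's per-image marker output:
true_beta = rng.normal(size=N_SHAPE)
noisy_markers = model_markers(true_beta) + 0.01 * rng.normal(size=(N_MARKERS, 3))

beta_hat = fit_shape(noisy_markers)
print("recovered shape close to true:", np.allclose(beta_hat, true_beta, atol=0.05))
```

The point of the sketch is the division of labor: the markers are unconstrained 3d predictions, while the projection guarantees the final reconstruction lies on the statistical model's shape space.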
Related papers
- Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation [32.30055363306321]
We propose a paradigm for seamlessly unifying different human pose and shape-related tasks and datasets.
Our formulation is centered on the ability to query any arbitrary point of the human volume, and obtain its estimated location in 3D.
arXiv Detail & Related papers (2024-07-10T10:44:18Z)
- ConvFormer: Parameter Reduction in Transformer Models for 3D Human Pose Estimation by Leveraging Dynamic Multi-Headed Convolutional Attention [0.0]
ConvFormer is a novel convolutional transformer for the 3D human pose estimation task.
We have validated our method on three common benchmark datasets: Human3.6M, MPI-INF-3DHP, and HumanEva.
arXiv Detail & Related papers (2023-04-04T22:23:50Z)
- A Modular Multi-stage Lightweight Graph Transformer Network for Human Pose and Shape Estimation from 2D Human Pose [4.598337780022892]
We introduce a pose-based human mesh reconstruction approach that prioritizes computational efficiency without sacrificing reconstruction accuracy.
Our method consists of a 2D-to-3D lifter module that utilizes graph transformers to analyze structured and implicit joint correlations in 2D human poses, and a mesh regression module that combines the extracted pose features with a mesh template to produce the final human mesh parameters.
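The 2D-to-3D lifting step described above can be sketched, under simplified assumptions, as a single graph-convolution layer over a skeleton followed by a per-joint linear head to 3D. The 5-joint skeleton and the random, untrained weights below are hypothetical stand-ins for the paper's graph transformer.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 5-joint skeleton: a pelvis-spine chain plus two limbs.
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
N_JOINTS = 5

# Row-normalized adjacency with self-loops, as in standard graph convolutions.
A = np.eye(N_JOINTS)
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
A = A / A.sum(axis=1, keepdims=True)

W_gc = rng.normal(scale=0.1, size=(2, 16))   # graph-conv weights (untrained)
W_out = rng.normal(scale=0.1, size=(16, 3))  # per-joint regression head to 3D

def lift_2d_to_3d(pose_2d):
    """One graph-conv step mixing each joint with its neighbours,
    then a per-joint linear head producing a 3D estimate."""
    h = np.maximum(A @ pose_2d @ W_gc, 0.0)  # aggregate neighbours + ReLU
    return h @ W_out                         # (N_JOINTS, 3)

pose_2d = rng.normal(size=(N_JOINTS, 2))
pose_3d = lift_2d_to_3d(pose_2d)
print(pose_3d.shape)
```

The structured joint correlations the entry mentions live in the adjacency matrix; a trained model would stack many such layers with learned attention rather than a fixed `A`.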
arXiv Detail & Related papers (2023-01-31T04:42:47Z)
- LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable with disentangled shape and pose latent spaces.
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
arXiv Detail & Related papers (2021-11-30T04:10:57Z)
- 3D Human Pose Estimation with Spatial and Temporal Transformers [59.433208652418976]
We present PoseFormer, a purely transformer-based approach for 3D human pose estimation in videos.
Inspired by recent developments in vision transformers, we design a spatial-temporal transformer structure.
We quantitatively and qualitatively evaluate our method on two popular and standard benchmark datasets.
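PoseFormer's spatial-temporal design can be sketched, under simplified assumptions, as scaled dot-product self-attention applied first across joints within a frame and then across frames for each joint. Multi-head projections and learned weights are omitted, and all sizes are illustrative; this is not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

def attention(x):
    """Scaled dot-product self-attention (single head, no projections)."""
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)             # softmax over keys
    return w @ x

T, J, D = 8, 17, 32          # frames, joints, embedding dim (illustrative)
tokens = rng.normal(size=(T, J, D))

spatial = attention(tokens)                       # joints attend within each frame
temporal = attention(np.swapaxes(spatial, 0, 1))  # each joint attends across frames
out = np.swapaxes(temporal, 0, 1)                 # back to (T, J, D)
print(out.shape)
```

Factoring attention this way keeps the cost at O(J^2) per frame plus O(T^2) per joint, instead of O((T * J)^2) for full joint-time attention.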
arXiv Detail & Related papers (2021-03-18T18:14:37Z)
- 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data [77.57798334776353]
We consider the problem of obtaining dense 3D reconstructions of humans from single and partially occluded views.
We suggest that ambiguities can be modelled more effectively by parametrizing the possible body shapes and poses.
We show that our method outperforms alternative approaches in ambiguous pose recovery on standard benchmarks for 3D humans.
arXiv Detail & Related papers (2020-11-02T13:55:31Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image.
We rely on GHUM, a recently introduced, expressive full-body statistical 3d human model, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- PaMIR: Parametric Model-Conditioned Implicit Representation for Image-based Human Reconstruction [67.08350202974434]
We propose Parametric Model-Conditioned Implicit Representation (PaMIR), which combines the parametric body model with the free-form deep implicit function.
We show that our method achieves state-of-the-art performance for image-based 3D human reconstruction in the cases of challenging poses and clothing types.
arXiv Detail & Related papers (2020-07-08T02:26:19Z)
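The combination PaMIR describes — a parametric body prior conditioning a free-form implicit function — can be sketched as an occupancy query that adds a residual term to a coarse analytic prior. Here a unit sphere stands in for the parametric body and an untrained MLP stands in for the image-conditioned network; both are illustrative assumptions, not the paper's components.

```python
import numpy as np

rng = np.random.default_rng(3)

W1 = rng.normal(scale=0.1, size=(3, 32))   # untrained MLP weights (stand-in)
W2 = rng.normal(scale=0.1, size=(32, 1))

def parametric_occupancy(p):
    """Coarse prior: inside/outside a unit sphere standing in for the body model."""
    return (np.linalg.norm(p, axis=-1, keepdims=True) < 1.0).astype(float)

def residual(p):
    """Free-form correction (in PaMIR this is an image-conditioned network)."""
    return np.tanh(np.maximum(p @ W1, 0.0) @ W2)

def occupancy(p):
    """Model-conditioned implicit function: parametric prior plus free-form
    residual, clipped back to a valid occupancy range."""
    return np.clip(parametric_occupancy(p) + 0.1 * residual(p), 0.0, 1.0)

points = rng.normal(size=(100, 3))
occ = occupancy(points)
print(occ.shape)
```

The design point is that the parametric term keeps queries anatomically plausible while the residual captures geometry the body model cannot, such as clothing.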
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.