End-to-End Human Pose and Mesh Reconstruction with Transformers
- URL: http://arxiv.org/abs/2012.09760v2
- Date: Sun, 28 Mar 2021 01:20:42 GMT
- Title: End-to-End Human Pose and Mesh Reconstruction with Transformers
- Authors: Kevin Lin, Lijuan Wang, Zicheng Liu
- Abstract summary: We present a new method, called MEsh TRansfOrmer (METRO), to reconstruct 3D human pose and mesh vertices from a single image.
METRO does not rely on any parametric mesh models like SMPL, and can thus be easily extended to other objects such as hands.
We demonstrate the generalizability of METRO to 3D hand reconstruction in the wild, outperforming existing state-of-the-art methods on the FreiHAND dataset.
- Score: 17.75480888764098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new method, called MEsh TRansfOrmer (METRO), to reconstruct 3D
human pose and mesh vertices from a single image. Our method uses a transformer
encoder to jointly model vertex-vertex and vertex-joint interactions, and
outputs 3D joint coordinates and mesh vertices simultaneously. Compared to
existing techniques that regress pose and shape parameters, METRO does not rely
on any parametric mesh models like SMPL, and can thus be easily extended to
other objects such as hands. We further relax the mesh topology and allow the
transformer self-attention mechanism to freely attend between any two vertices,
making it possible to learn non-local relationships among mesh vertices and
joints. With the proposed masked vertex modeling, our method is more robust and
effective in handling challenging situations like partial occlusions. METRO
generates new state-of-the-art results for human mesh reconstruction on the
public Human3.6M and 3DPW datasets. Moreover, we demonstrate the
generalizability of METRO to 3D hand reconstruction in the wild, outperforming
existing state-of-the-art methods on the FreiHAND dataset.
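To make the architecture description above concrete, below is a minimal PyTorch-style sketch of a METRO-like encoder: one query token per joint and per coarse mesh vertex, each conditioned on a global image feature, unconstrained self-attention among all tokens, and a simple form of masked vertex modeling that drops the image conditioning of selected tokens. The backbone feature size (2048), the joint and coarse-vertex counts (14 and 431), and the exact masking mechanics are assumptions for illustration only, not the authors' implementation.
```python
# Minimal sketch of a METRO-style transformer (not the authors' code).
# Assumptions for illustration: a CNN backbone gives one 2048-d global image feature,
# 14 body joints, and a coarse mesh of 431 vertices that a later stage would upsample.
import torch
import torch.nn as nn

class MeshTransformerSketch(nn.Module):
    def __init__(self, img_feat_dim=2048, hidden_dim=256,
                 num_joints=14, num_vertices=431, num_layers=4, num_heads=4):
        super().__init__()
        self.num_joints = num_joints
        # One learnable query token per joint and per coarse mesh vertex.
        self.queries = nn.Embedding(num_joints + num_vertices, hidden_dim)
        # Project the global image feature so it can condition every token.
        self.img_proj = nn.Linear(img_feat_dim, hidden_dim)
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads,
                                           batch_first=True)
        # No attention mask: any vertex token may attend to any other vertex or joint,
        # which is the "relaxed mesh topology" idea described in the abstract.
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.coord_head = nn.Linear(hidden_dim, 3)  # regress (x, y, z) per token

    def forward(self, img_feat, vertex_mask=None):
        # img_feat: (B, img_feat_dim); vertex_mask: optional (B, J+V) bool tensor,
        # True where the image conditioning is dropped (masked vertex modeling,
        # as we read it from the abstract -- the paper's exact scheme may differ).
        B = img_feat.size(0)
        tokens = self.queries.weight.unsqueeze(0).expand(B, -1, -1)
        cond = self.img_proj(img_feat).unsqueeze(1)             # (B, 1, hidden_dim)
        if vertex_mask is not None:
            cond = cond * (~vertex_mask).unsqueeze(-1).float()  # zero out masked tokens
        out = self.encoder(tokens + cond)
        coords = self.coord_head(out)
        joints = coords[:, :self.num_joints]      # (B, 14, 3) 3D joint coordinates
        vertices = coords[:, self.num_joints:]    # (B, 431, 3) coarse mesh vertices
        return joints, vertices

# Shape check only: a batch of two global image features.
model = MeshTransformerSketch()
joints, vertices = model(torch.randn(2, 2048))
print(joints.shape, vertices.shape)  # torch.Size([2, 14, 3]) torch.Size([2, 431, 3])
```
The sketch mirrors only the abstract-level ideas; details such as how the image feature is constructed and how the coarse mesh is upsampled to the full-resolution mesh are omitted.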
Related papers
- PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance [66.40153183581894]
We introduce a generic and scalable mesh generation framework PivotMesh.
PivotMesh makes an initial attempt to extend native mesh generation to large-scale datasets.
We show that PivotMesh can generate compact and sharp 3D meshes across various categories.
arXiv Detail & Related papers (2024-05-27T07:13:13Z)
- Decaf: Monocular Deformation Capture for Face and Hand Interactions [77.75726740605748]
This paper introduces the first method that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos.
We model hands as articulated objects inducing non-rigid face deformations during an active interaction.
Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system.
arXiv Detail & Related papers (2023-09-28T17:59:51Z)
- Sampling is Matter: Point-guided 3D Human Mesh Reconstruction [0.0]
This paper presents a simple yet powerful method for 3D human mesh reconstruction from a single RGB image.
Experimental results on benchmark datasets show that the proposed method efficiently improves the performance of 3D human mesh reconstruction.
arXiv Detail & Related papers (2023-04-19T08:45:26Z)
- MeMaHand: Exploiting Mesh-Mano Interaction for Single Image Two-Hand Reconstruction [19.82874341207336]
We propose to reconstruct meshes and estimate MANO parameters of two hands from a single RGB image simultaneously.
MMIB consists of one graph residual block to aggregate local information and two transformer encoders to model long-range dependencies (a generic sketch of this graph-plus-transformer pattern appears after this list).
Experiments on the InterHand2.6M benchmark demonstrate promising results compared with state-of-the-art hand reconstruction methods.
arXiv Detail & Related papers (2023-03-28T04:06:02Z)
- Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis [4.597864989500202]
We propose a new method for reconstructing implicit 3D human models from sparse multi-view RGB videos.
Our method defines the neural scene representation on the mesh surface points and signed distances from the surface of a human body mesh.
arXiv Detail & Related papers (2022-01-05T16:25:32Z)
- Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes using the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer that can efficiently perceive global geometric inconsistencies in 3D structure.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z)
- KAMA: 3D Keypoint Aware Body Mesh Articulation [79.04090630502782]
We propose an analytical solution to articulate a parametric body model, SMPL, via a set of straightforward geometric transformations.
Our approach offers significantly better alignment to image content when compared to state-of-the-art approaches.
Results on the challenging 3DPW and Human3.6M benchmarks demonstrate that our approach yields state-of-the-art body mesh fittings.
arXiv Detail & Related papers (2021-04-27T23:01:03Z)
- Mesh Graphormer [17.75480888764098]
We present a graph-convolution-reinforced transformer, named Mesh Graphormer, for 3D human pose and mesh reconstruction from a single image.
arXiv Detail & Related papers (2021-04-01T06:16:36Z)
- Im2Mesh GAN: Accurate 3D Hand Mesh Recovery from a Single RGB Image [31.371190180801452]
We show that the hand mesh can be learned directly from the input image.
We propose a new type of GAN called Im2Mesh GAN to learn the mesh through end-to-end adversarial training.
arXiv Detail & Related papers (2021-01-27T07:38:01Z)
- Neural Mesh Flow: 3D Manifold Mesh Generation via Diffeomorphic Flows [79.39092757515395]
We propose Neural Mesh Flow (NMF) to generate two-manifold meshes for genus-0 shapes.
NMF is a shape auto-encoder consisting of several Neural Ordinary Differential Equation (NODE) blocks that learn accurate mesh geometry by progressively deforming a spherical mesh.
Our experiments demonstrate that NMF facilitates several applications such as single-view mesh reconstruction, global shape parameterization, texture mapping, shape deformation and correspondence.
arXiv Detail & Related papers (2020-07-21T17:45:41Z)
- Learning Nonparametric Human Mesh Reconstruction from a Single Image without Ground Truth Meshes [56.27436157101251]
We propose a novel approach to learn human mesh reconstruction without any ground truth meshes.
This is made possible by introducing two new terms into the loss function of a graph convolutional neural network (Graph CNN).
arXiv Detail & Related papers (2020-02-28T20:30:07Z)
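Two of the entries above (MeMaHand's MMIB and Mesh Graphormer) combine local graph aggregation with transformer self-attention over mesh tokens. The sketch below shows one generic way such a block could be composed; it is built only from the one-line summaries above, so the structure, the 256-d token size, the 778-token default (one MANO hand mesh), and the identity placeholder adjacency are illustrative assumptions rather than either paper's implementation.
```python
# Illustrative graph-plus-transformer block: local graph aggregation, then
# transformer encoders for long-range context. Structure and dimensions are assumptions.
import torch
import torch.nn as nn

class GraphResidualBlock(nn.Module):
    """Aggregate neighborhood features via a fixed, row-normalized adjacency matrix."""
    def __init__(self, dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)   # (N, N) vertex connectivity
        self.lin = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                  # x: (B, N, dim) per-vertex features
        agg = torch.matmul(self.adj, self.lin(x))  # local neighborhood aggregation
        return self.norm(x + agg)          # residual connection

class GraphTransformerBlockSketch(nn.Module):
    """Graph residual block followed by transformer encoders for long-range context."""
    def __init__(self, dim=256, num_heads=4, num_tokens=778, num_encoders=2, adj=None):
        super().__init__()
        adj = adj if adj is not None else torch.eye(num_tokens)  # placeholder adjacency
        self.graph_block = GraphResidualBlock(dim, adj)
        self.encoders = nn.ModuleList(
            nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True),
                num_layers=1)
            for _ in range(num_encoders))

    def forward(self, x):                  # x: (B, N, dim) per-vertex tokens
        x = self.graph_block(x)            # local information from mesh neighbors
        for enc in self.encoders:          # long-range dependencies via self-attention
            x = enc(x)
        return x

# Shape check only.
block = GraphTransformerBlockSketch()
print(block(torch.randn(2, 778, 256)).shape)  # torch.Size([2, 778, 256])
```
A real adjacency matrix would come from the mesh topology (row-normalized vertex connectivity); the identity placeholder only keeps the sketch self-contained and runnable.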