PyMAF-X: Towards Well-aligned Full-body Model Regression from Monocular
Images
- URL: http://arxiv.org/abs/2207.06400v3
- Date: Fri, 28 Apr 2023 02:33:10 GMT
- Title: PyMAF-X: Towards Well-aligned Full-body Model Regression from Monocular
Images
- Authors: Hongwen Zhang, Yating Tian, Yuxiang Zhang, Mengcheng Li, Liang An,
Zhenan Sun, Yebin Liu
- Abstract summary: PyMAF-X is a regression-based approach to recovering parametric full-body models from monocular images.
PyMAF and PyMAF-X effectively improve the mesh-image alignment and achieve new state-of-the-art results.
- Score: 60.33197938330409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present PyMAF-X, a regression-based approach to recovering parametric
full-body models from monocular images. This task is very challenging since
minor parametric deviation may lead to noticeable misalignment between the
estimated mesh and the input image. Moreover, when integrating part-specific
estimations into the full-body model, existing solutions tend to either degrade
the alignment or produce unnatural wrist poses. To address these issues, we
propose a Pyramidal Mesh Alignment Feedback (PyMAF) loop in our regression
network for well-aligned human mesh recovery and extend it as PyMAF-X for the
recovery of expressive full-body models. The core idea of PyMAF is to leverage
a feature pyramid and rectify the predicted parameters explicitly based on the
mesh-image alignment status. Specifically, given the currently predicted
parameters, mesh-aligned evidence will be extracted from finer-resolution
features accordingly and fed back for parameter rectification. To enhance the
alignment perception, an auxiliary dense supervision is employed to provide
mesh-image correspondence guidance while spatial alignment attention is
introduced to enable the awareness of the global contexts for our network. When
extending PyMAF for full-body mesh recovery, an adaptive integration strategy
is proposed in PyMAF-X to produce natural wrist poses while maintaining the
well-aligned performance of the part-specific estimations. The efficacy of our
approach is validated on several benchmark datasets for body, hand, face, and
full-body mesh recovery, where PyMAF and PyMAF-X effectively improve the
mesh-image alignment and achieve new state-of-the-art results. The project page
with code and video results can be found at https://www.liuyebin.com/pymaf-x.
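The feedback loop described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the body model is replaced by a hypothetical 2D stand-in (`toy_mesh_points`), the regressors are untrained random matrices, and sampling is nearest-neighbour. All function names and shapes here are assumptions made for illustration. The sketch only shows the control flow: project the current estimate, sample mesh-aligned features from a progressively finer pyramid level, and apply a residual update to the parameters.

```python
import numpy as np

def sample_features(feature_map, points_2d):
    """Nearest-neighbour sample a (C, H, W) map at points given in [-1, 1] (x, y)."""
    c, h, w = feature_map.shape
    xs = np.clip(np.rint((points_2d[:, 0] + 1) / 2 * (w - 1)), 0, w - 1).astype(int)
    ys = np.clip(np.rint((points_2d[:, 1] + 1) / 2 * (h - 1)), 0, h - 1).astype(int)
    # feature_map[:, ys, xs] has shape (C, N); flatten to one evidence vector.
    return feature_map[:, ys, xs].T.reshape(-1)

def toy_mesh_points(params, template):
    """Hypothetical stand-in for a parametric model plus camera projection:
    params = (scale, tx, ty) applied to fixed 2D template points."""
    scale, tx, ty = params
    return template * scale + np.array([tx, ty])

def feedback_loop(pyramid, regressors, template, init_params):
    """Refine params once per pyramid level, from coarse to fine."""
    params = init_params.copy()
    history = [params.copy()]
    for feature_map, weights in zip(pyramid, regressors):
        points = toy_mesh_points(params, template)       # where the mesh currently lands
        evidence = sample_features(feature_map, points)  # mesh-aligned evidence
        x = np.concatenate([evidence, params])
        params = params + weights @ x                    # residual rectification
        history.append(params.copy())
    return params, history

rng = np.random.default_rng(0)
channels, num_points = 4, 10
# Coarse-to-fine pyramid: 8x8, 16x16, 32x32 feature maps (random placeholders).
pyramid = [rng.standard_normal((channels, s, s)) for s in (8, 16, 32)]
template = rng.uniform(-0.5, 0.5, size=(num_points, 2))
# Untrained placeholder regressors, one per pyramid level.
regressors = [0.01 * rng.standard_normal((3, num_points * channels + 3)) for _ in pyramid]
params, history = feedback_loop(pyramid, regressors, template, np.array([1.0, 0.0, 0.0]))
```

In a trained system the regressors would be learned and the sampling would typically be differentiable; the point of the sketch is only that each iteration reads features at the locations implied by the current estimate, so misalignment directly changes the evidence used for the next update.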
Related papers
- Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection [1.0358639819750703]
In unsupervised anomaly detection (UAD) research, it is necessary to develop a computationally efficient and scalable solution.
We revisit the reconstruction-by-inpainting approach and improve it by analyzing its strengths and weaknesses.
We propose Feature Attenuation of Defective Representation (FADeR), which employs only two layers that attenuate the feature information of anomaly reconstruction.
arXiv Detail & Related papers (2024-07-05T15:44:53Z)
- PO-MSCKF: An Efficient Visual-Inertial Odometry by Reconstructing the Multi-State Constrained Kalman Filter with the Pose-only Theory [0.0]
Visual-Inertial Odometry (VIO) is crucial for payload-constrained robots.
We propose to reconstruct the MSCKF VIO with the novel Pose-Only (PO) multi-view geometry description.
The new filter does not require any feature position information, which removes the associated computational cost and linearization errors.
arXiv Detail & Related papers (2024-07-02T02:18:35Z)
- Delving Deep into Pixel Alignment Feature for Accurate Multi-view Human
Mesh Recovery [37.57922952189394]
We present Pixel-aligned Feedback Fusion (PaFF) for accurate yet efficient human mesh recovery from multi-view images.
PaFF is an iterative regression framework that performs feature extraction and fusion alternately.
The efficacy of our method is validated on the Human3.6M dataset via comprehensive ablation experiments.
arXiv Detail & Related papers (2023-01-15T05:31:52Z)
- Back to MLP: A Simple Baseline for Human Motion Prediction [59.18776744541904]
This paper tackles the problem of human motion prediction, which consists of forecasting future body poses from historically observed sequences.
We show that the performance of these approaches can be surpassed by a lightweight, purely MLP-based architecture with only 0.14M parameters.
An exhaustive evaluation on Human3.6M, AMASS and 3DPW datasets shows that our method, which we dub siMLPe, consistently outperforms all other approaches.
arXiv Detail & Related papers (2022-07-04T16:35:58Z)
- Poseur: Direct Human Pose Regression with Transformers [119.79232258661995]
We propose a direct, regression-based approach to 2D human pose estimation from single images.
Our framework is end-to-end differentiable, and naturally learns to exploit the dependencies between keypoints.
Ours is the first regression-based approach to perform favorably compared to the best heatmap-based pose estimation methods.
arXiv Detail & Related papers (2022-01-19T04:31:57Z)
- A Lightweight Graph Transformer Network for Human Mesh Reconstruction
from 2D Human Pose [8.816462200869445]
We present GTRS, a pose-based method that can reconstruct human mesh from 2D human pose.
We demonstrate the efficiency and generalization of GTRS by extensive evaluations on the Human3.6M and 3DPW datasets.
arXiv Detail & Related papers (2021-11-24T18:48:03Z)
- Coarse-to-fine Animal Pose and Shape Estimation [67.39635503744395]
We propose a coarse-to-fine approach to reconstruct 3D animal mesh from a single image.
The coarse estimation stage first estimates the pose, shape and translation parameters of the SMAL model.
The estimated meshes are then used as a starting point by a graph convolutional network (GCN) to predict a per-vertex deformation in the refinement stage.
arXiv Detail & Related papers (2021-11-16T01:27:20Z)
- 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment
Feedback Loop [128.07841893637337]
Regression-based methods have recently shown promising results in reconstructing human meshes from monocular images.
Minor deviations in parameters may lead to noticeable misalignment between the estimated meshes and the image evidence.
We propose a Pyramidal Mesh Alignment Feedback (PyMAF) loop to leverage a feature pyramid and rectify the predicted parameters.
arXiv Detail & Related papers (2021-03-30T17:07:49Z)
- AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in
the Wild [77.43884383743872]
We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
arXiv Detail & Related papers (2020-10-26T03:19:46Z)