Delving Deep into Pixel Alignment Feature for Accurate Multi-view Human Mesh Recovery
- URL: http://arxiv.org/abs/2301.06020v1
- Date: Sun, 15 Jan 2023 05:31:52 GMT
- Title: Delving Deep into Pixel Alignment Feature for Accurate Multi-view Human Mesh Recovery
- Authors: Kai Jia, Hongwen Zhang, Liang An, Yebin Liu
- Abstract summary: We present Pixel-aligned Feedback Fusion (PaFF) for accurate yet efficient human mesh recovery from multi-view images.
PaFF is an iterative regression framework that performs feature extraction and fusion alternately.
The efficacy of our method is validated on the Human3.6M dataset via comprehensive ablation experiments.
- Score: 37.57922952189394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regression-based methods have shown high efficiency and effectiveness for
multi-view human mesh recovery. The key components of a typical regressor lie
in the feature extraction of input views and the fusion of multi-view features.
In this paper, we present Pixel-aligned Feedback Fusion (PaFF) for accurate yet
efficient human mesh recovery from multi-view images. PaFF is an iterative
regression framework that performs feature extraction and fusion alternately.
At each iteration, PaFF extracts pixel-aligned feedback features from each
input view according to the reprojection of the current estimation and fuses
them together with respect to each vertex of the downsampled mesh. In this way,
our regressor can not only perceive the misalignment status of each view from
the feedback features but also correct the mesh parameters more effectively
based on the feature fusion on mesh vertices. Additionally, our regressor
disentangles the global orientation and translation of the body mesh from the
estimation of mesh parameters such that the camera parameters of input views
can be better utilized in the regression process. The efficacy of our method is
validated on the Human3.6M dataset via comprehensive ablation experiments,
where PaFF achieves 33.02 mm MPJPE, improving over the previous best solutions
by more than 29%. The project page with code and video
results can be found at https://kairobo.github.io/PaFF/.
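The iterative feedback-fusion loop described in the abstract can be sketched roughly as follows. This is an illustrative NumPy sketch, not the authors' implementation: the `(K, R, t)` camera tuples, the nearest-neighbour sampling, the mean-over-views fusion, and the `regressor` callable are all simplifying assumptions (PaFF's actual fusion and regressor are learned networks, and it disentangles global orientation/translation from the other mesh parameters).

```python
import numpy as np

def project(vertices, K, R, t):
    # Project 3D vertices (V, 3) into pixel coordinates for one view
    # with intrinsics K, rotation R, and translation t.
    cam = vertices @ R.T + t          # world -> camera coordinates
    uv = cam @ K.T                    # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]     # perspective divide -> (V, 2)

def sample_features(feat_map, uv):
    # Nearest-neighbour sampling of an (H, W, C) feature map at the
    # projected pixel coordinates; returns (V, C) pixel-aligned features.
    H, W, _ = feat_map.shape
    x = np.clip(np.round(uv[:, 0]).astype(int), 0, W - 1)
    y = np.clip(np.round(uv[:, 1]).astype(int), 0, H - 1)
    return feat_map[y, x]

def paff_iteration(params, vertices, feat_maps, cams, regressor):
    # One feedback-fusion step: reproject the current mesh estimate into
    # every view, gather pixel-aligned feedback features per vertex, fuse
    # them across views (here: a simple mean), and regress a parameter
    # update from the fused per-vertex features.
    per_view = [sample_features(f, project(vertices, *cam))
                for f, cam in zip(feat_maps, cams)]
    fused = np.mean(np.stack(per_view, axis=0), axis=0)  # (V, C)
    return params + regressor(fused)
```

In the actual framework this step would be repeated for several iterations, with the mesh vertices re-derived from the updated parameters before each reprojection, so that the feedback features reflect the current misalignment in every view.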
Related papers
- VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos [24.310673998221866]
We propose VisFusion, a visibility-aware online 3D scene reconstruction approach from posed monocular videos.
We aim to improve the feature fusion by explicitly inferring its visibility from a similarity matrix.
Experimental results on benchmarks show that our method can achieve superior performance with more scene details.
arXiv Detail & Related papers (2023-04-21T00:47:05Z)
- PyMAF-X: Towards Well-aligned Full-body Model Regression from Monocular Images [60.33197938330409]
PyMAF-X is a regression-based approach to recovering parametric full-body models from monocular images.
PyMAF and PyMAF-X effectively improve the mesh-image alignment and achieve new state-of-the-art results.
arXiv Detail & Related papers (2022-07-13T17:58:33Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation, with applications such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately; the extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression [81.05772887221333]
We study the dense keypoint regression framework, which has previously been inferior to the keypoint detection and grouping framework.
We present a simple yet effective approach named disentangled keypoint regression (DEKR).
We empirically show that the proposed direct regression method outperforms keypoint detection and grouping methods.
arXiv Detail & Related papers (2021-04-06T05:54:46Z)
- 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop [128.07841893637337]
Regression-based methods have recently shown promising results in reconstructing human meshes from monocular images.
Minor deviations in parameters may lead to noticeable misalignment between the estimated meshes and image evidence.
We propose a Pyramidal Mesh Alignment Feedback (PyMAF) loop to leverage a feature pyramid and rectify the predicted parameters.
arXiv Detail & Related papers (2021-03-30T17:07:49Z)
- AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild [77.43884383743872]
We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large-scale synthetic dataset, Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
arXiv Detail & Related papers (2020-10-26T03:19:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.