Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images
- URL: http://arxiv.org/abs/2407.09694v1
- Date: Fri, 12 Jul 2024 21:29:11 GMT
- Title: Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images
- Authors: Tianyu Luan, Zhongpai Gao, Luyuan Xie, Abhishek Sharma, Hao Ding, Benjamin Planche, Meng Zheng, Ange Lou, Terrence Chen, Junsong Yuan, Ziyan Wu
- Abstract summary: "Divide and Fuse" strategy reconstructs human body parts independently before fusing them.
Human Part Parametric Models (HPPM) independently reconstruct the mesh from a few shape and global-location parameters.
A specially designed fusion module seamlessly integrates the reconstructed parts, even when only a few are visible.
- Score: 57.479339658504685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel bottom-up approach for human body mesh reconstruction, specifically designed to address the challenges posed by partial visibility and occlusion in input images. Traditional top-down methods, relying on whole-body parametric models like SMPL, falter when only a small part of the human is visible, as they require visibility of most of the human body for accurate mesh reconstruction. To overcome this limitation, our method employs a "Divide and Fuse (D&F)" strategy, reconstructing human body parts independently before fusing them, thereby ensuring robustness against occlusions. We design Human Part Parametric Models (HPPM) that independently reconstruct the mesh from a few shape and global-location parameters, without inter-part dependency. A specially designed fusion module then seamlessly integrates the reconstructed parts, even when only a few are visible. We harness a large volume of ground-truth SMPL data to train our parametric mesh models. To facilitate the training and evaluation of our method, we have established benchmark datasets featuring images of partially visible humans with HPPM annotations. Our experiments, conducted on these benchmark datasets, demonstrate the effectiveness of our D&F method, particularly in scenarios with substantial invisibility, where traditional approaches struggle to maintain reconstruction quality.
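For illustration, here is a minimal, hypothetical Python sketch of the divide-and-fuse idea described above: each body part is decoded independently from a few shape coefficients plus a global location, and a fusion step combines whichever parts are visible. The class and function names, the linear-blendshape form of the part model, and the naive concatenation fusion are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

class PartModel:
    """Hypothetical per-part parametric model (HPPM-style): a mean mesh plus
    a small linear shape basis, decoded with no dependence on other parts."""

    def __init__(self, template, shape_basis):
        self.template = template        # (V, 3) mean vertices of this part
        self.shape_basis = shape_basis  # (K, V, 3) shape blendshape basis

    def decode(self, shape_params, global_loc):
        # shape_params: (K,) coefficients; global_loc: (3,) world translation.
        verts = self.template + np.einsum("k,kvc->vc", shape_params, self.shape_basis)
        return verts + global_loc       # place the part at its global location


def fuse_parts(part_meshes):
    """Stand-in for the paper's learned fusion module: keep only the parts
    that were detected (non-None) and stack their vertices. The real module
    additionally reconciles boundaries between adjacent parts."""
    visible = [m for m in part_meshes if m is not None]
    return np.concatenate(visible, axis=0) if visible else np.empty((0, 3))
```

Even if only a few parts (say, the head and one arm) are detected, `fuse_parts` still returns a valid partial mesh, which is the robustness-to-occlusion property the abstract emphasizes.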
Related papers
- PAFormer: Part Aware Transformer for Person Re-identification [3.8004980982852214]
We introduce Part Aware Transformer (PAFormer), a pose-estimation-based ReID model that can perform precise part-to-part comparison.
Our method outperforms existing approaches on well-known ReID benchmark datasets.
arXiv Detail & Related papers (2024-08-12T04:46:55Z) - AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation [55.179287851188036]
We introduce a novel all-in-one-stage framework, AiOS, for expressive human pose and shape recovery without an additional human detection step.
We first employ a human token to probe a human location in the image and encode global features for each instance.
Then, we introduce a joint-related token to probe the human joint in the image and encode a fine-grained local feature.
arXiv Detail & Related papers (2024-03-26T17:59:23Z) - Template-Free Single-View 3D Human Digitalization with Diffusion-Guided LRM [29.13412037370585]
We present Human-LRM, a diffusion-guided feed-forward model that predicts the implicit field of a human from a single image.
Our method is able to capture humans without any template prior (e.g., SMPL) and effectively enhance occluded parts with rich and realistic details.
arXiv Detail & Related papers (2024-01-22T18:08:22Z) - SeSDF: Self-evolved Signed Distance Field for Implicit 3D Clothed Human Reconstruction [23.89884587074109]
We address the problem of clothed human reconstruction from a single image or uncalibrated multi-view images.
We propose a flexible framework which, by leveraging the parametric SMPL-X model, can take an arbitrary number of input images to reconstruct a clothed human model under an uncalibrated setting.
arXiv Detail & Related papers (2023-04-01T16:58:19Z) - Multi-view Human Body Mesh Translator [20.471741894219228]
We present a novel Multi-view human body Mesh Translator (MMT) model for estimating the human body mesh.
MMT fuses features of different views in both encoding and decoding phases, leading to representations embedded with global information.
arXiv Detail & Related papers (2022-10-04T20:10:59Z) - UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation [53.2018423391591]
We propose a part-based method for clothed human reconstruction and animation with raw scans and skeletons as the input.
Our method learns to separate parts from body motions instead of relying on part supervision, and can thus be extended to clothed humans and other articulated objects.
arXiv Detail & Related papers (2022-07-20T11:41:29Z) - 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data [77.57798334776353]
We consider the problem of obtaining dense 3D reconstructions of humans from single and partially occluded views.
We suggest that ambiguities can be modelled more effectively by parametrizing the possible body shapes and poses.
We show that our method outperforms alternative approaches in ambiguous pose recovery on standard benchmarks for 3D humans.
arXiv Detail & Related papers (2020-11-02T13:55:31Z) - Monocular Human Pose and Shape Reconstruction using Part Differentiable Rendering [53.16864661460889]
Recent regression-based methods succeed in estimating parametric models directly through a deep neural network supervised by 3D ground truth.
In this paper, we introduce body segmentation as critical supervision.
To improve the reconstruction with part segmentation, we propose a part-level differentiable renderer that enables part-based models to be supervised by part segmentation.
arXiv Detail & Related papers (2020-03-24T14:25:46Z) - HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose and Shape Estimation [60.35776484235304]
This work attempts to address the uncertainty of lifting the detected 2D joints to the 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets).
The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part.
A Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression (a simplified sketch of the triplet encoding appears after this list).
arXiv Detail & Related papers (2020-03-10T04:03:45Z)
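As a rough illustration of the HEMlets encoding mentioned in the last entry, the sketch below renders, for one skeletal part, a three-channel heatmap triplet whose active channel indicates whether the child end-joint lies behind, at a similar depth to, or in front of the parent joint. The Gaussian rendering, depth tolerance, and channel ordering are assumptions; the paper's exact formulation differs in detail.

```python
import numpy as np

def gaussian_map(h, w, center, sigma=2.0):
    """Render a 2D Gaussian bump centered at (u, v) pixel coordinates."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - center[0]) ** 2 + (ys - center[1]) ** 2) / (2 * sigma ** 2))

def hemlet_triplet(uv_child, z_parent, z_child, h=64, w=64, tol=0.05):
    """Three-channel heatmap for one skeletal part (parent -> child joint pair).
    Channel 0: child behind parent; 1: similar depth; 2: child in front.
    tol is an assumed depth tolerance for the 'similar depth' case."""
    triplet = np.zeros((3, h, w))
    dz = z_child - z_parent
    if abs(dz) <= tol:
        ch = 1          # end-joints at roughly the same depth
    elif dz > 0:
        ch = 0          # child is farther from the camera
    else:
        ch = 2          # child is closer to the camera
    triplet[ch] = gaussian_map(h, w, uv_child)
    return triplet
```

Supervising which of the three channels fires at the child joint's 2D location gives the network an ordinal depth cue per body part, which is easier to learn than regressing absolute 3D joint depth directly.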
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.