Learning 3D Human Shape and Pose from Dense Body Parts
- URL: http://arxiv.org/abs/1912.13344v2
- Date: Sun, 6 Dec 2020 10:46:53 GMT
- Title: Learning 3D Human Shape and Pose from Dense Body Parts
- Authors: Hongwen Zhang, Jie Cao, Guo Lu, Wanli Ouyang, Zhenan Sun
- Abstract summary: We propose a Decompose-and-aggregate Network (DaNet) to learn 3D human shape and pose from dense correspondences of body parts.
Messages from local streams are aggregated to enhance the robust prediction of the rotation-based poses.
Our method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW.
- Score: 117.46290013548533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing 3D human shape and pose from monocular images is challenging
despite the promising results achieved by the most recent learning-based
methods. The commonly observed misalignment stems from two facts: the
mapping from images to the model space is highly non-linear, and the
rotation-based pose representation of body models tends to cause drift
in joint positions. In this work, we investigate learning 3D human shape
and pose from dense correspondences of body parts and propose a
Decompose-and-aggregate Network (DaNet) to address these issues. DaNet adopts
the dense correspondence maps, which densely build a bridge between 2D pixels
and 3D vertices, as intermediate representations to facilitate the learning of
2D-to-3D mapping. The prediction modules of DaNet are decomposed into one
global stream and multiple local streams to enable global and fine-grained
perceptions for the shape and pose predictions, respectively. Messages from
local streams are further aggregated to enhance the robust prediction of the
rotation-based poses, where a position-aided rotation feature refinement
strategy is proposed to exploit spatial relationships between body joints.
Moreover, a Part-based Dropout (PartDrop) strategy is introduced to drop out
dense information from intermediate representations during training,
encouraging the network to focus on more complementary body parts as well as
neighboring position features. The efficacy of the proposed method is validated
on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and
3DPW, showing that our method could significantly improve the reconstruction
performance in comparison with previous state-of-the-art methods. Our code is
publicly available at https://hongwenzhang.github.io/dense2mesh .
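The Part-based Dropout (PartDrop) idea described above can be illustrated with a small sketch. This is an illustrative approximation, not the paper's implementation: the function name, the (H, W, 3) IUV layout, and the per-part index map are assumptions, and the actual PartDrop may operate at a different granularity.

```python
import numpy as np

def part_drop(iuv_map, part_index_map, drop_rate=0.3, rng=None):
    """Zero out the dense-correspondence values of randomly selected
    body parts, encouraging the network to rely on the remaining,
    complementary parts during training.

    iuv_map:        (H, W, 3) float array of dense IUV correspondences.
    part_index_map: (H, W) int array; 0 = background, 1..K = body part id.
    drop_rate:      fraction of the visible parts to drop each call.
    """
    rng = np.random.default_rng() if rng is None else rng
    parts = np.unique(part_index_map)
    parts = parts[parts > 0]                 # ignore background
    n_drop = int(round(drop_rate * len(parts)))
    if n_drop == 0:
        return iuv_map
    dropped = rng.choice(parts, size=n_drop, replace=False)
    mask = np.isin(part_index_map, dropped)  # pixels of the dropped parts
    out = iuv_map.copy()
    out[mask] = 0.0                          # erase those part regions
    return out
```

In contrast to pixel-level dropout, erasing whole contiguous part regions removes entire local cues at once, which is what pushes the network toward using neighboring parts rather than interpolating from adjacent pixels.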
Related papers
- Iterative Graph Filtering Network for 3D Human Pose Estimation [5.177947445379688]
Graph convolutional networks (GCNs) have proven to be an effective approach for 3D human pose estimation.
In this paper, we introduce an iterative graph filtering framework for 3D human pose estimation.
Our approach builds upon the idea of iteratively solving graph filtering with Laplacian regularization.
arXiv Detail & Related papers (2023-07-29T20:46:44Z)
- Sampling is Matter: Point-guided 3D Human Mesh Reconstruction [0.0]
This paper presents a simple yet powerful method for 3D human mesh reconstruction from a single RGB image.
Experimental results on benchmark datasets show that the proposed method efficiently improves the performance of 3D human mesh reconstruction.
arXiv Detail & Related papers (2023-04-19T08:45:26Z)
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning [70.75369367311897]
3D-aware global correspondences are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies.
An adversarial generator takes the garment warped by the 3D-aware flow, and the image of the target person as inputs, to synthesize the photo-realistic try-on result.
arXiv Detail & Related papers (2022-11-25T12:16:21Z)
- KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D Correspondences [77.56222946832237]
We present a novel framework to detect the densepose of multiple people in an image.
The proposed method, which we refer to as the Knowledge Transfer Network (KTN), tackles two main problems.
It simultaneously maintains feature resolution and suppresses background pixels, and this strategy results in substantial increase in accuracy.
arXiv Detail & Related papers (2022-06-21T03:11:37Z)
- NeuralReshaper: Single-image Human-body Retouching with Deep Neural Networks [50.40798258968408]
We present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.
Our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image.
To deal with the lack of paired training data, we introduce a novel self-supervised strategy to train our network.
arXiv Detail & Related papers (2022-03-20T09:02:13Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for outdoor LiDAR segmentation, in which cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Learning Transferable Kinematic Dictionary for 3D Human Pose and Shape Reconstruction [15.586347115568973]
We propose a kinematic dictionary, which explicitly regularizes the solution space of relative 3D rotations of human joints.
Our method achieves end-to-end 3D reconstruction without using any shape annotations during the training of neural networks.
The proposed method achieves competitive results on large-scale datasets including Human3.6M, MPI-INF-3DHP, and LSP.
arXiv Detail & Related papers (2021-04-02T09:24:29Z)
- Monocular Human Pose and Shape Reconstruction using Part Differentiable Rendering [53.16864661460889]
Recent work has succeeded with regression-based methods that estimate parametric models directly through a deep neural network supervised by 3D ground truth.
In this paper, we introduce body segmentation as critical supervision.
To improve the reconstruction with part segmentation, we propose a part-level differentiable renderer that enables part-based models to be supervised by part segmentation.
arXiv Detail & Related papers (2020-03-24T14:25:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.