Reconstructing 3D Human Pose from RGB-D Data with Occlusions
- URL: http://arxiv.org/abs/2310.01228v2
- Date: Sun, 15 Oct 2023 14:48:58 GMT
- Title: Reconstructing 3D Human Pose from RGB-D Data with Occlusions
- Authors: Bowen Dang, Xi Zhao, Bowen Zhang, He Wang
- Abstract summary: We propose a new method to reconstruct the 3D human body from RGB-D images with occlusions.
To reconstruct a semantically and physically plausible human body, we propose to reduce the solution space based on scene information and prior knowledge.
We conducted experiments on the PROX dataset, and the results demonstrate that our method produces more accurate and plausible results compared with other methods.
- Score: 11.677978425905096
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new method to reconstruct the 3D human body from RGB-D images
with occlusions. The foremost challenge is the incompleteness of the RGB-D data
due to occlusions between the body and the environment, leading to implausible
reconstructions that suffer from severe human-scene penetration. To reconstruct
a semantically and physically plausible human body, we propose to reduce the
solution space based on scene information and prior knowledge. Our key idea is
to constrain the solution space of the human body by considering the occluded
body parts and visible body parts separately: modeling all plausible poses
where the occluded body parts do not penetrate the scene, and constraining the
visible body parts using depth data. Specifically, the first component is
realized by a neural network that estimates the candidate region named the
"free zone", a region carved out of the open space within which it is safe to
search for poses of the invisible body parts without concern for penetration.
The second component constrains the visible body parts using the "truncated
shadow volume" of the scanned body point cloud. Furthermore, we propose to use
a volume matching strategy, which yields better performance than surface
matching, to match the human body with the confined region. We conducted
experiments on the PROX dataset, and the results demonstrate that our method
produces more accurate and plausible results compared with other methods.
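The two constraints in the abstract can be sketched as scoring terms over a voxelized scene. This is a minimal illustration, not the authors' implementation: the grid setup, function names, and the IoU-style volume-matching term are all our own assumptions about how a penetration penalty and a volume (rather than surface) match could be computed.

```python
import numpy as np

def penetration_loss(points, scene_sdf, origin, voxel_size):
    """Sum of penetration depths for body points inside scene geometry.

    scene_sdf: 3D array of signed distances to scene surfaces
    (negative = inside an obstacle). Points outside the grid are
    clamped to the nearest cell for simplicity.
    """
    idx = np.floor((points - origin) / voxel_size).astype(int)
    idx = np.clip(idx, 0, np.array(scene_sdf.shape) - 1)
    d = scene_sdf[idx[:, 0], idx[:, 1], idx[:, 2]]
    return float(np.sum(np.maximum(-d, 0.0)))  # only penalize d < 0

def volume_match(body_occ, region_occ):
    """Volume-matching score: intersection-over-union between the body's
    voxel occupancy and the confined region (e.g. free zone plus truncated
    shadow volume), instead of matching surfaces point-to-point."""
    inter = np.logical_and(body_occ, region_occ).sum()
    union = np.logical_or(body_occ, region_occ).sum()
    return inter / union if union else 0.0
```

In this reading, a pose candidate would be rejected or penalized when `penetration_loss` is positive for its occluded parts, and preferred when `volume_match` against the confined region is high; the actual paper optimizes body-model parameters under these kinds of constraints.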
Related papers
- Kinematics-based 3D Human-Object Interaction Reconstruction from Single View [10.684643503514849]
Existing methods predict body poses by relying merely on networks trained on some indoor datasets.
We propose a kinematics-based method that can drive the joints of human body to the human-object contact regions accurately.
arXiv Detail & Related papers (2024-07-19T05:44:35Z)
- Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images [57.479339658504685]
The "Divide and Fuse" strategy reconstructs human body parts independently before fusing them.
Human Part Parametric Models (HPPM) independently reconstruct the mesh from a few shape and global-location parameters.
A specially designed fusion module seamlessly integrates the reconstructed parts, even when only a few are visible.
arXiv Detail & Related papers (2024-07-12T21:29:11Z) - AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation [55.179287851188036]
We introduce a novel all-in-one-stage framework, AiOS, for expressive human pose and shape recovery without an additional human detection step.
We first employ a human token to probe a human location in the image and encode global features for each instance.
Then, we introduce a joint-related token to probe the human joint in the image and encode a fine-grained local feature.
arXiv Detail & Related papers (2024-03-26T17:59:23Z)
- Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction [66.10717041384625]
Zolly is the first 3DHMR method focusing on perspective-distorted images.
We propose a new camera model and a novel 2D representation, termed distortion image, which describes the 2D dense distortion scale of the human body.
We extend two real-world datasets tailored for this task, both containing perspective-distorted human images.
arXiv Detail & Related papers (2023-03-24T04:22:41Z)
- BoPR: Body-aware Part Regressor for Human Shape and Pose Estimation [16.38936587088618]
Our proposed method BoPR, the Body-aware Part Regressor, first extracts features of both the body and part regions using an attention-guided mechanism.
We then utilize these features to encode extra part-body dependency for per-part regression, with part features as queries and body feature as a reference.
arXiv Detail & Related papers (2023-03-21T08:36:59Z)
- Adjustable Method Based on Body Parts for Improving the Accuracy of 3D Reconstruction in Visually Important Body Parts from Silhouettes [4.378411442784295]
This research proposes a novel adjustable algorithm for reconstructing 3D body shapes from front and side silhouettes.
We first recognize the correspondent body parts using body segmentation in both views.
Then, we align individual body parts by 2D rigid registration and match them using pairwise matching.
arXiv Detail & Related papers (2022-11-27T13:25:02Z)
- UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation [53.2018423391591]
We propose a part-based method for clothed human reconstruction and animation with raw scans and skeletons as the input.
Our method learns to separate parts from body motions instead of part supervision, thus can be extended to clothed humans and other articulated objects.
arXiv Detail & Related papers (2022-07-20T11:41:29Z)
- Total Scale: Face-to-Body Detail Reconstruction from Sparse RGBD Sensors [52.38220261632204]
Flat facial surfaces frequently occur in the PIFu-based reconstruction results.
We propose a two-scale PIFu representation to enhance the quality of the reconstructed facial details.
Experiments demonstrate the effectiveness of our approach in recovering vivid facial details and deforming body shapes.
arXiv Detail & Related papers (2021-12-03T18:46:49Z)
- Multimodal In-bed Pose and Shape Estimation under the Blankets [77.12439296395733]
We propose a pyramid scheme to fuse different modalities in a way that best leverages the knowledge captured by the multimodal sensors.
We employ an attention-based reconstruction module to generate uncovered modalities, which are further fused to update the current estimation.
arXiv Detail & Related papers (2020-12-12T05:35:23Z)
- Reposing Humans by Warping 3D Features [18.688568898013482]
We propose to implicitly learn a dense feature volume from human images.
The volume is mapped back to RGB space by a convolutional decoder.
Our state-of-the-art results on the DeepFashion and the iPER benchmarks indicate that dense volumetric human representations are worth investigating.
arXiv Detail & Related papers (2020-06-08T19:31:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.