Pipeline for 3D reconstruction of the human body from AR/VR headset mounted egocentric cameras
- URL: http://arxiv.org/abs/2111.05409v1
- Date: Tue, 9 Nov 2021 20:38:32 GMT
- Title: Pipeline for 3D reconstruction of the human body from AR/VR headset mounted egocentric cameras
- Authors: Shivam Grover, Kshitij Sidana and Vanita Jain
- Abstract summary: We propose a novel pipeline for the 3D reconstruction of the full body from egocentric viewpoints.
We first make use of conditional GANs to translate the egocentric views to full body third-person views.
The generated mesh has fairly realistic body proportions and is fully rigged allowing for further applications.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we propose a novel pipeline for the 3D reconstruction of the
full body from egocentric viewpoints. 3D reconstruction of the human body from
egocentric viewpoints is challenging because the view is skewed and the body
parts farther from the cameras are occluded. One such example is the view from
cameras installed below VR headsets. To achieve this task, we first make use of
conditional GANs to translate the egocentric views to full-body third-person
views. This increases the comprehensibility of the image and helps compensate for
occlusions. The generated third-person view is then passed through a 3D
reconstruction module that generates a 3D mesh of the body. We also train a
network that takes the third-person full-body view of the subject and
generates texture maps to apply to the mesh. The generated mesh has
fairly realistic body proportions and is fully rigged, allowing for further
applications such as real-time animation and pose transfer in games. This
approach can be key to a new domain of mobile human telepresence.
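The abstract describes a three-stage pipeline: a conditional GAN translates the egocentric view into a third-person view, a reconstruction module lifts that view to a 3D mesh, and a separate network predicts texture maps for the mesh. The sketch below is a minimal, hypothetical PyTorch outline of how those stages might be chained; the module names (EgoToThirdPersonGAN, MeshReconstructor, TextureMapNet), the SMPL-style vertex count, and all interfaces are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the three-stage pipeline described in the abstract.
# All module names and shapes are assumptions, not the authors' code.
import torch
import torch.nn as nn

class EgoToThirdPersonGAN(nn.Module):
    """Stage 1 (assumed pix2pix-style generator): egocentric -> third-person view."""
    def __init__(self):
        super().__init__()
        # A real generator would be an encoder-decoder (e.g. a U-Net);
        # a single conv keeps this sketch runnable.
        self.net = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, ego_img):                   # (B, 3, H, W) egocentric frame
        return torch.tanh(self.net(ego_img))      # (B, 3, H, W) third-person view

class MeshReconstructor(nn.Module):
    """Stage 2: regress mesh vertices from the translated third-person view."""
    def __init__(self, num_vertices=6890):        # 6890 = SMPL vertex count (assumption)
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, num_vertices * 3)
        self.num_vertices = num_vertices

    def forward(self, img):
        feat = self.backbone(img)
        return self.head(feat).view(-1, self.num_vertices, 3)  # (B, V, 3) vertices

class TextureMapNet(nn.Module):
    """Stage 3: predict a UV texture map from the third-person view."""
    def __init__(self, uv_size=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.AdaptiveAvgPool2d(uv_size),         # resample to UV resolution
            nn.Conv2d(3, 3, 3, padding=1),
        )

    def forward(self, img):
        return torch.sigmoid(self.net(img))        # (B, 3, uv, uv) texture map

def reconstruct(ego_img):
    """Chain the three stages: view translation -> mesh -> texture."""
    third_person = EgoToThirdPersonGAN()(ego_img)
    vertices = MeshReconstructor()(third_person)
    texture = TextureMapNet()(third_person)
    return vertices, texture

vertices, texture = reconstruct(torch.randn(1, 3, 256, 256))
print(vertices.shape, texture.shape)  # (1, 6890, 3) and (1, 3, 256, 256)
```

Each stage is collapsed to one or two layers so that only the data flow remains visible; in the paper, the first stage is a conditional GAN trained adversarially, and the reconstruction module produces a fully rigged mesh rather than a raw vertex tensor.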
Related papers
- Synthesizing Moving People with 3D Control [88.68284137105654]
We present a diffusion model-based framework for animating people from a single image for a given target 3D motion sequence.
First, we learn an in-filling diffusion model to hallucinate unseen parts of a person given a single image.
Second, we develop a diffusion-based rendering pipeline, which is controlled by 3D human poses.
arXiv Detail & Related papers (2024-01-19T18:59:11Z)
- SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion [35.73448283467723]
SiTH is a novel pipeline that integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow.
We employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images.
For the mesh reconstruction step, we leverage skinned body meshes as guidance to recover full-body texture meshes from the input and back-view images.
arXiv Detail & Related papers (2023-11-27T14:22:07Z)
- Humans in 4D: Reconstructing and Tracking Humans with Transformers [72.50856500760352]
We present an approach to reconstruct humans and track them over time.
At the core of our approach, we propose a fully "transformerized" version of a network for human mesh recovery.
This network, HMR 2.0, advances the state of the art and shows the capability to analyze unusual poses that have in the past been difficult to reconstruct from single images.
arXiv Detail & Related papers (2023-05-31T17:59:52Z)
- Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion [67.71624118802411]
We present Farm3D, a method for learning category-specific 3D reconstructors for articulated objects.
We propose a framework that uses an image generator, such as Stable Diffusion, to generate synthetic training data.
Our network can be used for analysis, including monocular reconstruction, or for synthesis, generating articulated assets for real-time applications such as video games.
arXiv Detail & Related papers (2023-04-20T17:59:34Z)
- SHERF: Generalizable Human NeRF from a Single Image [59.10589479808622]
SHERF is the first generalizable Human NeRF model for recovering animatable 3D humans from a single input image.
We propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.
arXiv Detail & Related papers (2023-03-22T17:59:12Z)
- 3DHumanGAN: 3D-Aware Human Image Generation with 3D Pose Mapping [37.14866512377012]
3DHumanGAN is a 3D-aware generative adversarial network that synthesizes photorealistic images of full-body humans.
We propose a novel generator architecture in which a 2D convolutional backbone is modulated by a 3D pose mapping network.
arXiv Detail & Related papers (2022-12-14T17:59:03Z)
- Structured 3D Features for Reconstructing Controllable Avatars [43.36074729431982]
We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface.
We show that our S3F model surpasses the previous state-of-the-art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation.
arXiv Detail & Related papers (2022-12-13T18:57:33Z)
- Inferring Implicit 3D Representations from Human Figures on Pictorial Maps [1.0499611180329804]
We present an automated workflow to bring human figures, one of the most frequently appearing entities on pictorial maps, to the third dimension.
We first let a network consisting of fully connected layers estimate the depth coordinate of 2D pose points.
The resulting 3D pose points are fed, together with 2D masks of body parts, into a deep implicit surface network to infer 3D signed distance fields.
arXiv Detail & Related papers (2022-08-30T19:29:18Z)
- EgoRenderer: Rendering Human Avatars from Egocentric Camera Images [87.96474006263692]
We present EgoRenderer, a system for rendering full-body neural avatars of a person captured by a wearable, egocentric fisheye camera.
Rendering full-body avatars from such egocentric images comes with unique challenges due to the top-down view and large distortions.
We tackle these challenges by decomposing the rendering process into several steps, including texture synthesis, pose construction, and neural image translation.
arXiv Detail & Related papers (2021-11-24T18:33:02Z)
- SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera [97.0162841635425]
We present a solution to egocentric 3D body pose estimation from monocular images captured by downward-looking fish-eye cameras installed on the rim of a head-mounted VR device.
This unusual viewpoint leads to images with unique visual appearance, with severe self-occlusions and perspective distortions.
We propose an encoder-decoder architecture with a novel multi-branch decoder designed to account for the varying uncertainty in 2D predictions.
arXiv Detail & Related papers (2020-11-02T16:18:06Z)