D$^3$-Human: Dynamic Disentangled Digital Human from Monocular Video
- URL: http://arxiv.org/abs/2501.01589v1
- Date: Fri, 03 Jan 2025 00:58:35 GMT
- Title: D$^3$-Human: Dynamic Disentangled Digital Human from Monocular Video
- Authors: Honghu Chen, Bo Peng, Yunfan Tao, Juyong Zhang
- Abstract summary: We introduce D$^3$-Human, a method for reconstructing Dynamic Disentangled Digital Human geometry from monocular videos. We reconstruct the visible region as SDF and propose a novel human manifold signed distance field (hmSDF) to segment the visible clothing and visible body, and then merge the visible and invisible body.
- Score: 26.879355799115743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce D$^3$-Human, a method for reconstructing Dynamic Disentangled Digital Human geometry from monocular videos. Past monocular video human reconstruction primarily focuses on reconstructing undecoupled clothed human bodies or only reconstructing clothing, making it difficult to apply directly in applications such as animation production. The challenge in reconstructing decoupled clothing and body lies in the occlusion caused by clothing over the body. To this end, the details of the visible area and the plausibility of the invisible area must be ensured during the reconstruction process. Our proposed method combines explicit and implicit representations to model the decoupled clothed human body, leveraging the robustness of explicit representations and the flexibility of implicit representations. Specifically, we reconstruct the visible region as SDF and propose a novel human manifold signed distance field (hmSDF) to segment the visible clothing and visible body, and then merge the visible and invisible body. Extensive experimental results demonstrate that, compared with existing reconstruction schemes, D$^3$-Human can achieve high-quality decoupled reconstruction of the human body wearing different clothing, and can be directly applied to clothing transfer and animation.
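The abstract describes segmenting the reconstructed visible surface into clothing and body with the hmSDF and then merging the visible and completed (invisible) body. The following is a minimal sketch of that disentangling idea under assumed interfaces; the function names, the sign/threshold convention for the hmSDF, and the toy sphere SDFs are illustrative assumptions, not the authors' implementation.

```python
# Sketch: split visible-surface points into clothing vs. body using an
# hmSDF-like scalar field, then merge the visible body with a completed
# (invisible) body via a standard SDF union. Names and thresholds are
# illustrative assumptions, not the D^3-Human code.
import numpy as np

def segment_visible_surface(points, hmsdf, tau=0.0):
    """Split visible surface samples by the value of an hmSDF-like field.

    points: (N, 3) surface samples from the visible-region SDF.
    hmsdf:  callable mapping (N, 3) -> (N,), assumed larger away from the body manifold.
    """
    d = hmsdf(points)
    clothing = points[d > tau]   # far from the body manifold -> clothing
    body = points[d <= tau]      # on/near the body manifold  -> body
    return clothing, body

def sdf_union(sdf_a, sdf_b):
    """Merge two SDFs (e.g. visible body and completed invisible body)."""
    return lambda x: np.minimum(sdf_a(x), sdf_b(x))

# Toy usage with spheres standing in for the learned fields.
visible_body_sdf = lambda x: np.linalg.norm(x, axis=-1) - 1.0
invisible_body_sdf = lambda x: np.linalg.norm(x - np.array([0.0, -0.5, 0.0]), axis=-1) - 0.8
full_body_sdf = sdf_union(visible_body_sdf, invisible_body_sdf)

pts = np.random.randn(1000, 3)
toy_hmsdf = lambda x: np.abs(np.linalg.norm(x, axis=-1) - 1.0) - 0.05
cloth_pts, body_pts = segment_visible_surface(pts, toy_hmsdf)
print(full_body_sdf(pts).shape, cloth_pts.shape, body_pts.shape)
```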
Related papers
- DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image [49.69224401751216]
Most existing methods of 3D clothed human reconstruction from a single image treat the clothed human as a single object without distinguishing between cloth and human body.
We present DeClotH, which separately reconstructs 3D cloth and human body from a single image.
arXiv Detail & Related papers (2025-03-25T06:00:15Z) - WonderHuman: Hallucinating Unseen Parts in Dynamic 3D Human Reconstruction [51.22641018932625]
We present WonderHuman to reconstruct dynamic human avatars from a monocular video for high-fidelity novel view synthesis.
Our method achieves SOTA performance in producing photorealistic renderings from the given monocular video.
arXiv Detail & Related papers (2025-02-03T04:43:41Z) - DressRecon: Freeform 4D Human Reconstruction from Monocular Video [64.61230035671885]
We present a method to reconstruct time-consistent human body models from monocular videos.
We focus on extremely loose clothing or handheld object interactions.
DressRecon yields higher-fidelity 3D reconstructions than prior art.
arXiv Detail & Related papers (2024-09-30T17:59:15Z) - ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild [33.7726643918619]
ReLoo reconstructs high-quality 3D models of humans dressed in loose garments from monocular in-the-wild videos.
We first establish a layered neural human representation that decomposes clothed humans into a neural inner body and outer clothing.
A global optimization jointly optimizes the shape, appearance, and deformations of the human body and clothing via multi-layer differentiable volume rendering (a generic compositing sketch of this multi-layer rendering idea appears after this list).
arXiv Detail & Related papers (2024-09-23T17:58:39Z) - DLCA-Recon: Dynamic Loose Clothing Avatar Reconstruction from Monocular Videos [15.449755248457457]
We propose a method named DLCA-Recon to create human avatars from monocular videos.
The distance from loose clothing to the underlying body rapidly changes in every frame when the human freely moves and acts.
Our method can produce superior results for humans with loose clothing compared to the SOTA methods.
arXiv Detail & Related papers (2023-12-19T12:19:20Z) - Relightable Neural Actor with Intrinsic Decomposition and Pose Control [80.06094206522668]
We propose Relightable Neural Actor, a new video-based method for learning a pose-driven neural human model that can be relighted.
For training, our method solely requires a multi-view recording of the human under a known, but static lighting condition.
To evaluate our approach in real-world scenarios, we collect a new dataset with four identities recorded under different light conditions, indoors and outdoors.
arXiv Detail & Related papers (2023-12-18T14:30:13Z) - SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion [35.73448283467723]
SiTH is a novel pipeline that integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow.
We employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images.
We then leverage skinned body meshes as guidance to recover full-body textured meshes from the input and back-view images.
arXiv Detail & Related papers (2023-11-27T14:22:07Z) - Reconstructing 3D Human Pose from RGB-D Data with Occlusions [11.677978425905096]
We propose a new method to reconstruct the 3D human body from RGB-D images with occlusions.
To reconstruct a semantically and physically plausible human body, we propose to reduce the solution space based on scene information and prior knowledge.
We conducted experiments on the PROX dataset, and the results demonstrate that our method produces more accurate and plausible results compared with other methods.
arXiv Detail & Related papers (2023-10-02T14:16:13Z) - Capturing and Animation of Body and Clothing from Monocular Video [105.87228128022804]
We present SCARF, a hybrid model combining a mesh-based body with a neural radiance field.
Integrating the mesh into the rendering enables us to optimize SCARF directly from monocular videos.
We demonstrate that SCARF reconstructs clothing with higher visual quality than existing methods, that the clothing deforms with changing body pose and body shape, and that clothing can be successfully transferred between avatars of different subjects.
arXiv Detail & Related papers (2022-10-04T19:34:05Z) - Style and Pose Control for Image Synthesis of Humans from a Single Monocular View [78.6284090004218]
StylePoseGAN extends a non-controllable generator to accept conditioning of pose and appearance separately.
Our network can be trained in a fully supervised way with human images to disentangle pose, appearance and body parts.
StylePoseGAN achieves state-of-the-art image generation fidelity on common perceptual metrics.
arXiv Detail & Related papers (2021-02-22T18:50:47Z) - Deep Physics-aware Inference of Cloth Deformation for Monocular Human Performance Capture [84.73946704272113]
We show how integrating physics into the training process improves the learned cloth deformations and allows modeling clothing as a separate piece of geometry.
Our approach leads to a significant improvement over current state-of-the-art methods and is thus a clear step towards realistic monocular capture of the entire deforming surface of a clothed human.
arXiv Detail & Related papers (2020-11-25T16:46:00Z)
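Several of the listed papers (e.g. ReLoo) rely on multi-layer differentiable volume rendering to jointly optimize an inner body layer and an outer clothing layer. The sketch below shows only the generic compositing step under assumed inputs: samples from the two layers are merged along each ray by depth and alpha-composited. It is a standard NeRF-style compositor for illustration, not ReLoo's actual implementation.

```python
# Sketch: alpha-composite two layers of (depth, density, color) ray samples,
# e.g. an inner-body layer and an outer-clothing layer. Inputs are assumed
# per-ray sample arrays; this is a generic compositor, not any paper's code.
import numpy as np

def composite_two_layers(t_body, sigma_body, rgb_body,
                         t_cloth, sigma_cloth, rgb_cloth):
    """Merge two layers' samples on one ray by depth and alpha-composite them."""
    t = np.concatenate([t_body, t_cloth])
    sigma = np.concatenate([sigma_body, sigma_cloth])
    rgb = np.concatenate([rgb_body, rgb_cloth], axis=0)

    order = np.argsort(t)                       # sort all samples front-to-back
    t, sigma, rgb = t[order], sigma[order], rgb[order]

    delta = np.diff(t, append=t[-1] + 1e10)     # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)        # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)  # rendered pixel color

# Toy usage with random samples standing in for the two neural layers.
n = 32
color = composite_two_layers(
    np.sort(np.random.uniform(1.0, 2.0, n)), np.random.rand(n), np.random.rand(n, 3),
    np.sort(np.random.uniform(0.9, 2.1, n)), np.random.rand(n), np.random.rand(n, 3))
print(color)  # one RGB value for this ray
```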