TeCH: Text-guided Reconstruction of Lifelike Clothed Humans
- URL: http://arxiv.org/abs/2308.08545v2
- Date: Sat, 19 Aug 2023 20:08:54 GMT
- Title: TeCH: Text-guided Reconstruction of Lifelike Clothed Humans
- Authors: Yangyi Huang, Hongwei Yi, Yuliang Xiu, Tingting Liao, Jiaxiang Tang,
Deng Cai, Justus Thies
- Abstract summary: Existing methods often generate overly smooth back-side surfaces with a blurry texture.
Motivated by the power of foundation models, TeCH reconstructs the 3D human by leveraging descriptive text prompts.
We propose a hybrid 3D representation based on DMTet, which consists of an explicit body shape grid and an implicit distance field.
- Score: 35.68114652041377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite recent research advancements in reconstructing clothed humans from a
single image, accurately restoring the "unseen regions" with high-level details
remains an unsolved challenge that lacks attention. Existing methods often
generate overly smooth back-side surfaces with a blurry texture. But how can we
effectively capture all the visual attributes of an individual from a single image
that are sufficient to reconstruct unseen areas (e.g., the back view)?
Motivated by the power of foundation models, TeCH reconstructs the 3D human by
leveraging 1) descriptive text prompts (e.g., garments, colors, hairstyles)
which are automatically generated via a garment parsing model and Visual
Question Answering (VQA), 2) a personalized fine-tuned Text-to-Image diffusion
model (T2I) which learns the "indescribable" appearance. To represent
high-resolution 3D clothed humans at an affordable cost, we propose a hybrid 3D
representation based on DMTet, which consists of an explicit body shape grid
and an implicit distance field. Guided by the descriptive prompts +
personalized T2I diffusion model, the geometry and texture of the 3D humans are
optimized through multi-view Score Distillation Sampling (SDS) and
reconstruction losses based on the original observation. TeCH produces
high-fidelity 3D clothed humans with consistent & delicate texture, and
detailed full-body geometry. Quantitative and qualitative experiments
demonstrate that TeCH outperforms the state-of-the-art methods in terms of
reconstruction accuracy and rendering quality. The code will be publicly
available for research purposes at https://huangyangyi.github.io/TeCH
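The abstract outlines an optimization in which the DMTet-based hybrid representation is updated with multi-view Score Distillation Sampling (SDS) against the personalized text-to-image diffusion model, plus reconstruction losses on the original observation. Below is a minimal PyTorch-style sketch of what one such optimization step could look like; the diffusion wrapper, renderer interface, camera sampler, and loss weights are illustrative assumptions for exposition, not the authors' released implementation.

    import torch
    import torch.nn.functional as F

    def sds_loss(diffusion, rendered_rgb, prompt_embeds, guidance_scale=100.0):
        # Score Distillation Sampling sketch: nudge rendered views toward the
        # distribution of the (personalized) text-to-image diffusion model.
        # "diffusion" is a hypothetical wrapper; its methods are assumptions.
        latents = diffusion.encode_images(rendered_rgb)            # assumed VAE encoding
        t = torch.randint(20, 980, (latents.shape[0],), device=latents.device)
        noise = torch.randn_like(latents)
        noisy = diffusion.add_noise(latents, noise, t)             # assumed forward noising
        with torch.no_grad():
            eps_pred = diffusion.predict_noise(noisy, t, prompt_embeds, guidance_scale)
        grad = (eps_pred - noise).detach()                         # SDS gradient direction
        # Multiplying by the detached gradient makes d(loss)/d(latents) == grad.
        return (latents * grad).sum() / latents.shape[0]

    def optimization_step(model, renderer, diffusion, prompt_embeds,
                          input_rgb, input_mask, input_cam, optimizer,
                          lambda_rec=1.0, lambda_sds=1e-4):
        # One hypothetical step: reconstruction loss on the observed view
        # plus SDS on randomly sampled novel views.
        optimizer.zero_grad()
        pred = renderer(model, input_cam)                          # assumed differentiable renderer
        loss_rec = F.l1_loss(pred["rgb"] * input_mask, input_rgb * input_mask) \
            + F.mse_loss(pred["mask"], input_mask)
        novel_cams = renderer.sample_cameras(n=4)                  # assumed camera sampler
        novel = renderer(model, novel_cams)
        loss_sds = sds_loss(diffusion, novel["rgb"], prompt_embeds)
        loss = lambda_rec * loss_rec + lambda_sds * loss_sds
        loss.backward()
        optimizer.step()
        return float(loss.detach())

In the paper, the text prompts that condition the diffusion model are generated automatically via garment parsing and VQA, and the model is fine-tuned on the subject beforehand; both steps are omitted from this sketch for brevity.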
Related papers
- FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis [51.193297565630886]
The challenge of accurately inferring texture remains, particularly in obscured areas such as the back of a person in frontal-view images.
This limitation in texture prediction largely stems from the scarcity of large-scale and diverse 3D datasets.
We propose leveraging extensive 2D fashion datasets to enhance both texture and shape prediction in 3D human digitization.
arXiv Detail & Related papers (2024-10-13T01:25:05Z)
- ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis [49.28239918969784]
We introduce a texture-consistent back view synthesis module that could transfer the reference image content to the back view.
We also propose a visibility-aware patch consistency regularization for texture mapping and refinement combined with the synthesized back view texture.
arXiv Detail & Related papers (2023-11-28T13:55:53Z)
- SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion [35.73448283467723]
SiTH is a novel pipeline that integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow.
We employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images.
For the latter, we leverage skinned body meshes as guidance to recover full-body texture meshes from the input and back-view images.
arXiv Detail & Related papers (2023-11-27T14:22:07Z)
- PaintHuman: Towards High-fidelity Text-to-3D Human Texturing via Denoised Score Distillation [89.09455618184239]
Recent advances in text-to-3D human generation have been groundbreaking.
We propose a model called PaintHuman to address the challenges from two aspects.
We use the depth map as a guidance to ensure realistic semantically aligned textures.
arXiv Detail & Related papers (2023-10-14T00:37:16Z)
- Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves upon photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z)
- Detailed Avatar Recovery from Single Image [50.82102098057822]
This paper presents a novel framework to recover a detailed avatar from a single image.
We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation framework.
Our method can restore detailed human body shapes with complete textures beyond skinned models.
arXiv Detail & Related papers (2021-08-06T03:51:26Z)
- OSTeC: One-Shot Texture Completion [86.23018402732748]
We propose an unsupervised approach for one-shot 3D facial texture completion.
The proposed approach rotates an input image in 3D and fills in the unseen regions by reconstructing the rotated image with a 2D face generator.
We frontalize the target image by projecting the completed texture into the generator.
arXiv Detail & Related papers (2020-12-30T23:53:26Z)
- Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People [36.30755368202957]
We present a novel method to improve the accuracy of the 3D reconstruction of clothed human shape from a single image.
The accuracy and completeness for reconstruction of clothed people is limited due to the large variation in shape resulting from clothing, hair, body size, pose and camera viewpoint.
arXiv Detail & Related papers (2020-09-29T17:18:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.