TELA: Text to Layer-wise 3D Clothed Human Generation
- URL: http://arxiv.org/abs/2404.16748v1
- Date: Thu, 25 Apr 2024 17:05:38 GMT
- Title: TELA: Text to Layer-wise 3D Clothed Human Generation
- Authors: Junting Dong, Qi Fang, Zehuan Huang, Xudong Xu, Jingbo Wang, Sida Peng, Bo Dai
- Abstract summary: This paper addresses the task of 3D clothed human generation from textual descriptions.
We propose a layer-wise clothed human representation combined with a progressive optimization strategy.
Our approach achieves state-of-the-art 3D clothed human generation while also supporting cloth editing applications.
- Score: 27.93447899876341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the task of 3D clothed human generation from textual descriptions. Previous works usually encode the human body and clothes as a holistic model and generate the whole model in a single-stage optimization, which makes clothing editing difficult and forfeits fine-grained control over the generation process. To solve this, we propose a layer-wise clothed human representation combined with a progressive optimization strategy, which produces clothing-disentangled 3D human models while retaining control over the generation process. The basic idea is to progressively generate a minimal-clothed human body and then layer-wise clothes. During clothing generation, a novel stratified compositional rendering method fuses the multi-layer human models, and a new loss function helps decouple the clothing model from the human body. The proposed method achieves high-quality disentanglement, thereby providing an effective way to generate 3D garments. Extensive experiments demonstrate that our approach achieves state-of-the-art 3D clothed human generation while also supporting cloth editing applications such as virtual try-on. Project page: http://jtdong.com/tela_layer/
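The "stratified compositional rendering" mentioned in the abstract fuses multiple layer models (body, clothing) along each camera ray. The paper does not give its equations here, but the idea can be sketched as standard multi-layer volume compositing: merge each layer's ray samples, sort them by depth, and accumulate them with NeRF-style transmittance weights. The function below is a hypothetical illustration under that assumption, not the paper's actual implementation.

```python
import numpy as np

def composite_layers(depths_list, sigmas_list, colors_list):
    """Fuse per-layer ray samples (e.g. body and clothing) into one RGB value.

    Hypothetical sketch: merges all layers' samples along a single ray,
    sorts them by depth, and applies standard volume rendering. Each list
    entry holds one layer's samples: depths (N,), densities (N,), colors (N, 3).
    """
    # Merge samples from every layer and sort them front-to-back along the ray.
    depths = np.concatenate(depths_list)
    sigmas = np.concatenate(sigmas_list)
    colors = np.concatenate(colors_list, axis=0)
    order = np.argsort(depths)
    depths, sigmas, colors = depths[order], sigmas[order], colors[order]

    # Distances between consecutive samples; the last interval is treated as open.
    deltas = np.append(np.diff(depths), 1e10)
    alphas = 1.0 - np.exp(-sigmas * deltas)                 # per-sample opacity
    trans = np.cumprod(np.append(1.0, 1.0 - alphas))[:-1]   # accumulated transmittance
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)          # composited RGB
```

Because samples from all layers are interleaved before accumulation, a dense clothing sample in front of the body correctly occludes it, which is what allows the layers to be optimized as separate models yet rendered jointly.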
Related papers
- Layered 3D Human Generation via Semantic-Aware Diffusion Model [63.459666003261276]
We propose a text-driven layered 3D human generation framework based on a novel semantic-aware diffusion model.
To keep the generated clothing consistent with the target text, we propose a semantic-confidence strategy for clothing.
To match the clothing with different body shapes, we propose a SMPL-driven implicit field deformation network.
arXiv Detail & Related papers (2023-12-10T07:34:43Z)
- Deceptive-Human: Prompt-to-NeRF 3D Human Generation with 3D-Consistent Synthetic Images [67.31920821192323]
Deceptive-Human is a novel framework capitalizing on state-of-the-art controllable diffusion models (e.g., ControlNet) to generate a high-quality, controllable 3D human NeRF.
Our method is versatile and readily accommodates a text prompt and additional data such as a 3D mesh, poses, and seed images.
The resulting 3D human NeRF model empowers the synthesis of highly photorealistic views from 360-degree perspectives.
arXiv Detail & Related papers (2023-11-27T15:49:41Z)
- HumanLiff: Layer-wise 3D Human Generation with Diffusion Model [55.891036415316876]
Existing 3D human generative models mainly generate a clothed 3D human as an undetachable 3D model in a single pass.
We propose HumanLiff, the first layer-wise 3D human generative model with a unified diffusion process.
arXiv Detail & Related papers (2023-08-18T17:59:04Z)
- Text-guided 3D Human Generation from 2D Collections [69.04031635550294]
We introduce Text-guided 3D Human Generation (T3H), where a model generates a 3D human guided by a fashion description.
CCH adopts cross-modal attention to fuse compositional human rendering with the extracted fashion semantics.
We conduct evaluations on DeepFashion and SHHQ with diverse fashion attributes covering the shape, fabric, and color of upper and lower clothing.
arXiv Detail & Related papers (2023-05-23T17:50:15Z)
- PERGAMO: Personalized 3D Garments from Monocular Video [6.8338761008826445]
PERGAMO is a data-driven approach to learn a deformable model for 3D garments from monocular images.
We first introduce a novel method to reconstruct the 3D geometry of garments from a single image, and use it to build a dataset of clothing from monocular videos.
We show that our method produces garment animations that match real-world behaviour and generalizes to unseen body motions extracted from motion capture datasets.
arXiv Detail & Related papers (2022-10-26T21:15:54Z)
- gDNA: Towards Generative Detailed Neural Avatars [94.9804106939663]
We show that our model is able to generate natural human avatars wearing diverse and detailed clothing.
Our method can be used on the task of fitting human models to raw scans, outperforming the previous state-of-the-art.
arXiv Detail & Related papers (2022-01-11T18:46:38Z)
- Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images [50.34202789543989]
Deep Fashion3D is the largest collection to date of 3D garment models.
It provides rich annotations including 3D feature lines, 3D body pose, and the corresponding multi-view real images.
A novel adaptable template is proposed to enable the learning of all types of clothing in a single network.
arXiv Detail & Related papers (2020-03-28T09:20:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.