Tailor: An Integrated Text-Driven CG-Ready Human and Garment Generation System
- URL: http://arxiv.org/abs/2503.12052v2
- Date: Tue, 18 Mar 2025 06:08:49 GMT
- Title: Tailor: An Integrated Text-Driven CG-Ready Human and Garment Generation System
- Authors: Zhiyao Sun, Yu-Hui Wen, Matthieu Lin, Ho-Jui Fang, Sheng Ye, Tian Lv, Yong-Jin Liu
- Abstract summary: Tailor is an integrated text-to-avatar system that generates high-fidelity, customizable 3D humans with simulation-ready garments. We first employ a large language model to interpret textual descriptions into parameterized body shapes. Next, we develop topology-preserving deformation with novel geometric losses to adapt garments precisely to body geometries. An enhanced texture diffusion module with a symmetric local attention mechanism ensures both view consistency and photorealistic details.
- Score: 23.39291332667773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Creating detailed 3D human avatars with garments typically requires specialized expertise and labor-intensive processes. Although recent advances in generative AI have enabled text-to-3D human/clothing generation, current methods fall short in offering accessible, integrated pipelines for producing ready-to-use clothed avatars. To solve this, we introduce Tailor, an integrated text-to-avatar system that generates high-fidelity, customizable 3D humans with simulation-ready garments. Our system includes a three-stage pipeline. We first employ a large language model to interpret textual descriptions into parameterized body shapes and semantically matched garment templates. Next, we develop topology-preserving deformation with novel geometric losses to adapt garments precisely to body geometries. Furthermore, an enhanced texture diffusion module with a symmetric local attention mechanism ensures both view consistency and photorealistic details. Quantitative and qualitative evaluations demonstrate that Tailor outperforms existing SoTA methods in terms of fidelity, usability, and diversity. Code will be available for academic use.
Related papers
- FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images [74.86864398919467]
We present a novel method for reconstructing personalized 3D human avatars with realistic animation from only a few images.
We learn a universal prior from over a thousand clothed humans to achieve instant feedforward generation and zero-shot generalization.
Our method generates more authentic reconstruction and animation than state-of-the-art methods, and can be directly generalized to inputs from casually taken phone photos.
arXiv Detail & Related papers (2025-03-24T23:20:47Z)
- StdGEN: Semantic-Decomposed 3D Character Generation from Single Images [28.302030751098354]
StdGEN is an innovative pipeline for generating semantically high-quality 3D characters from single images.
It generates intricately detailed 3D characters with separated semantic components such as the body, clothes, and hair, in three minutes.
StdGEN offers ready-to-use semantic-decomposed 3D characters and enables flexible customization for a wide range of applications.
arXiv Detail & Related papers (2024-11-08T17:54:18Z)
- ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling [96.87575334960258]
ID-to-3D is a method to generate identity- and text-guided 3D human heads with disentangled expressions.
Results achieve an unprecedented level of identity-consistent and high-quality texture and geometry generation.
arXiv Detail & Related papers (2024-05-26T13:36:45Z)
- GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details [31.92583566128599]
Traditional 3D garment creation is labor-intensive, involving sketching, modeling, UV mapping, and time-consuming processes.
We propose GarmentDreamer, a novel method that leverages 3D Gaussian Splatting (GS) as guidance to generate 3D garments from text prompts.
arXiv Detail & Related papers (2024-05-20T23:54:28Z)
- DressCode: Autoregressively Sewing and Generating Garments from Text Guidance [61.48120090970027]
DressCode aims to democratize design for novices and offer immense potential in fashion design, virtual try-on, and digital human creation.
We first introduce SewingGPT, a GPT-based architecture integrating cross-attention with text-conditioned embedding to generate sewing patterns.
We then tailor a pre-trained Stable Diffusion to generate tile-based Physically-based Rendering (PBR) textures for the garments.
arXiv Detail & Related papers (2024-01-29T16:24:21Z)
- 3D-GPT: Procedural 3D Modeling with Large Language Models [47.72968643115063]
We introduce 3D-GPT, a framework utilizing large language models (LLMs) for instruction-driven 3D modeling.
3D-GPT positions LLMs as proficient problem solvers, dissecting the procedural 3D modeling tasks into accessible segments and appointing the apt agent for each task.
Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results, but also collaborates effectively with human designers.
arXiv Detail & Related papers (2023-10-19T17:41:48Z)
- TADA! Text to Animatable Digital Avatars [57.52707683788961]
TADA takes textual descriptions and produces expressive 3D avatars with high-quality geometry and lifelike textures.
We derive an optimizable high-resolution body model from SMPL-X with 3D displacements and a texture map.
We render normals and RGB images of the generated character and exploit their latent embeddings in the SDS training process.
arXiv Detail & Related papers (2023-08-21T17:59:10Z)
- Learning Locally Editable Virtual Humans [37.95173373011365]
We propose a novel hybrid representation and end-to-end trainable network architecture to model fully editable neural avatars.
At the core of our work lies a representation that combines the modeling power of neural fields with the ease of use and inherent 3D consistency of skinned meshes.
Our method generates diverse detailed avatars and achieves better model fitting performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-04-28T23:06:17Z)
- Text-Conditional Contextualized Avatars For Zero-Shot Personalization [47.85747039373798]
We propose a pipeline that enables personalization of image generation with avatars capturing a user's identity in a delightful way.
Our pipeline is zero-shot, avatar texture and style agnostic, and does not require training on the avatar at all.
We show, for the first time, how to leverage large-scale image datasets to learn human 3D pose parameters.
arXiv Detail & Related papers (2023-04-14T22:00:44Z)
- SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation [89.47132156950194]
We present a novel framework built to simplify 3D asset generation for amateur users.
Our method supports a variety of input modalities that can be easily provided by a human.
Our model can combine all these tasks into one swiss-army-knife tool.
arXiv Detail & Related papers (2022-12-08T18:59:05Z)
- Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction [123.62341095156611]
Implicit functions represented as deep learning approximations are powerful for reconstructing 3D surfaces.
Such features are essential in building flexible models for both computer graphics and computer vision.
We present a methodology that combines detail-rich implicit functions and parametric representations.
arXiv Detail & Related papers (2020-07-22T13:46:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.