Multi-Garment Customized Model Generation
- URL: http://arxiv.org/abs/2408.05206v1
- Date: Fri, 9 Aug 2024 17:57:33 GMT
- Title: Multi-Garment Customized Model Generation
- Authors: Yichen Liu, Penghui Du, Yi Liu, Quanwei Zhang
- Abstract summary: Multi-Garment Customized Model Generation is a unified framework based on Latent Diffusion Models (LDMs).
Our framework supports the conditional generation of multiple garments through decoupled multi-garment feature fusion.
The proposed garment encoder is a plug-and-play module that can be combined with other extension modules.
- Score: 3.1679243514285194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces Multi-Garment Customized Model Generation, a unified framework based on Latent Diffusion Models (LDMs) aimed at addressing the unexplored task of synthesizing images with free combinations of multiple pieces of clothing. The method focuses on generating customized models wearing various targeted outfits according to different text prompts. The primary challenge lies in maintaining the natural appearance of the dressed model while preserving the complex textures of each piece of clothing, ensuring that the information from different garments does not interfere with each other. To tackle these challenges, we first develop a garment encoder, a trainable UNet copy with shared weights, capable of extracting detailed features of garments in parallel. Second, our framework supports the conditional generation of multiple garments through decoupled multi-garment feature fusion, allowing multiple clothing features to be injected into the backbone network and significantly alleviating conflicts between garment information. Additionally, the proposed garment encoder is a plug-and-play module that can be combined with other extension modules such as IP-Adapter and ControlNet, enhancing the diversity and controllability of the generated models. Extensive experiments demonstrate the superiority of our approach over existing alternatives, opening up new avenues for the task of generating images with multi-piece clothing combinations.
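A minimal PyTorch sketch of the described design, assuming illustrative module names and a toy stand-in for the LDM UNet (the abstract does not include code): a garment encoder initialized as a copy of the backbone encoder, and a decoupled fusion step that attends to each garment's tokens separately before averaging, so garments do not overwrite one another.

```python
import copy
import torch
import torch.nn as nn

# Toy stand-in for the denoising UNet encoder; the real model would be an
# LDM UNet. Everything below is an illustrative assumption, not the paper's code.
backbone_encoder = nn.Sequential(
    nn.Conv2d(4, 32, 3, padding=1), nn.SiLU(),
    nn.Conv2d(32, 32, 3, padding=1),
)

class GarmentEncoder(nn.Module):
    """Trainable copy of the backbone encoder (the 'UNet copy with shared
    weights') that extracts per-garment feature tokens in parallel."""
    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.encoder = copy.deepcopy(backbone)  # initialized from backbone weights

    def forward(self, garment_latents: torch.Tensor) -> torch.Tensor:
        # garment_latents: (num_garments, 4, H, W)
        feats = self.encoder(garment_latents)            # (num_garments, 32, H, W)
        n, c, h, w = feats.shape
        return feats.view(n, c, h * w).permute(0, 2, 1)  # (num_garments, HW, 32)

def decoupled_fusion(hidden: torch.Tensor, garments: torch.Tensor,
                     attn: nn.MultiheadAttention) -> torch.Tensor:
    """Attend to each garment's tokens separately and average the results,
    so features from different garments do not interfere (assumed form)."""
    outs = []
    for g in garments:                                    # g: (HW, C)
        kv = g.unsqueeze(0).expand(hidden.size(0), -1, -1)
        outs.append(attn(hidden, kv, kv)[0])
    return hidden + torch.stack(outs).mean(dim=0)

# Usage: two garments injected into a (batch=1, tokens=64, dim=32) backbone state.
enc = GarmentEncoder(backbone_encoder)
tokens = enc(torch.randn(2, 4, 16, 16))                   # (2, 256, 32)
attn = nn.MultiheadAttention(32, 4, batch_first=True)
fused = decoupled_fusion(torch.randn(1, 64, 32), tokens, attn)
```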
Related papers
- TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation [67.97044071594257]
TweedieMix is a novel method for composing customized diffusion models.
Our framework can be effortlessly extended to image-to-video diffusion models.
arXiv Detail & Related papers (2024-10-08T01:06:01Z)
- IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 addresses the virtual dressing task: generating freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
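A hedged sketch of such a hybrid attention block, assuming standard PyTorch attention and illustrative shapes (not IMAGDressing-v1's actual code): a frozen self-attention over the denoiser's own tokens plus a trainable cross-attention that reads garment tokens.

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Frozen self-attention plus trainable cross-attention over garment
    tokens. Module names and shapes are assumptions for illustration."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        for p in self.self_attn.parameters():
            p.requires_grad_(False)                       # frozen branch
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor, garment_tokens: torch.Tensor) -> torch.Tensor:
        # x: (B, L, C) denoiser hidden states; garment_tokens: (B, Lg, C)
        x = x + self.self_attn(x, x, x)[0]
        x = x + self.cross_attn(x, garment_tokens, garment_tokens)[0]
        return x

block = HybridAttention(dim=320)
out = block(torch.randn(2, 64, 320), torch.randn(2, 96, 320))  # -> (2, 64, 320)
```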
arXiv Detail & Related papers (2024-07-17T16:26:30Z)
- AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario [50.62711489896909]
AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large margin.
AnyFit's strong performance on high-fidelity virtual try-on, in any scenario and from any image, paves a new path for future research in the fashion community.
arXiv Detail & Related papers (2024-05-28T13:33:08Z)
- MMTryon: Multi-Modal Multi-Reference Control for High-Quality Fashion Generation [70.83668869857665]
MMTryon is a multi-modal, multi-reference virtual try-on framework.
It can generate high-quality compositional try-on results by taking a text instruction and multiple garment images as inputs.
arXiv Detail & Related papers (2024-05-01T11:04:22Z)
- Magic Clothing: Controllable Garment-Driven Image Synthesis [7.46772222515689]
We propose Magic Clothing, a latent diffusion model (LDM)-based network architecture for an unexplored garment-driven image synthesis task.
Since the goal is to generate customized characters wearing the target garments under diverse text prompts, image controllability is the most critical issue.
We introduce a garment extractor to capture the detailed garment features, and employ self-attention fusion to incorporate them into the pretrained LDMs.
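One plausible reading of this self-attention fusion, sketched with standard PyTorch attention; the concatenate-into-keys/values form is an assumption for illustration, not Magic Clothing's published implementation.

```python
import torch
import torch.nn as nn

def self_attention_fusion(x: torch.Tensor, garment_feats: torch.Tensor,
                          attn: nn.MultiheadAttention) -> torch.Tensor:
    """Assumed form of self-attention fusion: garment features are appended
    to the denoiser's own tokens as extra keys/values, so existing
    self-attention layers can absorb garment detail."""
    kv = torch.cat([x, garment_feats], dim=1)   # (B, L + Lg, C)
    out, _ = attn(x, kv, kv)
    return x + out

attn = nn.MultiheadAttention(embed_dim=320, num_heads=8, batch_first=True)
x = torch.randn(2, 64, 320)   # denoiser hidden states
g = torch.randn(2, 64, 320)   # garment-extractor features
y = self_attention_fusion(x, g, attn)           # -> (2, 64, 320)
```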
arXiv Detail & Related papers (2024-04-15T07:15:39Z)
- High-Quality Animatable Dynamic Garment Reconstruction from Monocular Videos [51.8323369577494]
We propose the first method to recover high-quality animatable dynamic garments from monocular videos without depending on scanned data.
To generate reasonable deformations for various unseen poses, we propose a learnable garment deformation network.
We show that our method can reconstruct high-quality dynamic garments with coherent surface details, which can be easily animated under unseen poses.
arXiv Detail & Related papers (2023-11-02T13:16:27Z)
- Transformer-based Graph Neural Networks for Outfit Generation [22.86041284499166]
We propose TGNN, a transformer-based architecture that exploits multi-headed self-attention to capture relations between clothing items in a graph as a message-passing step in convolutional graph neural networks.
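A minimal sketch of attention-as-message-passing over an outfit graph, with a boolean adjacency mask restricting attention to neighbours; all names here are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class AttentionMessagePassing(nn.Module):
    """One message-passing step: multi-headed self-attention over
    clothing-item nodes, masked by the graph's adjacency."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, nodes: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # nodes: (B, N, C); adj: (B, N, N), True where an edge (or self-loop) exists
        mask = (~adj).repeat_interleave(self.attn.num_heads, dim=0)  # True = block
        msgs, _ = self.attn(nodes, nodes, nodes, attn_mask=mask)
        return self.norm(nodes + msgs)

mpn = AttentionMessagePassing(dim=64)
nodes = torch.randn(1, 5, 64)                       # 5 clothing items
adj = torch.ones(1, 5, 5, dtype=torch.bool)         # fully connected outfit graph
out = mpn(nodes, adj)                               # -> (1, 5, 64)
```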
arXiv Detail & Related papers (2023-04-17T09:18:45Z)
- ClothCombo: Modeling Inter-Cloth Interaction for Draping Multi-Layered Clothes [3.8079353598215757]
We present ClothCombo, a pipeline to drape arbitrary combinations of clothes on 3D human models.
Our method utilizes a GNN-based network to efficiently model the interaction between clothes in different layers.
arXiv Detail & Related papers (2023-04-07T06:23:54Z)
- Toward Accurate and Realistic Outfits Visualization with Attention to Details [10.655149697873716]
We propose Outfit Visualization Net (OVNet) to capture important visual details necessary for commercial applications.
OVNet consists of 1) a semantic layout generator and 2) an image generation pipeline using multiple coordinated warps.
An interactive interface powered by this method has been deployed on fashion e-commerce websites and received overwhelmingly positive feedback.
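A hedged sketch of what one such coordinated warp might look like, using a dense flow field and grid_sample; the shapes and offset parameterization are assumptions for illustration, not OVNet's actual pipeline.

```python
import torch
import torch.nn.functional as F

def apply_warp(garment: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Resample garment pixels with a dense offset field, the usual
    flow-based try-on warp (assumed form)."""
    b, _, h, w = garment.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    return F.grid_sample(garment, base + flow, align_corners=True)

garment = torch.randn(1, 3, 256, 192)           # garment image
flow = 0.05 * torch.randn(1, 256, 192, 2)       # predicted per-pixel offsets
warped = apply_warp(garment, flow)              # -> (1, 3, 256, 192)
```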
arXiv Detail & Related papers (2021-06-11T19:53:34Z)
- SMPLicit: Topology-aware Generative Model for Clothed People [65.84665248796615]
We introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry.
In the experimental section, we demonstrate SMPLicit can be readily used for fitting 3D scans and for 3D reconstruction in images of dressed people.
arXiv Detail & Related papers (2021-03-11T18:57:03Z)