Multi-Garment Customized Model Generation
- URL: http://arxiv.org/abs/2408.05206v1
- Date: Fri, 9 Aug 2024 17:57:33 GMT
- Title: Multi-Garment Customized Model Generation
- Authors: Yichen Liu, Penghui Du, Yi Liu, Quanwei Zhang
- Abstract summary: Multi-Garment Customized Model Generation is a unified framework based on Latent Diffusion Models (LDMs).
Our framework supports the conditional generation of multiple garments through decoupled multi-garment feature fusion.
The proposed garment encoder is a plug-and-play module that can be combined with other extension modules.
- Score: 3.1679243514285194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces Multi-Garment Customized Model Generation, a unified framework based on Latent Diffusion Models (LDMs) aimed at addressing the unexplored task of synthesizing images with free combinations of multiple pieces of clothing. The method focuses on generating customized models wearing various targeted outfits according to different text prompts. The primary challenge lies in maintaining the natural appearance of the dressed model while preserving the complex textures of each piece of clothing, ensuring that the information from different garments does not interfere with each other. To tackle these challenges, we first developed a garment encoder, which is a trainable UNet copy with shared weights, capable of extracting detailed features of garments in parallel. Secondly, our framework supports the conditional generation of multiple garments through decoupled multi-garment feature fusion, allowing multiple clothing features to be injected into the backbone network, significantly alleviating conflicts between garment information. Additionally, the proposed garment encoder is a plug-and-play module that can be combined with other extension modules such as IP-Adapter and ControlNet, enhancing the diversity and controllability of the generated models. Extensive experiments demonstrate the superiority of our approach over existing alternatives, opening up new avenues for the task of generating images with multiple-piece clothing combinations.
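The abstract does not come with code, but the core mechanism lends itself to a compact illustration. The sketch below, in plain PyTorch, shows one way a shared-weight garment encoder could extract per-garment feature tokens in parallel and inject them into backbone tokens through decoupled cross-attention passes whose outputs are summed; every class, dimension, and parameter name here is a hypothetical stand-in, not the authors' implementation.

```python
# Hypothetical sketch of decoupled multi-garment feature fusion (not the authors' code).
# A shared-weight "garment encoder" extracts per-garment feature maps in parallel, and
# each garment gets its own cross-attention pass over the backbone tokens; the passes
# are summed so the garments do not compete inside a single attention call.
import torch
import torch.nn as nn


class GarmentEncoder(nn.Module):
    """Stand-in for the trainable UNet copy; here just a small conv stack."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.SiLU(),
        )

    def forward(self, garment: torch.Tensor) -> torch.Tensor:
        feat = self.net(garment)                  # (B, C, H/4, W/4)
        return feat.flatten(2).transpose(1, 2)    # (B, N_tokens, C)


class DecoupledGarmentFusion(nn.Module):
    """Injects K garment feature sequences into backbone tokens via K decoupled
    cross-attention passes that share one attention module."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.scale = nn.Parameter(torch.tensor(1.0))   # learnable injection strength

    def forward(self, backbone_tokens, garment_feats):
        fused = backbone_tokens
        for feat in garment_feats:                     # one pass per garment
            out, _ = self.attn(backbone_tokens, feat, feat)
            fused = fused + self.scale * out           # summed, not concatenated
        return fused


if __name__ == "__main__":
    encoder = GarmentEncoder()
    fusion = DecoupledGarmentFusion()
    garments = [torch.randn(1, 3, 64, 64) for _ in range(3)]   # e.g. top, bottom, coat
    garment_feats = [encoder(g) for g in garments]             # shared encoder weights
    backbone_tokens = torch.randn(1, 256, 64)                  # stand-in for UNet tokens
    print(fusion(backbone_tokens, garment_feats).shape)        # torch.Size([1, 256, 64])
```

Keeping one attention pass per garment, rather than concatenating all garment tokens into a single key-value set, is what limits cross-garment interference in this reading of the abstract.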
Related papers
- Learning to Synthesize Compatible Fashion Items Using Semantic Alignment and Collocation Classification: An Outfit Generation Framework [59.09707044733695]
We propose a novel outfit generation framework, i.e., OutfitGAN, with the aim of synthesizing an entire outfit.
OutfitGAN includes a semantic alignment module, which is responsible for characterizing the mapping correspondence between the existing fashion items and the synthesized ones.
In order to evaluate the performance of our proposed models, we built a large-scale dataset consisting of 20,000 fashion outfits.
arXiv Detail & Related papers (2025-02-05T12:13:53Z)
- BC-GAN: A Generative Adversarial Network for Synthesizing a Batch of Collocated Clothing [17.91576511810969]
Collocated clothing synthesis using generative networks has significant potential economic value to increase revenue in the fashion industry.
We introduce a novel batch clothing generation framework, named BC-GAN, which is able to synthesize multiple visually-collocated clothing images simultaneously.
Our model was evaluated on a large-scale dataset of compatible outfits that we constructed ourselves.
arXiv Detail & Related papers (2025-02-03T05:41:41Z)
- FCBoost-Net: A Generative Network for Synthesizing Multiple Collocated Outfits via Fashion Compatibility Boosting [37.32190866187711]
We present FCBoost-Net, a new framework for outfit generation that leverages the power of pre-trained generative models.
FCBoost-Net randomly synthesizes multiple sets of fashion items, and the compatibility of the synthesized sets is then improved in several rounds using a novel fashion compatibility booster.
Empirical evidence indicates that the proposed strategy can improve the fashion compatibility of randomly synthesized fashion items as well as maintain their diversity.
arXiv Detail & Related papers (2025-02-03T02:18:09Z)
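The round-based boosting idea in the FCBoost-Net entry above can be caricatured in a few lines. The snippet below is a loose, hypothetical sketch that assumes outfits live as latent codes and that a differentiable compatibility scorer is available; the scorer, dimensions, and gradient-ascent update are illustrative stand-ins, not the paper's actual booster.

```python
# Hypothetical sketch of round-based compatibility boosting (not FCBoost-Net's code):
# randomly sampled candidate outfits are repeatedly nudged toward higher scores from a
# compatibility model while staying close to their initial (diverse) samples.
import torch
import torch.nn as nn

compat_scorer = nn.Sequential(          # stand-in for a pretrained compatibility model
    nn.Flatten(), nn.Linear(4 * 32, 64), nn.ReLU(), nn.Linear(64, 1)
)

def boost(outfits: torch.Tensor, rounds: int = 5, step: float = 0.1) -> torch.Tensor:
    """outfits: (num_sets, items_per_outfit, latent_dim) latent codes."""
    anchors = outfits.clone()
    for _ in range(rounds):
        outfits = outfits.detach().requires_grad_(True)
        score = compat_scorer(outfits).sum()            # higher = more compatible
        diversity = -((outfits - anchors) ** 2).sum()   # keep sets near their samples
        (score + 0.1 * diversity).backward()
        outfits = outfits + step * outfits.grad         # gradient ascent on compatibility
    return outfits.detach()

boosted = boost(torch.randn(8, 4, 32))   # 8 candidate outfits of 4 items each
print(boosted.shape)                     # torch.Size([8, 4, 32])
```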
- Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation [52.13927859375693]
We propose SewingLDM, a multi-modal generative model that generates sewing patterns controlled by text prompts, body shapes, and garment sketches.
To learn the sewing pattern distribution in the latent space, we design a two-step training strategy.
Comprehensive qualitative and quantitative experiments show the effectiveness of our proposed method.
arXiv Detail & Related papers (2024-12-19T02:05:28Z)
- AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models [7.534556848810697]
We propose a novel AnyDressing method for customizing characters conditioned on any combination of garments and personalized text prompts.
AnyDressing comprises two primary networks, GarmentsNet and DressingNet, which are dedicated to extracting detailed clothing features and generating customized images, respectively.
We introduce a Garment-Enhanced Texture Learning strategy to improve the fine-grained texture details of garments.
arXiv Detail & Related papers (2024-12-05T13:16:47Z)
- AIpparel: A Large Multimodal Generative Model for Digital Garments [71.12933771326279]
We introduce AIpparel, a large multimodal model for generating and editing sewing patterns.
Our model fine-tunes state-of-the-art large multimodal models on a custom-curated large-scale dataset of over 120,000 unique garments.
We propose a novel tokenization scheme that concisely encodes these complex sewing patterns so that LLMs can learn to predict them efficiently.
arXiv Detail & Related papers (2024-12-05T07:35:19Z)
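The tokenization idea in the AIpparel entry above can be illustrated with a toy scheme. The snippet below simply quantizes 2D panel-outline vertices to a coarse integer grid and wraps them in special tokens; the token names, bin count, and panel representation are invented for illustration and are not AIpparel's actual format.

```python
# Hypothetical illustration of a sewing-pattern tokenization scheme (not AIpparel's
# actual format): panel vertices are quantized to a small integer grid and wrapped in
# special tokens so a language model can predict a pattern as one flat sequence.
from typing import List, Tuple

PANEL_START, PANEL_END, BINS = "<panel>", "</panel>", 256

def tokenize_panel(vertices: List[Tuple[float, float]]) -> List[str]:
    """Quantize 2D panel-outline vertices (in [0, 1]) into discrete coordinate tokens."""
    tokens = [PANEL_START]
    for x, y in vertices:
        tokens.append(f"<x{min(int(x * BINS), BINS - 1)}>")
        tokens.append(f"<y{min(int(y * BINS), BINS - 1)}>")
    tokens.append(PANEL_END)
    return tokens

# A rectangular front panel becomes a short, LLM-friendly token sequence.
print(tokenize_panel([(0.1, 0.1), (0.9, 0.1), (0.9, 0.8), (0.1, 0.8)]))
```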
- IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 addresses the virtual dressing task of generating freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
arXiv Detail & Related papers (2024-07-17T16:26:30Z)
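The hybrid attention module summarized in the IMAGDressing-v1 entry above (frozen self-attention plus trainable cross-attention) maps naturally onto a small PyTorch sketch. The block below freezes a self-attention over the denoising tokens and adds a trainable cross-attention over garment tokens; module names, dimensions, and the additive residual are assumptions for illustration, not IMAGDressing-v1's released code.

```python
# Rough sketch of a hybrid attention block in the spirit described above (hypothetical
# names, not IMAGDressing-v1's code): a frozen self-attention processes the denoising-UNet
# tokens, while a trainable cross-attention attends to garment-UNet features and adds them.
import torch
import torch.nn as nn


class HybridAttention(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        for p in self.self_attn.parameters():
            p.requires_grad_(False)                 # frozen, preserves the base model's prior
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)  # trainable

    def forward(self, denoise_tokens, garment_tokens):
        h, _ = self.self_attn(denoise_tokens, denoise_tokens, denoise_tokens)
        g, _ = self.cross_attn(denoise_tokens, garment_tokens, garment_tokens)
        return denoise_tokens + h + g               # garment features injected additively


block = HybridAttention()
out = block(torch.randn(1, 256, 64), torch.randn(1, 77, 64))
print(out.shape)  # torch.Size([1, 256, 64])
```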
- AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario [50.62711489896909]
AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large margin.
AnyFit's impressive performance on high-fidelity virtual try-on in any scenario from any image paves a new path for future research within the fashion community.
arXiv Detail & Related papers (2024-05-28T13:33:08Z)
- MMTryon: Multi-Modal Multi-Reference Control for High-Quality Fashion Generation [70.83668869857665]
MMTryon is a multi-modal, multi-reference virtual try-on framework.
It can generate high-quality compositional try-on results by taking a text instruction and multiple garment images as inputs.
arXiv Detail & Related papers (2024-05-01T11:04:22Z)
- Magic Clothing: Controllable Garment-Driven Image Synthesis [7.46772222515689]
We propose Magic Clothing, a latent diffusion model (LDM)-based network architecture for an unexplored garment-driven image synthesis task.
When generating customized characters wearing target garments under diverse text prompts, image controllability is the most critical issue.
We introduce a garment extractor to capture the detailed garment features, and employ self-attention fusion to incorporate them into the pretrained LDMs.
arXiv Detail & Related papers (2024-04-15T07:15:39Z)
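One common reading of the "self-attention fusion" mentioned in the Magic Clothing entry above is reference-style attention: garment tokens are concatenated with the denoising tokens so a single self-attention pass mixes them, and only the denoising positions are kept. The snippet below sketches that interpretation; it is a hypothetical illustration, not Magic Clothing's implementation, and all shapes and names are assumed.

```python
# Hypothetical sketch of self-attention fusion (not Magic Clothing's code): garment
# features are concatenated with the denoising tokens along the sequence axis, a single
# self-attention pass lets the two interact, and only the denoising positions are kept.
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)

def self_attention_fusion(denoise_tokens: torch.Tensor, garment_tokens: torch.Tensor):
    joint = torch.cat([denoise_tokens, garment_tokens], dim=1)   # (B, N + M, C)
    fused, _ = attn(joint, joint, joint)                         # one shared attention pass
    return fused[:, : denoise_tokens.shape[1]]                   # keep only denoising slots

out = self_attention_fusion(torch.randn(1, 256, 64), torch.randn(1, 256, 64))
print(out.shape)  # torch.Size([1, 256, 64])
```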
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.