IMAGGarment-1: Fine-Grained Garment Generation for Controllable Fashion Design
- URL: http://arxiv.org/abs/2504.13176v1
- Date: Thu, 17 Apr 2025 17:59:47 GMT
- Title: IMAGGarment-1: Fine-Grained Garment Generation for Controllable Fashion Design
- Authors: Fei Shen, Jian Yu, Cong Wang, Xin Jiang, Xiaoyu Du, Jinhui Tang
- Abstract summary: IMAGGarment-1 is a fine-grained garment generation framework. It enables high-fidelity garment synthesis with precise control over silhouette, color, and logo placement.
- Score: 44.46962562795136
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents IMAGGarment-1, a fine-grained garment generation (FGG) framework that enables high-fidelity garment synthesis with precise control over silhouette, color, and logo placement. Unlike existing methods that are limited to single-condition inputs, IMAGGarment-1 addresses the challenges of multi-conditional controllability in personalized fashion design and digital apparel applications. Specifically, IMAGGarment-1 employs a two-stage training strategy to separately model global appearance and local details, while enabling unified and controllable generation through end-to-end inference. In the first stage, we propose a global appearance model that jointly encodes silhouette and color using a mixed attention module and a color adapter. In the second stage, we present a local enhancement model with an adaptive appearance-aware module to inject user-defined logos and spatial constraints, enabling accurate placement and visual consistency. To support this task, we release GarmentBench, a large-scale dataset comprising over 180K garment samples paired with multi-level design conditions, including sketches, color references, logo placements, and textual prompts. Extensive experiments demonstrate that our method outperforms existing baselines, achieving superior structural stability, color fidelity, and local controllability performance. The code and model are available at https://github.com/muzishen/IMAGGarment-1.
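The abstract describes a two-stage design: a global appearance model that jointly encodes silhouette and color through a mixed attention module and a color adapter, followed by a local enhancement model that injects a logo under spatial constraints. The sketch below is one way such a pipeline could be wired in PyTorch; all module names, tensor shapes, token counts, and the gating scheme are illustrative assumptions, not the released implementation (see the GitHub repository for the actual code).

```python
import torch
import torch.nn as nn

class MixedAttention(nn.Module):
    """Self-attention over latent tokens followed by cross-attention to
    concatenated silhouette/color tokens (hypothetical layout)."""
    def __init__(self, dim=320, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, latent_tokens, cond_tokens):
        h, _ = self.self_attn(latent_tokens, latent_tokens, latent_tokens)
        h, _ = self.cross_attn(h, cond_tokens, cond_tokens)
        return latent_tokens + h

class ColorAdapter(nn.Module):
    """Projects a color reference into a few conditioning tokens."""
    def __init__(self, color_dim=3, dim=320, num_tokens=4):
        super().__init__()
        self.proj = nn.Linear(color_dim, dim * num_tokens)
        self.num_tokens, self.dim = num_tokens, dim

    def forward(self, color_ref):  # (B, 3) mean RGB, purely for illustration
        return self.proj(color_ref).view(-1, self.num_tokens, self.dim)

class GlobalAppearanceStage(nn.Module):
    """Stage 1: jointly conditions on silhouette (sketch) tokens and color tokens."""
    def __init__(self, dim=320):
        super().__init__()
        self.sketch_proj = nn.Linear(dim, dim)
        self.color_adapter = ColorAdapter(dim=dim)
        self.mixed_attn = MixedAttention(dim=dim)

    def forward(self, latent_tokens, sketch_tokens, color_ref):
        cond = torch.cat([self.sketch_proj(sketch_tokens),
                          self.color_adapter(color_ref)], dim=1)
        return self.mixed_attn(latent_tokens, cond)

class LocalEnhancementStage(nn.Module):
    """Stage 2: injects logo features only inside a user-given placement mask."""
    def __init__(self, dim=320):
        super().__init__()
        self.logo_proj = nn.Linear(dim, dim)
        self.gate = nn.Parameter(torch.zeros(1))  # zero-init: stage 2 starts as identity

    def forward(self, feats, logo_tokens, placement_mask):
        # feats: (B, N, C); placement_mask: (B, N, 1) in [0, 1]
        logo = self.logo_proj(logo_tokens).mean(dim=1, keepdim=True)
        return feats + self.gate.tanh() * placement_mask * logo

# toy forward pass with random tensors
B, N, C = 2, 64, 320
stage1 = GlobalAppearanceStage(C)
stage2 = LocalEnhancementStage(C)
feats = stage1(torch.randn(B, N, C), torch.randn(B, N, C), torch.rand(B, 3))
out = stage2(feats, torch.randn(B, 8, C), torch.rand(B, N, 1))
print(out.shape)  # torch.Size([2, 64, 320])
```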
Related papers
- Fine-Grained Controllable Apparel Showcase Image Generation via Garment-Centric Outpainting [39.50293003775675]
We propose a novel garment-centric outpainting (GCO) framework based on the latent diffusion model (LDM). The proposed framework aims at customizing a fashion model wearing a given garment via text prompts and facial images.
arXiv Detail & Related papers (2025-03-03T08:30:37Z)
- AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models [7.534556848810697]
We propose a novel AnyDressing method for customizing characters conditioned on any combination of garments and personalized text prompts.
AnyDressing comprises two primary networks, GarmentsNet and DressingNet, which are dedicated to extracting detailed clothing features and generating customized images, respectively.
We introduce a Garment-Enhanced Texture Learning strategy to improve the fine-grained texture details of garments.
arXiv Detail & Related papers (2024-12-05T13:16:47Z)
- AIpparel: A Multimodal Foundation Model for Digital Garments [71.12933771326279]
We introduce AIpparel, a multimodal foundation model for generating and editing sewing patterns.
Our model fine-tunes state-of-the-art large multimodal models on a custom-curated large-scale dataset of over 120,000 unique garments.
We propose a novel tokenization scheme that concisely encodes these complex sewing patterns so that LLMs can learn to predict them efficiently.
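The summary does not spell out the tokenization scheme, so the snippet below is only a generic illustration of how 2D sewing-pattern geometry can be flattened into a discrete sequence an LLM can model: coordinates are quantized into a fixed number of bins and interleaved with special tokens. The bin count, special tokens, and panel layout are assumptions, not AIpparel's actual vocabulary.

```python
# Minimal sketch of one way to tokenize a sewing-pattern panel for an LLM,
# assuming vertices are normalized to [0, 1]; the paper's actual scheme may differ.
PAD, BOS, EOS, PANEL_SEP = 0, 1, 2, 3
NUM_SPECIAL = 4
BINS = 256  # coordinate quantization resolution (assumed)

def quantize(coord: float) -> int:
    """Map a normalized coordinate to a discrete token id."""
    return NUM_SPECIAL + min(int(coord * BINS), BINS - 1)

def encode_pattern(panels: list[list[tuple[float, float]]]) -> list[int]:
    """Flatten a list of panels (each a polyline of (x, y) vertices) into token ids."""
    tokens = [BOS]
    for panel in panels:
        for x, y in panel:
            tokens += [quantize(x), quantize(y)]
        tokens.append(PANEL_SEP)
    tokens.append(EOS)
    return tokens

# Example: a two-panel pattern becomes a short, LLM-friendly integer sequence.
pattern = [[(0.1, 0.2), (0.9, 0.2), (0.5, 0.8)],
           [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]]
print(encode_pattern(pattern))
```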
arXiv Detail & Related papers (2024-12-05T07:35:19Z)
- Multi-Garment Customized Model Generation [3.1679243514285194]
Multi-Garment Customized Model Generation is a unified framework based on Latent Diffusion Models (LDMs).
Our framework supports the conditional generation of multiple garments through decoupled multi-garment feature fusion.
The proposed garment encoder is a plug-and-play module that can be combined with other extension modules.
arXiv Detail & Related papers (2024-08-09T17:57:33Z)
- IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 addresses a virtual dressing task that generates freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
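A minimal sketch of the hybrid attention idea described above, assuming the frozen self-attention comes from the pretrained denoising UNet and a zero-initialized gate lets the trainable cross-attention inject garment features gradually; the dimensions and gating are illustrative assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Hybrid attention block: the original self-attention stays frozen, while a
    trainable cross-attention injects garment features (sketch, not the real code)."""
    def __init__(self, dim=320, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        for p in self.self_attn.parameters():          # frozen branch (pretrained in practice)
            p.requires_grad = False
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)  # trainable branch
        self.gate = nn.Parameter(torch.zeros(1))       # zero init: starts as the original block

    def forward(self, x, garment_tokens):
        h, _ = self.self_attn(x, x, x)
        g, _ = self.cross_attn(h, garment_tokens, garment_tokens)
        return h + self.gate.tanh() * g

block = HybridAttention()
out = block(torch.randn(2, 64, 320), torch.randn(2, 77, 320))
print(out.shape)  # torch.Size([2, 64, 320])
```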
arXiv Detail & Related papers (2024-07-17T16:26:30Z)
- MMTryon: Multi-Modal Multi-Reference Control for High-Quality Fashion Generation [70.83668869857665]
MMTryon is a multi-modal, multi-reference virtual try-on framework.
It can generate high-quality compositional try-on results by taking a text instruction and multiple garment images as inputs.
arXiv Detail & Related papers (2024-05-01T11:04:22Z)
- StableGarment: Garment-Centric Generation via Stable Diffusion [29.5112874761836]
We introduce StableGarment, a unified framework to tackle garment-centric (GC) generation tasks.
Our solution involves the development of a garment encoder, a trainable copy of the denoising UNet equipped with additive self-attention layers.
The incorporation of a dedicated try-on ControlNet enables StableGarment to execute virtual try-on tasks with precision.
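The "additive self-attention" in the garment encoder is not detailed in the summary; a common way to realize this kind of reference conditioning is to append garment tokens to the self-attention keys and values, as sketched below. The exact wiring in StableGarment may differ.

```python
import torch
import torch.nn as nn

class AdditiveSelfAttention(nn.Module):
    """Sketch of additive self-attention for reference-style garment encoders:
    garment tokens are appended to the keys/values so the denoiser can attend
    to garment appearance. The wiring here is an assumption."""
    def __init__(self, dim=320, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x, garment_tokens):
        kv = torch.cat([x, garment_tokens], dim=1)   # append garment tokens to K/V
        out, _ = self.attn(x, kv, kv)
        return x + out

layer = AdditiveSelfAttention()
print(layer(torch.randn(2, 64, 320), torch.randn(2, 32, 320)).shape)  # (2, 64, 320)
```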
arXiv Detail & Related papers (2024-03-16T03:05:07Z)
- Composer: Creative and Controllable Image Synthesis with Composable Conditions [57.78533372393828]
Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.
This work offers a new generation paradigm that allows flexible control of the output image, such as spatial layout and palette, while maintaining the synthesis quality and model creativity.
arXiv Detail & Related papers (2023-02-20T05:48:41Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
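As a rough illustration of the mechanism summarized above, the sketch below samples random Gaussian heatmaps and encodes them into an intermediate generator feature map as a spatial inductive bias; the injection layer, additive fusion, and sigma are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

def gaussian_heatmaps(batch, num_maps, size, sigma=0.1):
    """Randomly sampled 2D Gaussian heatmaps on a size x size grid (sketch)."""
    ys, xs = torch.meshgrid(torch.linspace(0, 1, size),
                            torch.linspace(0, 1, size), indexing="ij")
    centers = torch.rand(batch, num_maps, 2)          # random (cy, cx) per map
    cy = centers[..., 0, None, None]
    cx = centers[..., 1, None, None]
    return torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

class HeatmapInjection(nn.Module):
    """Encodes heatmaps and adds them to an intermediate generator feature map,
    acting as a spatial inductive bias (layer choice and fusion are assumptions)."""
    def __init__(self, num_maps=4, feat_channels=256):
        super().__init__()
        self.encode = nn.Conv2d(num_maps, feat_channels, kernel_size=1)

    def forward(self, feats, heatmaps):               # feats: (B, C, H, W)
        return feats + self.encode(heatmaps)

B, C, H = 2, 256, 16
inject = HeatmapInjection(num_maps=4, feat_channels=C)
maps = gaussian_heatmaps(B, 4, H)                     # (B, 4, H, H)
out = inject(torch.randn(B, C, H, H), maps)
print(out.shape)  # torch.Size([2, 256, 16, 16])
```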
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
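The DAFlow summary above suggests attention-weighted multi-flow warping; the sketch below shows that general idea, predicting several offset fields plus per-pixel attention and blending the warped source features. The number of flows, the prediction head, and single-scale warping are simplifying assumptions, not the paper's full deformable attention design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiFlowWarp(nn.Module):
    """Sketch of attention-weighted multi-flow warping in the spirit of DAFlow:
    predict K offset fields plus per-pixel attention, warp the source with each
    flow, and blend the warped results. K and the head are assumptions."""
    def __init__(self, channels=64, num_flows=4):
        super().__init__()
        self.num_flows = num_flows
        # predicts 2 offsets per flow + 1 attention logit per flow
        self.head = nn.Conv2d(channels * 2, num_flows * 3, kernel_size=3, padding=1)

    def forward(self, src_feat, ref_feat):
        B, C, H, W = src_feat.shape
        pred = self.head(torch.cat([src_feat, ref_feat], dim=1))
        flows = pred[:, : self.num_flows * 2].reshape(B, self.num_flows, 2, H, W)
        attn = pred[:, self.num_flows * 2 :].softmax(dim=1)        # (B, K, H, W)

        # base sampling grid in [-1, 1]
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W),
                                indexing="ij")
        base = torch.stack([xs, ys], dim=-1).to(src_feat)          # (H, W, 2)

        out = 0
        for k in range(self.num_flows):
            grid = base + flows[:, k].permute(0, 2, 3, 1)          # (B, H, W, 2)
            warped = F.grid_sample(src_feat, grid, align_corners=True)
            out = out + attn[:, k : k + 1] * warped
        return out

warp = MultiFlowWarp()
print(warp(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)).shape)
```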
arXiv Detail & Related papers (2022-07-19T10:01:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.