DiffFashion: Reference-based Fashion Design with Structure-aware
Transfer by Diffusion Models
- URL: http://arxiv.org/abs/2302.06826v1
- Date: Tue, 14 Feb 2023 04:45:44 GMT
- Title: DiffFashion: Reference-based Fashion Design with Structure-aware
Transfer by Diffusion Models
- Authors: Shidong Cao, Wenhao Chai, Shengyu Hao, Yanting Zhang, Hangyue Chen,
and Gaoang Wang
- Abstract summary: We focus on a new fashion design task, where we aim to transfer a reference appearance image onto a clothing image.
It is a challenging task since there are no reference images available for the newly designed output fashion images.
We present a novel diffusion model-based unsupervised structure-aware transfer method to semantically generate new clothes.
- Score: 4.918209527904503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image-based fashion design with AI techniques has attracted increasing
attention in recent years. We focus on a new fashion design task, where we aim
to transfer a reference appearance image onto a clothing image while preserving
the structure of the clothing image. It is a challenging task since there are
no reference images available for the newly designed output fashion images.
Although diffusion-based image translation or neural style transfer (NST) has
enabled flexible style transfer, it is often difficult to maintain the original
structure of the image realistically during the reverse diffusion, especially
when the referenced appearance image greatly differs from the common clothing
appearance. To tackle this issue, we present a novel diffusion model-based
unsupervised structure-aware transfer method to semantically generate new
clothes from a given clothing image and a reference appearance image.
Specifically, we decouple the foreground clothing using automatically generated
semantic masks conditioned on labels, and the mask is further used as guidance
in the denoising process to preserve structural information. Moreover, we
use the pre-trained vision Transformer (ViT) for both appearance and structure
guidance. Our experimental results show that the proposed method outperforms
state-of-the-art baseline models, generating more realistic images in the
fashion design task. Code and demo can be found at
https://github.com/Rem105-210/DiffFashion.
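To make the pipeline in the abstract more concrete, below is a minimal, illustrative sketch of one mask-guided reverse-diffusion step with ViT-based appearance and structure guidance. It is not the authors' implementation: the noise predictor `eps_model`, the DINO-ViT feature choice, the specific losses, and the weights `w_app`/`w_struct` are assumptions made for illustration only; the official code at the repository above is authoritative.

```python
# Illustrative sketch (not the authors' code): one guided reverse-diffusion step that
# (i) steers the sample with ViT appearance/structure losses and (ii) blends the
# background back in with the clothing mask. All names and weights are assumptions.
import torch
import torch.nn.functional as F


def vit_tokens(vit, img):
    """Patch tokens from a DINO ViT (e.g. a torch.hub 'facebookresearch/dino' model)."""
    tokens = vit.get_intermediate_layers(img, n=1)[0]  # (B, 1 + N, C)
    return tokens[:, 1:, :]                            # drop the [CLS] token


def self_similarity(tokens):
    """Token self-similarity, a common structure descriptor for ViT features."""
    t = F.normalize(tokens, dim=-1)
    return t @ t.transpose(1, 2)                       # (B, N, N)


def guided_reverse_step(x_t, t, eps_model, alphas_cumprod, vit,
                        x_cloth, x_ref, mask, w_app=1.0, w_struct=1.0):
    """One DDIM-style reverse step with ViT guidance and mask blending."""
    a_t = alphas_cumprod[t]
    with torch.enable_grad():
        x_in = x_t.detach().requires_grad_(True)
        eps = eps_model(x_in, t)                                  # assumed noise predictor
        x0_hat = (x_in - (1 - a_t).sqrt() * eps) / a_t.sqrt()

        # Appearance guidance: pull ViT features of the prediction toward the reference.
        loss_app = F.mse_loss(vit_tokens(vit, x0_hat).mean(1),
                              vit_tokens(vit, x_ref).mean(1))
        # Structure guidance: match token self-similarity with the source clothing image.
        loss_struct = F.mse_loss(self_similarity(vit_tokens(vit, x0_hat)),
                                 self_similarity(vit_tokens(vit, x_cloth)))
        grad = torch.autograd.grad(w_app * loss_app + w_struct * loss_struct, x_in)[0]

    # Classifier-guidance-style update of the predicted noise (Dhariwal & Nichol, Alg. 2).
    eps = eps.detach() + (1 - a_t).sqrt() * grad
    x0_hat = (x_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()
    a_prev = alphas_cumprod[t - 1] if t > 0 else torch.ones_like(a_t)
    x_prev = a_prev.sqrt() * x0_hat + (1 - a_prev).sqrt() * eps

    # Mask guidance: only the masked garment region is re-synthesized; the background is
    # replaced with a correspondingly noised copy of the original clothing image.
    x_cloth_noised = a_prev.sqrt() * x_cloth + (1 - a_prev).sqrt() * torch.randn_like(x_cloth)
    return mask * x_prev + (1 - mask) * x_cloth_noised
```

The sketch only shows where the semantic mask and the ViT features enter the denoising loop; the actual losses, the label-conditioned mask generation, and the guidance schedule differ in the paper. A DINO backbone can be loaded with `torch.hub.load('facebookresearch/dino:main', 'dino_vitb16')`, and images should be resized and normalized to what the ViT expects (omitted above for brevity).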
Related papers
- Improving Virtual Try-On with Garment-focused Diffusion Models [91.95830983115474]
Diffusion models have revolutionized generative modeling across numerous image synthesis tasks.
We design a new diffusion model, namely GarDiff, which drives a garment-focused diffusion process.
Experiments on VITON-HD and DressCode datasets demonstrate the superiority of our GarDiff when compared to state-of-the-art VTON approaches.
arXiv Detail & Related papers (2024-09-12T17:55:11Z)
- Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z)
- StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On [35.227896906556026]
Given a clothing image and a person image, an image-based virtual try-on aims to generate a customized image that appears natural and accurately reflects the characteristics of the clothing image.
In this work, we aim to expand the applicability of the pre-trained diffusion model so that it can be utilized independently for the virtual try-on task.
Our proposed zero cross-attention blocks not only preserve the clothing details by learning the semantic correspondence but also generate high-fidelity images by utilizing the inherent knowledge of the pre-trained model in the warping process.
arXiv Detail & Related papers (2023-12-04T08:27:59Z)
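A plausible reading of the "zero cross-attention blocks" in the StableVITON entry above is a cross-attention layer whose output projection starts at zero, so the block is initially a no-op and only gradually injects clothing features into the pre-trained UNet. The following is a rough sketch of that general idea; the module layout, shapes, and the use of `nn.MultiheadAttention` are assumptions, not the paper's exact architecture.

```python
# Hedged sketch of a zero-initialized cross-attention block (not StableVITON's exact design).
import torch
import torch.nn as nn


class ZeroCrossAttention(nn.Module):
    def __init__(self, dim, context_dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, kdim=context_dim,
                                          vdim=context_dim, batch_first=True)
        self.proj_out = nn.Linear(dim, dim)
        nn.init.zeros_(self.proj_out.weight)  # zero-init: the block contributes nothing at first
        nn.init.zeros_(self.proj_out.bias)

    def forward(self, x, context):
        # x: UNet hidden states (B, N, dim); context: clothing-encoder features (B, M, context_dim)
        attn_out, _ = self.attn(query=x, key=context, value=context)
        return x + self.proj_out(attn_out)    # residual; initially returns x unchanged
```

Zero-initializing the residual branch is a common way to attach new conditioning modules to a pre-trained diffusion UNet without disturbing its behavior at the start of fine-tuning.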
- DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer [27.39248034592382]
We propose using a new class of models to perform style transfer while also enabling shape deformation (deformable style transfer).
We show how leveraging the priors of these models can expose new artistic controls at inference time.
arXiv Detail & Related papers (2023-07-09T12:13:43Z)
- Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing [40.70752781891058]
We propose the task of multimodal-conditioned fashion image editing, guiding the generation of human-centric fashion images.
We tackle this problem by proposing a new architecture based on latent diffusion models.
Given the lack of existing datasets suitable for the task, we also extend two existing fashion datasets.
arXiv Detail & Related papers (2023-04-04T18:03:04Z)
- Diffusion-based Image Translation using Disentangled Style and Content Representation [51.188396199083336]
Diffusion-based image translation guided by semantic texts or a single target image has enabled flexible style transfer.
It is often difficult to maintain the original content of the image during the reverse diffusion.
We present a novel diffusion-based unsupervised image translation method using disentangled style and content representation.
Our experimental results show that the proposed method outperforms state-of-the-art baseline models in both text-guided and image-guided translation tasks.
arXiv Detail & Related papers (2022-09-30T06:44:37Z)
- Style-Based Global Appearance Flow for Virtual Try-On [119.95115739956661]
A novel global appearance flow estimation model is proposed in this work.
Experiment results on a popular virtual try-on benchmark show that our method achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-04-03T10:58:04Z)
- Weakly Supervised High-Fidelity Clothing Model Generation [67.32235668920192]
We propose a cheap yet scalable weakly-supervised method called Deep Generative Projection (DGP) to address this scenario of high-fidelity clothing model generation.
We show that projecting the rough alignment of clothing and body onto the StyleGAN space can yield photo-realistic wearing results.
arXiv Detail & Related papers (2021-12-14T07:15:15Z)
- Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content [85.24260811659094]
We propose a novel visual try-on network, namely Adaptive Content Generating and Preserving Network (ACGPN).
ACGPN first predicts the semantic layout of the reference image that will be changed after try-on.
Second, a clothes warping module warps clothing images according to the generated semantic layout.
Third, an inpainting module for content fusion integrates all information (e.g., reference image, semantic layout, warped clothes) to adaptively produce each semantic part of the human body (a coarse data-flow sketch of this three-stage pipeline follows below).
arXiv Detail & Related papers (2020-03-12T15:55:39Z)
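To make the three-stage ACGPN pipeline above easier to follow, here is a coarse, purely illustrative data-flow sketch; the module interfaces, the inputs (person parsing, pose), and the tensor layouts are placeholders rather than ACGPN's actual design.

```python
# Placeholder composition of the three stages described above (not ACGPN's real interfaces).
import torch
import torch.nn as nn


class TryOnPipeline(nn.Module):
    def __init__(self, layout_net: nn.Module, warp_net: nn.Module, fusion_net: nn.Module):
        super().__init__()
        self.layout_net = layout_net   # predicts the post-try-on semantic layout
        self.warp_net = warp_net       # warps the clothing image to the predicted layout
        self.fusion_net = fusion_net   # inpaints/fuses everything into the final image

    def forward(self, person_img, cloth_img, person_parse, pose):
        # 1) Predict the semantic layout of the person as it will look after try-on.
        layout = self.layout_net(torch.cat([person_parse, pose], dim=1), cloth_img)
        # 2) Warp the clothing image so it agrees with that target layout.
        warped_cloth = self.warp_net(cloth_img, layout)
        # 3) Fuse reference image, layout, and warped clothes into the output image.
        return self.fusion_net(person_img, layout, warped_cloth)
```

The design point highlighted in the summary is that layout prediction happens first, so both warping and fusion can be conditioned on how the try-on result should be segmented.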
This list is automatically generated from the titles and abstracts of the papers on this site.