Taming the Power of Diffusion Models for High-Quality Virtual Try-On
with Appearance Flow
- URL: http://arxiv.org/abs/2308.06101v1
- Date: Fri, 11 Aug 2023 12:23:09 GMT
- Title: Taming the Power of Diffusion Models for High-Quality Virtual Try-On
with Appearance Flow
- Authors: Junhong Gou, Siyu Sun, Jianfu Zhang, Jianlou Si, Chen Qian, Liqing
Zhang
- Abstract summary: Virtual try-on is a critical image synthesis task that aims to transfer clothes from one image to another while preserving the details of both humans and clothes.
We propose an exemplar-based inpainting approach that leverages a warping module to guide the diffusion model's generation effectively.
Our approach, namely Diffusion-based Conditional Inpainting for Virtual Try-ON (DCI-VTON), effectively utilizes the power of the diffusion model.
- Score: 24.187109053871833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Virtual try-on is a critical image synthesis task that aims to transfer
clothes from one image to another while preserving the details of both humans
and clothes. While many existing methods rely on Generative Adversarial
Networks (GANs) to achieve this, flaws can still occur, particularly at high
resolutions. Recently, the diffusion model has emerged as a promising
alternative for generating high-quality images in various applications.
However, simply using clothes as a condition for guiding the diffusion model to
inpaint is insufficient to maintain the details of the clothes. To overcome
this challenge, we propose an exemplar-based inpainting approach that leverages
a warping module to guide the diffusion model's generation effectively. The
warping module performs initial processing on the clothes, which helps to
preserve the local details of the clothes. We then combine the warped clothes
with clothes-agnostic person image and add noise as the input of diffusion
model. Additionally, the warped clothes is used as local conditions for each
denoising process to ensure that the resulting output retains as much detail as
possible. Our approach, namely Diffusion-based Conditional Inpainting for
Virtual Try-ON (DCI-VTON), effectively utilizes the power of the diffusion
model, and the incorporation of the warping module helps to produce
high-quality and realistic virtual try-on results. Experimental results on
VITON-HD demonstrate the effectiveness and superiority of our method.
Related papers
- ODPG: Outfitting Diffusion with Pose Guided Condition [2.5602836891933074]
VTON technology allows users to visualize how clothes would look on them without physically trying them on.
Traditional VTON methods, often using Geneversarative Adrial Networks (GANs) and Diffusion models, face challenges in achieving high realism and handling dynamic poses.
This paper introduces Outfitting Diffusion with Pose Guided Condition (ODPG), a novel approach that leverages a latent diffusion model with multiple conditioning inputs during the denoising process.
arXiv Detail & Related papers (2025-01-12T10:30:27Z) - DiffusionTrend: A Minimalist Approach to Virtual Fashion Try-On [103.89972383310715]
DiffusionTrend harnesses latent information rich in prior information to capture the nuances of garment details.
It delivers a visually compelling try-on experience, underscoring the potential of training-free diffusion model.
arXiv Detail & Related papers (2024-12-19T02:24:35Z) - Improving Virtual Try-On with Garment-focused Diffusion Models [91.95830983115474]
Diffusion models have led to the revolutionizing of generative modeling in numerous image synthesis tasks.
We shape a new Diffusion model, namely GarDiff, which triggers the garment-focused diffusion process.
Experiments on VITON-HD and DressCode datasets demonstrate the superiority of our GarDiff when compared to state-of-the-art VTON approaches.
arXiv Detail & Related papers (2024-09-12T17:55:11Z) - FLDM-VTON: Faithful Latent Diffusion Model for Virtual Try-on [21.34959824429241]
FLDM-VTON is a novel Faithful Latent Diffusion Model for VTON.
It incorporates clothes as both the starting point and local condition, supplying the model with faithful clothes priors.
It is able to generate photo-realistic try-on images with faithful clothing details.
arXiv Detail & Related papers (2024-04-22T13:21:09Z) - Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On [29.217423805933727]
Diffusion model-based approaches have recently become popular, as they are excellent at image synthesis tasks.
We propose an Texture-Preserving Diffusion (TPD) model for virtual try-on, which enhances the fidelity of the results.
Second, we propose a novel diffusion-based method that predicts a precise inpainting mask based on the person and reference garment images.
arXiv Detail & Related papers (2024-04-01T12:43:22Z) - Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z) - WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual
Try-on [81.15988741258683]
Image-based Virtual Try-On (VITON) aims to transfer an in-shop garment image onto a target person.
Current methods often overlook the synthesis quality around the garment-skin boundary and realistic effects like wrinkles and shadows on the warped garments.
We propose WarpDiffusion, which bridges the warping-based and diffusion-based paradigms via a novel informative and local garment feature attention mechanism.
arXiv Detail & Related papers (2023-12-06T18:34:32Z) - Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional
Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z) - SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.