SPG-VTON: Semantic Prediction Guidance for Multi-pose Virtual Try-on
- URL: http://arxiv.org/abs/2108.01578v1
- Date: Tue, 3 Aug 2021 15:40:50 GMT
- Title: SPG-VTON: Semantic Prediction Guidance for Multi-pose Virtual Try-on
- Authors: Bingwen Hu, Ping Liu, Zhedong Zheng, and Mingwu Ren
- Abstract summary: Image-based virtual try-on is challenging because target in-shop clothes must be fitted onto a reference person under diverse human poses.
We propose an end-to-end Semantic Prediction Guidance multi-pose Virtual Try-On Network (SPG-VTON).
We evaluate the proposed method on the largest multi-pose dataset (MPV) and the DeepFashion dataset.
- Score: 27.870740623131816
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-based virtual try-on is challenging because target in-shop
clothes must be fitted onto a reference person under diverse human poses.
Previous works focus on preserving clothing details (e.g., texture, logos,
patterns) when transferring the desired clothes onto a target person under a
fixed pose. However, the performance of existing methods drops significantly
when they are extended to multi-pose virtual try-on. In this paper, we propose
an end-to-end Semantic Prediction Guidance multi-pose Virtual Try-On Network
(SPG-VTON), which can fit the desired clothing onto a reference person under
arbitrary poses. Concretely, SPG-VTON is composed of three sub-modules. First,
a Semantic Prediction Module (SPM) generates the desired semantic map. The
predicted semantic map provides richer guidance for locating the desired
clothes region and producing a coarse try-on image. Second, a Clothes Warping
Module (CWM) warps the in-shop clothes to the desired shape according to the
predicted semantic map and the desired pose. Specifically, we introduce a
conductible cycle consistency loss to alleviate misalignment in the
clothes-warping process. Third, a Try-on Synthesis Module (TSM) combines the
coarse result and the warped clothes to generate the final virtual try-on
image, preserving the details of the desired clothes under the desired pose.
In addition, we introduce a face identity loss to refine the facial appearance
while maintaining the identity of the person in the final result. We evaluate
the proposed method on the largest multi-pose dataset (MPV) and the
DeepFashion dataset. Qualitative and quantitative experiments show that
SPG-VTON is superior to state-of-the-art methods and robust to data noise,
including background and accessory changes (e.g., hats and handbags),
demonstrating good scalability to real-world scenarios.
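To make the three-module flow concrete, below is a minimal PyTorch-style sketch of the data flow the abstract describes. The layer counts, channel widths, 18-channel pose-heatmap input, and 20-class semantic map are illustrative assumptions, not the authors' exact architecture.

```python
# Hedged sketch of the SPM -> CWM -> TSM data flow; module internals are
# placeholders, not SPG-VTON's published layer configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SemanticPredictionModule(nn.Module):
    """SPM: predicts the target-pose semantic map and a coarse try-on image."""
    def __init__(self, in_ch, num_classes=20):
        super().__init__()
        self.backbone = nn.Sequential(conv_block(in_ch, 64), conv_block(64, 64))
        self.sem_head = nn.Conv2d(64, num_classes, 1)
        self.coarse_head = nn.Sequential(nn.Conv2d(64, 3, 1), nn.Tanh())

    def forward(self, person, clothes, target_pose):
        feat = self.backbone(torch.cat([person, clothes, target_pose], dim=1))
        return self.sem_head(feat), self.coarse_head(feat)

class ClothesWarpingModule(nn.Module):
    """CWM: regresses a dense offset field and warps the in-shop clothes."""
    def __init__(self, in_ch):
        super().__init__()
        self.flow = nn.Sequential(conv_block(in_ch, 64), nn.Conv2d(64, 2, 1))

    def forward(self, clothes, semantic_map, target_pose):
        x = torch.cat([clothes, semantic_map, target_pose], dim=1)
        offsets = self.flow(x).permute(0, 2, 3, 1)            # (B, H, W, 2)
        _, _, h, w = clothes.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack([xs, ys], dim=-1).to(clothes.device)
        grid = base.unsqueeze(0) + offsets                    # sampling grid
        return F.grid_sample(clothes, grid, align_corners=True)

class TryOnSynthesisModule(nn.Module):
    """TSM: fuses the coarse result and the warped clothes into the output."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(conv_block(in_ch, 64),
                                 nn.Conv2d(64, 3, 1), nn.Tanh())

    def forward(self, coarse, warped_clothes):
        return self.net(torch.cat([coarse, warped_clothes], dim=1))

def spg_vton_forward(spm, cwm, tsm, person, clothes, target_pose):
    sem_logits, coarse = spm(person, clothes, target_pose)
    sem_map = sem_logits.softmax(dim=1)       # soft semantic map as guidance
    warped = cwm(clothes, sem_map, target_pose)
    return tsm(coarse, warped)

# Smoke test with assumed shapes: 18 pose-heatmap channels, 20 parsing classes.
spm = SemanticPredictionModule(in_ch=3 + 3 + 18)
cwm = ClothesWarpingModule(in_ch=3 + 20 + 18)
tsm = TryOnSynthesisModule(in_ch=3 + 3)
out = spg_vton_forward(spm, cwm, tsm,
                       torch.randn(1, 3, 256, 192),   # reference person
                       torch.randn(1, 3, 256, 192),   # in-shop clothes
                       torch.randn(1, 18, 256, 192))  # target-pose heatmaps
print(out.shape)  # torch.Size([1, 3, 256, 192])
```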
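The two auxiliary losses can be sketched in the same hedged spirit. The exact "conductible" formulation of the cycle consistency loss and the authors' choice of face encoder are not reproduced here; `warp_fwd`, `warp_bwd`, and `face_encoder` are hypothetical stand-ins.

```python
import torch.nn.functional as F

def cycle_consistency_loss(clothes, warp_fwd, warp_bwd):
    """Warp the clothes toward the target pose and back, then penalize the
    round-trip error. warp_fwd / warp_bwd are callables (e.g., the warping
    module run in each direction); this is generic cycle consistency, not
    necessarily the paper's exact "conductible" variant."""
    recovered = warp_bwd(warp_fwd(clothes))
    return F.l1_loss(recovered, clothes)

def face_identity_loss(result, target, face_mask, face_encoder):
    """Compare embeddings of the masked face regions of the try-on result and
    the ground truth. face_encoder is a hypothetical stand-in for a
    pretrained face-embedding network."""
    return F.l1_loss(face_encoder(result * face_mask),
                     face_encoder(target * face_mask))
```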
Related papers
- IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 addresses a virtual dressing task: generating freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from a VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
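A rough sketch of how such a hybrid attention block might look, assuming token-shaped features and a simple residual sum as the fusion rule (both assumptions, not IMAGDressing-v1's actual implementation):

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Frozen self-attention plus a trainable cross-attention branch that
    injects garment features (an illustrative reading of the summary above)."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        for p in self.self_attn.parameters():
            p.requires_grad = False            # frozen branch
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x, garment_feats):
        # x: (B, N, dim) denoising-UNet tokens; garment_feats: (B, M, dim)
        h, _ = self.self_attn(x, x, x)
        g, _ = self.cross_attn(x, garment_feats, garment_feats)
        return x + h + g                       # residual fusion of both branches

out = HybridAttention(320)(torch.randn(2, 64, 320), torch.randn(2, 77, 320))
print(out.shape)  # torch.Size([2, 64, 320])
```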
arXiv Detail & Related papers (2024-07-17T16:26:30Z)
- MV-VTON: Multi-View Virtual Try-On with Diffusion Models [91.71150387151042]
The goal of image-based virtual try-on is to generate an image of the target person naturally wearing the given clothing.
Existing methods focus solely on frontal try-on using frontal clothing.
We introduce Multi-View Virtual Try-ON (MV-VTON), which aims to reconstruct the dressing results from multiple views using the given clothes.
arXiv Detail & Related papers (2024-04-26T12:27:57Z)
- Arbitrary Virtual Try-On Network: Characteristics Preservation and Trade-off between Body and Clothing [85.74977256940855]
We propose an Arbitrary Virtual Try-On Network (AVTON) for all-type clothes.
AVTON can synthesize realistic try-on images by preserving and trading off characteristics of the target clothes and the reference person.
Our approach can achieve better performance compared with the state-of-the-art virtual try-on methods.
arXiv Detail & Related papers (2021-11-24T08:59:56Z)
- Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN [66.3650689395967]
We propose a texture-preserving end-to-end network, the PAtch-routed SpaTially-Adaptive GAN (PASTA-GAN), that facilitates real-world unpaired virtual try-on.
To disentangle the style and spatial information of each garment, PASTA-GAN introduces an innovative patch-routed disentanglement module.
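As a toy illustration of the patch idea (the pose-driven routing itself is omitted, and all shapes are assumed), the garment can be decomposed into location-free patch descriptors:

```python
import torch
import torch.nn.functional as F

def garment_to_patches(garment, patch=16):
    """Split a garment image into non-overlapping patches.

    Returns (B, num_patches, C * patch * patch): each row holds one patch's
    content ("style"), now decoupled from where the patch sat in the image.
    """
    cols = F.unfold(garment, kernel_size=patch, stride=patch)  # (B, C*p*p, L)
    return cols.transpose(1, 2)

patches = garment_to_patches(torch.randn(1, 3, 256, 192))
print(patches.shape)  # torch.Size([1, 192, 768])
```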
arXiv Detail & Related papers (2021-11-20T08:36:12Z)
- Shape Controllable Virtual Try-on for Underwear Models [0.0]
We propose a Shape Controllable Virtual Try-On Network (SC-VTON) to dress clothing for underwear models.
SC-VTON integrates model and clothing information to generate a warped clothing image.
Our method can generate high-resolution results with detailed textures.
arXiv Detail & Related papers (2021-07-28T04:01:01Z)
- PISE: Person Image Synthesis and Editing with Decoupled GAN [64.70360318367943]
We propose PISE, a novel two-stage generative model for Person Image Synthesis and Editing.
For human pose transfer, we first synthesize a human parsing map aligned with the target pose to represent the shape of clothing.
To decouple the shape and style of clothing, we propose joint global and local per-region encoding and normalization.
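A minimal sketch of per-region modulation, assuming one-hot parsing masks and a simple region-pooled scaling (PISE's actual encoding and normalization scheme is more elaborate):

```python
import torch
import torch.nn.functional as F

def per_region_scale(feat, style, parsing):
    """Modulate features with a style statistic pooled inside each region.

    feat, style: (B, C, H, W); parsing: (B, K, H, W) one-hot region masks.
    """
    out = torch.zeros_like(feat)
    for k in range(parsing.shape[1]):
        mask = parsing[:, k:k + 1]                                # (B,1,H,W)
        area = mask.sum(dim=(2, 3), keepdim=True).clamp(min=1.0)
        gamma = (style * mask).sum(dim=(2, 3), keepdim=True) / area
        out = out + mask * feat * (1.0 + gamma)                   # region-wise scale
    return out

feat = torch.randn(1, 64, 32, 32)
style = torch.randn(1, 64, 32, 32)
parsing = F.one_hot(torch.randint(0, 8, (1, 32, 32)), 8).permute(0, 3, 1, 2).float()
print(per_region_scale(feat, style, parsing).shape)  # torch.Size([1, 64, 32, 32])
```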
arXiv Detail & Related papers (2021-03-06T04:32:06Z)
- LGVTON: A Landmark Guided Approach to Virtual Try-On [4.617329011921226]
Given images of two people, a person and a model, LGVTON generates a rendition of the person wearing the model's clothes.
This is useful because most e-commerce websites do not provide standalone images of the clothes themselves.
arXiv Detail & Related papers (2020-04-01T16:49:57Z)
- Towards Photo-Realistic Virtual Try-On by Adaptively Generating$\leftrightarrow$Preserving Image Content [85.24260811659094]
We propose a novel visual try-on network, the Adaptive Content Generating and Preserving Network (ACGPN).
ACGPN first predicts the semantic layout of the reference image that will be changed after try-on.
Second, a clothes warping module warps clothing images according to the generated semantic layout.
Third, an inpainting module for content fusion integrates all information (e.g., reference image, semantic layout, warped clothes) to adaptively produce each semantic part of the human body.
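The fusion step can be illustrated with a simple mask-guided composite; ACGPN's inpainting module is learned, so this hard compositing is only a simplification:

```python
import torch

def fuse(generated_body, warped_clothes, clothes_mask):
    """clothes_mask: (B, 1, H, W) in [0, 1], taken from the semantic layout;
    the clothing region comes from the warped clothes, the rest from the
    generated body content."""
    return clothes_mask * warped_clothes + (1.0 - clothes_mask) * generated_body
```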
arXiv Detail & Related papers (2020-03-12T15:55:39Z)