SPG-VTON: Semantic Prediction Guidance for Multi-pose Virtual Try-on
- URL: http://arxiv.org/abs/2108.01578v1
- Date: Tue, 3 Aug 2021 15:40:50 GMT
- Title: SPG-VTON: Semantic Prediction Guidance for Multi-pose Virtual Try-on
- Authors: Bingwen Hu, Ping Liu, Zhedong Zheng, and Mingwu Ren
- Abstract summary: Image-based virtual try-on is challenging because target in-shop clothes must be fitted onto a reference person under diverse human poses.
We propose an end-to-end Semantic Prediction Guidance multi-pose Virtual Try-On Network (SPG-VTON).
We evaluate the proposed method on the largest multi-pose dataset (MPV) and the DeepFashion dataset.
- Score: 27.870740623131816
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-based virtual try-on is challenging because target in-shop
clothes must be fitted onto a reference person under diverse human poses.
Previous works focus on preserving clothing details (e.g., texture, logos,
patterns) when transferring the desired clothes onto a target person under a
fixed pose. However, the performance of existing methods drops significantly
when they are extended to multi-pose virtual try-on. In this paper, we propose
an end-to-end Semantic Prediction Guidance multi-pose Virtual Try-On Network
(SPG-VTON), which can fit the desired clothing onto a reference person under
arbitrary poses. Concretely, SPG-VTON is composed of three sub-modules. First,
a Semantic Prediction Module (SPM) generates the desired semantic map. The
predicted semantic map provides richer guidance for locating the desired
clothes region and producing a coarse try-on image. Second, a Clothes Warping
Module (CWM) warps the in-shop clothes to the desired shape according to the
predicted semantic map and the desired pose. Specifically, we introduce a
conductible cycle consistency loss to alleviate misalignment in the
clothes-warping process. Third, a Try-on Synthesis Module (TSM) combines the
coarse result and the warped clothes to generate the final virtual try-on
image, preserving the details of the desired clothes under the desired pose.
In addition, we introduce a face identity loss to refine the facial appearance
while maintaining the identity of the person in the final result. We evaluate
the proposed method on the largest multi-pose dataset (MPV) and the
DeepFashion dataset. Qualitative and quantitative experiments show that
SPG-VTON is superior to state-of-the-art methods and robust to data noise,
including background and accessory changes (e.g., hats and handbags),
demonstrating good scalability to real-world scenarios.
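To make the three-module flow concrete, below is a minimal PyTorch-style sketch of the data flow the abstract describes. The layer counts, channel widths, 18-channel pose-heatmap input, and 20-class semantic map are illustrative assumptions, not the authors' exact architecture.

```python
# Hedged sketch of the SPM -> CWM -> TSM data flow; module internals are
# placeholders, not SPG-VTON's published layer configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SemanticPredictionModule(nn.Module):
    """SPM: predicts the target-pose semantic map and a coarse try-on image."""
    def __init__(self, in_ch, num_classes=20):
        super().__init__()
        self.backbone = nn.Sequential(conv_block(in_ch, 64), conv_block(64, 64))
        self.sem_head = nn.Conv2d(64, num_classes, 1)
        self.coarse_head = nn.Sequential(nn.Conv2d(64, 3, 1), nn.Tanh())

    def forward(self, person, clothes, target_pose):
        feat = self.backbone(torch.cat([person, clothes, target_pose], dim=1))
        return self.sem_head(feat), self.coarse_head(feat)

class ClothesWarpingModule(nn.Module):
    """CWM: regresses a dense offset field and warps the in-shop clothes."""
    def __init__(self, in_ch):
        super().__init__()
        self.flow = nn.Sequential(conv_block(in_ch, 64), nn.Conv2d(64, 2, 1))

    def forward(self, clothes, semantic_map, target_pose):
        x = torch.cat([clothes, semantic_map, target_pose], dim=1)
        offsets = self.flow(x).permute(0, 2, 3, 1)            # (B, H, W, 2)
        _, _, h, w = clothes.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack([xs, ys], dim=-1).to(clothes.device)
        grid = base.unsqueeze(0) + offsets                    # sampling grid
        return F.grid_sample(clothes, grid, align_corners=True)

class TryOnSynthesisModule(nn.Module):
    """TSM: fuses the coarse result and the warped clothes into the output."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(conv_block(in_ch, 64),
                                 nn.Conv2d(64, 3, 1), nn.Tanh())

    def forward(self, coarse, warped_clothes):
        return self.net(torch.cat([coarse, warped_clothes], dim=1))

def spg_vton_forward(spm, cwm, tsm, person, clothes, target_pose):
    sem_logits, coarse = spm(person, clothes, target_pose)
    sem_map = sem_logits.softmax(dim=1)       # soft semantic map as guidance
    warped = cwm(clothes, sem_map, target_pose)
    return tsm(coarse, warped)

# Smoke test with assumed shapes: 18 pose-heatmap channels, 20 parsing classes.
spm = SemanticPredictionModule(in_ch=3 + 3 + 18)
cwm = ClothesWarpingModule(in_ch=3 + 20 + 18)
tsm = TryOnSynthesisModule(in_ch=3 + 3)
out = spg_vton_forward(spm, cwm, tsm,
                       torch.randn(1, 3, 256, 192),   # reference person
                       torch.randn(1, 3, 256, 192),   # in-shop clothes
                       torch.randn(1, 18, 256, 192))  # target-pose heatmaps
print(out.shape)  # torch.Size([1, 3, 256, 192])
```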
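The two auxiliary losses can be sketched in the same hedged spirit. The exact "conductible" formulation of the cycle consistency loss and the authors' choice of face encoder are not reproduced here; `warp_fwd`, `warp_bwd`, and `face_encoder` are hypothetical stand-ins.

```python
import torch.nn.functional as F

def cycle_consistency_loss(clothes, warp_fwd, warp_bwd):
    """Warp the clothes toward the target pose and back, then penalize the
    round-trip error. warp_fwd / warp_bwd are callables (e.g., the warping
    module run in each direction); this is generic cycle consistency, not
    necessarily the paper's exact "conductible" variant."""
    recovered = warp_bwd(warp_fwd(clothes))
    return F.l1_loss(recovered, clothes)

def face_identity_loss(result, target, face_mask, face_encoder):
    """Compare embeddings of the masked face regions of the try-on result and
    the ground truth. face_encoder is a hypothetical stand-in for a
    pretrained face-embedding network."""
    return F.l1_loss(face_encoder(result * face_mask),
                     face_encoder(target * face_mask))
```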
Related papers
- IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 addresses a virtual dressing task: generating freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from a VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
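A rough sketch of how such a hybrid attention block might look, assuming token-shaped features and a simple residual sum as the fusion rule (both assumptions, not IMAGDressing-v1's actual implementation):

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Frozen self-attention plus a trainable cross-attention branch that
    injects garment features (an illustrative reading of the summary above)."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        for p in self.self_attn.parameters():
            p.requires_grad = False            # frozen branch
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x, garment_feats):
        # x: (B, N, dim) denoising-UNet tokens; garment_feats: (B, M, dim)
        h, _ = self.self_attn(x, x, x)
        g, _ = self.cross_attn(x, garment_feats, garment_feats)
        return x + h + g                       # residual fusion of both branches

out = HybridAttention(320)(torch.randn(2, 64, 320), torch.randn(2, 77, 320))
print(out.shape)  # torch.Size([2, 64, 320])
```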
arXiv Detail & Related papers (2024-07-17T16:26:30Z)
- MV-VTON: Multi-View Virtual Try-On with Diffusion Models [91.71150387151042]
The goal of image-based virtual try-on is to generate an image of the target person naturally wearing the given clothing.
Existing methods focus solely on frontal try-on using frontal clothing.
We introduce Multi-View Virtual Try-ON (MV-VTON), which aims to reconstruct the dressing results from multiple views using the given clothes.
arXiv Detail & Related papers (2024-04-26T12:27:57Z)
- Arbitrary Virtual Try-On Network: Characteristics Preservation and Trade-off between Body and Clothing [85.74977256940855]
We propose an Arbitrary Virtual Try-On Network (AVTON) for all-type clothes.
AVTON can synthesize realistic try-on images by preserving and trading off characteristics of the target clothes and the reference person.
Our approach can achieve better performance compared with the state-of-the-art virtual try-on methods.
arXiv Detail & Related papers (2021-11-24T08:59:56Z)
- Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN [66.3650689395967]
We propose a texture-preserving end-to-end network, the PAtch-routed SpaTially-Adaptive GAN (PASTA-GAN), that facilitates real-world unpaired virtual try-on.
To disentangle the style and spatial information of each garment, PASTA-GAN introduces an innovative patch-routed disentanglement module.
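As a toy illustration of the patch idea (the pose-driven routing itself is omitted, and all shapes are assumed), the garment can be decomposed into location-free patch descriptors:

```python
import torch
import torch.nn.functional as F

def garment_to_patches(garment, patch=16):
    """Split a garment image into non-overlapping patches.

    Returns (B, num_patches, C * patch * patch): each row holds one patch's
    content ("style"), now decoupled from where the patch sat in the image.
    """
    cols = F.unfold(garment, kernel_size=patch, stride=patch)  # (B, C*p*p, L)
    return cols.transpose(1, 2)

patches = garment_to_patches(torch.randn(1, 3, 256, 192))
print(patches.shape)  # torch.Size([1, 192, 768])
```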
arXiv Detail & Related papers (2021-11-20T08:36:12Z)
- Shape Controllable Virtual Try-on for Underwear Models [0.0]
We propose a Shape Controllable Virtual Try-On Network (SC-VTON) to dress clothing for underwear models.
SC-VTON integrates model and clothing information to generate a warped clothing image.
Our method can generate high-resolution results with detailed textures.
arXiv Detail & Related papers (2021-07-28T04:01:01Z)
- PISE: Person Image Synthesis and Editing with Decoupled GAN [64.70360318367943]
We propose PISE, a novel two-stage generative model for Person Image Synthesis and Editing.
For human pose transfer, we first synthesize a human parsing map aligned with the target pose to represent the shape of clothing.
To decouple the shape and style of clothing, we propose joint global and local per-region encoding and normalization.
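A minimal sketch of per-region modulation, assuming one-hot parsing masks and a simple region-pooled scaling (PISE's actual encoding and normalization scheme is more elaborate):

```python
import torch
import torch.nn.functional as F

def per_region_scale(feat, style, parsing):
    """Modulate features with a style statistic pooled inside each region.

    feat, style: (B, C, H, W); parsing: (B, K, H, W) one-hot region masks.
    """
    out = torch.zeros_like(feat)
    for k in range(parsing.shape[1]):
        mask = parsing[:, k:k + 1]                                # (B,1,H,W)
        area = mask.sum(dim=(2, 3), keepdim=True).clamp(min=1.0)
        gamma = (style * mask).sum(dim=(2, 3), keepdim=True) / area
        out = out + mask * feat * (1.0 + gamma)                   # region-wise scale
    return out

feat = torch.randn(1, 64, 32, 32)
style = torch.randn(1, 64, 32, 32)
parsing = F.one_hot(torch.randint(0, 8, (1, 32, 32)), 8).permute(0, 3, 1, 2).float()
print(per_region_scale(feat, style, parsing).shape)  # torch.Size([1, 64, 32, 32])
```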
arXiv Detail & Related papers (2021-03-06T04:32:06Z)
- LGVTON: A Landmark Guided Approach to Virtual Try-On [4.617329011921226]
Given images of two people, a person and a model, LGVTON generates a rendition of the person wearing the model's clothes.
This is useful because most e-commerce websites do not provide standalone images of the clothes themselves.
arXiv Detail & Related papers (2020-04-01T16:49:57Z)
- Towards Photo-Realistic Virtual Try-On by Adaptively Generating$\leftrightarrow$Preserving Image Content [85.24260811659094]
We propose a novel visual try-on network, the Adaptive Content Generating and Preserving Network (ACGPN).
ACGPN first predicts the semantic layout of the reference image that will be changed after try-on.
Second, a clothes warping module warps clothing images according to the generated semantic layout.
Third, an inpainting module for content fusion integrates all information (e.g., reference image, semantic layout, warped clothes) to adaptively produce each semantic part of the human body.
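The fusion step can be illustrated with a simple mask-guided composite; ACGPN's inpainting module is learned, so this hard compositing is only a simplification:

```python
import torch

def fuse(generated_body, warped_clothes, clothes_mask):
    """clothes_mask: (B, 1, H, W) in [0, 1], taken from the semantic layout;
    the clothing region comes from the warped clothes, the rest from the
    generated body content."""
    return clothes_mask * warped_clothes + (1.0 - clothes_mask) * generated_body
```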
arXiv Detail & Related papers (2020-03-12T15:55:39Z)