Related papers: A Two-stage Personalized Virtual Try-on Framework with Shape Control and Texture Guidance

URL: http://arxiv.org/abs/2312.15480v1
Date: Sun, 24 Dec 2023 13:32:55 GMT
Title: A Two-stage Personalized Virtual Try-on Framework with Shape Control and Texture Guidance
Authors: Shufang Zhang, Minxue Ni, Lei Wang, Wenxin Ding, Shuai Chen, Yuhong Liu
Abstract summary: This paper proposes a brand new personalized virtual try-on model (PE-VITON), which uses two stages (shape control and texture guidance) to decouple the clothing attributes. The proposed model can effectively solve the problems of weak reduction of clothing folds, poor generation under complex human postures, blurred clothing edges, and unclear texture styles in traditional try-on methods.
Score: 7.302929117437442
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The diffusion model has a strong ability to generate wild images. However, the model can only generate inaccurate images with the guidance of text, which makes it very challenging to directly apply the text-guided generative model to virtual try-on scenarios. Taking images as guiding conditions of the diffusion model, this paper proposes a brand new personalized virtual try-on model (PE-VITON), which uses two stages (shape control and texture guidance) to decouple the clothing attributes. Specifically, the proposed model adaptively matches the clothing to human body parts through the Shape Control Module (SCM) to mitigate the misalignment of the clothing and the human body parts. The semantic information of the input clothing is parsed by the Texture Guided Module (TGM), and the corresponding texture is generated by directional guidance. Therefore, this model can effectively solve the problems of weak reduction of clothing folds, poor generation under complex human postures, blurred clothing edges, and unclear texture styles in traditional try-on methods. Meanwhile, the model can automatically enhance the generated clothing folds and textures according to the human posture, improving the authenticity of virtual try-on. Qualitative and quantitative experiments are carried out on high-resolution paired and unpaired datasets; the results show that the proposed model outperforms the state-of-the-art model.

Related papers

DressWild: Feed-Forward Pose-Agnostic Garment Sewing Pattern Generation from In-the-Wild Images [50.11081091174558]
This paper focuses on sewing pattern generation for garment modeling and fabrication applications. We propose DressWild, a novel feed-forward pipeline that reconstructs physics-consistent 2D sewing patterns and the corresponding 3D garments from a single in-the-wild image.
arXiv Detail & Related papers (2026-02-18T14:45:15Z)
VITON-DRR: Details Retention Virtual Try-on via Non-rigid Registration [5.465426769865638]
This paper proposes a detail retention virtual try-on method via accurate non-rigid registration (VITON-DRR) for diverse human poses. Specifically, we reconstruct a human semantic segmentation using a dual-pyramid-structured feature extractor. Then, a novel Deformation Module is designed for extracting the cloth key points and warping them through an accurate non-rigid registration algorithm.
arXiv Detail & Related papers (2025-05-29T13:38:21Z)
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling [20.072689146353348]
We introduce a garment extraction model that generates (human, synthetic garment) pairs from a single image of a clothed individual. We also propose an Error-Aware Refinement-based Schrödinger Bridge (EARSB) that surgically targets localized generation errors. In user studies, our model is preferred by the users in an average of 59% of cases.
arXiv Detail & Related papers (2025-01-08T18:25:50Z)
DRDM: A Disentangled Representations Diffusion Model for Synthesizing Realistic Person Images [9.768951663960257]
We propose a Disentangled Representations Diffusion Model (DRDM) to generate photo-realistic images from source portraits. First, a pose encoder is responsible for encoding pose features into a high-dimensional space to guide the generation of person images. Second, a body-part subspace decoupling block (BSDB) disentangles features from the different body parts of a source figure and feeds them to the various layers of the noise prediction block.
arXiv Detail & Related papers (2024-12-25T06:36:24Z)
FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models [14.596090302381647]
This paper studies photorealism enhancement of rendered images, leveraging generative power from diffusion models on the controlled basis of rendering. We introduce a novel framework to translate rendered images into their realistic counterparts, which consists of two stages: Domain Knowledge Injection (DKI) and Realistic Image Generation (RIG).
arXiv Detail & Related papers (2024-10-18T12:48:22Z)
Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On [29.217423805933727]
Diffusion model-based approaches have recently become popular, as they are excellent at image synthesis tasks. We propose a Texture-Preserving Diffusion (TPD) model for virtual try-on, which enhances the fidelity of the results. Second, we propose a novel diffusion-based method that predicts a precise inpainting mask based on the person and reference garment images.
arXiv Detail & Related papers (2024-04-01T12:43:22Z)
Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment. We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images. We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z)
StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On [35.227896906556026]
Given a clothing image and a person image, an image-based virtual try-on aims to generate a customized image that appears natural and accurately reflects the characteristics of the clothing image. In this work, we aim to expand the applicability of the pre-trained diffusion model so that it can be utilized independently for the virtual try-on task. Our proposed zero cross-attention blocks not only preserve the clothing details by learning the semantic correspondence but also generate high-fidelity images by utilizing the inherent knowledge of the pre-trained model in the warping process.
arXiv Detail & Related papers (2023-12-04T08:27:59Z)
Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis. Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
Weakly Supervised High-Fidelity Clothing Model Generation [67.32235668920192]
We propose a cheap yet scalable weakly-supervised method called Deep Generative Projection (DGP) to address this specific scenario. We show that projecting the rough alignment of clothing and body onto the StyleGAN space can yield photo-realistic wearing results.
arXiv Detail & Related papers (2021-12-14T07:15:15Z)
Shape Controllable Virtual Try-on for Underwear Models [0.0]
We propose a Shape Controllable Virtual Try-On Network (SC-VTON) to dress underwear models in clothing. SC-VTON integrates information about the model and the clothing to generate a warped clothing image. Our method can generate high-resolution results with detailed textures.
arXiv Detail & Related papers (2021-07-28T04:01:01Z)
Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes. We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters. We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
PISE: Person Image Synthesis and Editing with Decoupled GAN [64.70360318367943]
We propose PISE, a novel two-stage generative model for Person Image Synthesis and Editing. For human pose transfer, we first synthesize a human parsing map aligned with the target pose to represent the shape of clothing. To decouple the shape and style of clothing, we propose joint global and local per-region encoding and normalization.
arXiv Detail & Related papers (2021-03-06T04:32:06Z)
Neural 3D Clothes Retargeting from a Single Image [91.5030622330039]
We present a method of clothes retargeting: generating the potential poses and deformations of a given 3D clothing template model to fit onto a person in a single RGB image. The problem is fundamentally ill-posed, as attaining the ground truth data is impossible, i.e., images of people wearing the different 3D clothing template model at the exact same pose. We propose a semi-supervised learning framework that validates the physical plausibility of 3D deformation by matching with the prescribed body-to-cloth contact points and clothing to fit onto the unlabeled silhouette.
arXiv Detail & Related papers (2021-01-29T20:50:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.