PT-VTON: an Image-Based Virtual Try-On Network with Progressive Pose
Attention Transfer
- URL: http://arxiv.org/abs/2111.12167v1
- Date: Tue, 23 Nov 2021 21:51:08 GMT
- Title: PT-VTON: an Image-Based Virtual Try-On Network with Progressive Pose
Attention Transfer
- Authors: Hanhan Zhou, Tian Lan, Guru Venkataramani
- Abstract summary: PT-VTON is a pose-transfer-based framework for cloth transfer that enables virtual try-on with arbitrary poses.
PT-VTON can be applied in the fashion industry with minimal modification of existing systems.
- Score: 11.96427084717743
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The virtual try-on system has gained great attention due to its potential to
give customers a realistic, personalized product presentation in virtualized
settings. In this paper, we present PT-VTON, a novel pose-transfer-based
framework for cloth transfer that enables virtual try-on with arbitrary poses.
PT-VTON can be applied in the fashion industry with minimal modification of
existing systems while satisfying the overall visual fashionability and
detailed fabric appearance requirements. It enables efficient clothes transfer
between model and user images with arbitrary poses and body shapes.
We implement a prototype of PT-VTON and demonstrate that our system can match
or surpass many other approaches under drastic pose variations while
preserving detailed human and fabric appearance characteristics. PT-VTON is
shown to outperform alternative approaches on both machine-based quantitative
metrics and qualitative results.
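The title refers to progressive pose attention transfer, a technique in which cascaded attention blocks gradually move appearance features toward a target pose. The sketch below illustrates that general mechanism only; the block structure, channel sizes, and names are illustrative assumptions and are not taken from the PT-VTON paper.

```python
# Minimal PyTorch sketch of a pose-attentional transfer block, the kind of
# building block that "progressive pose attention transfer" generators cascade.
# Shapes, channel counts, and module names are illustrative assumptions.
import torch
import torch.nn as nn

class PoseAttentionBlock(nn.Module):
    """One stage: the pose stream gates the image stream with an attention mask."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.img_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.pose_conv = nn.Sequential(
            nn.Conv2d(channels * 2, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, img_feat, pose_feat):
        # The pose stream sees both modalities and emits a soft attention mask.
        attn = torch.sigmoid(self.pose_conv(torch.cat([img_feat, pose_feat], dim=1)))
        # The image stream is updated residually, gated by where the pose attends.
        img_feat = img_feat + attn * self.img_conv(img_feat)
        return img_feat, attn


# Progressively transfer appearance toward the target pose over several blocks.
blocks = nn.ModuleList([PoseAttentionBlock(64) for _ in range(4)])
img_feat = torch.randn(1, 64, 64, 48)    # encoded person/clothes features
pose_feat = torch.randn(1, 64, 64, 48)   # encoded source+target pose features
for blk in blocks:
    img_feat, attn = blk(img_feat, pose_feat)
```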
Related papers
- IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 addresses the virtual dressing task of generating freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
arXiv Detail & Related papers (2024-07-17T16:26:30Z)
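As a rough illustration of the hybrid attention module described for IMAGDressing-v1 (a frozen self-attention branch plus a trainable cross-attention branch that injects garment features into a frozen denoising UNet), the following sketch uses standard PyTorch attention layers; the dimensions, blending scheme, and class names are assumptions, not the authors' implementation.

```python
# Hedged sketch of a hybrid attention layer: frozen self-attention plus a
# trainable cross-attention over garment tokens. Sizes are illustrative only.
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    def __init__(self, dim: int = 320, heads: int = 8):
        super().__init__()
        # Self-attention weights would come from the pretrained denoising UNet
        # and stay frozen during training.
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        for p in self.self_attn.parameters():
            p.requires_grad_(False)
        # Cross-attention over garment tokens (e.g. from a garment UNet) is trainable.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.scale = nn.Parameter(torch.tensor(1.0))  # learnable blend weight

    def forward(self, x, garment_tokens):
        # x: (B, N, dim) latent tokens of the frozen denoising UNet.
        # garment_tokens: (B, M, dim) semantic/texture features of the garment.
        self_out, _ = self.self_attn(x, x, x)
        cross_out, _ = self.cross_attn(x, garment_tokens, garment_tokens)
        return x + self_out + self.scale * cross_out


layer = HybridAttention()
latents = torch.randn(2, 1024, 320)
garment = torch.randn(2, 256, 320)
out = layer(latents, garment)          # (2, 1024, 320)
```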
- Self-Supervised Vision Transformer for Enhanced Virtual Clothes Try-On [21.422611451978863]
We introduce an innovative approach for virtual clothes try-on, utilizing a self-supervised Vision Transformer (ViT) and a diffusion model.
Our method emphasizes detail enhancement by contrasting local clothing image embeddings, generated by ViT, with their global counterparts.
The experimental results showcase substantial advancements in the realism and precision of details in virtual try-on experiences.
arXiv Detail & Related papers (2024-06-15T07:46:22Z)
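The detail-enhancement idea summarized above, contrasting local clothing embeddings with their global counterparts, can be pictured as an InfoNCE-style loss between patch and image embeddings. The sketch below is only illustrative: the backbone, patch selection, and temperature are assumed, not taken from the paper.

```python
# Illustrative local-vs-global contrastive loss over clothing embeddings.
# Random tensors stand in for ViT patch and image features.
import torch
import torch.nn.functional as F

def local_global_contrast(local_emb, global_emb, temperature: float = 0.07):
    """local_emb: (B, P, D) patch embeddings; global_emb: (B, D) image embedding.

    Each image's patches are pulled toward its own global embedding and pushed
    away from the global embeddings of other images in the batch.
    """
    B, P, D = local_emb.shape
    local = F.normalize(local_emb.reshape(B * P, D), dim=-1)
    glob = F.normalize(global_emb, dim=-1)               # (B, D)
    logits = local @ glob.t() / temperature              # (B*P, B)
    targets = torch.arange(B).repeat_interleave(P)       # patch i belongs to image i
    return F.cross_entropy(logits, targets)


loss = local_global_contrast(torch.randn(4, 16, 384), torch.randn(4, 384))
print(float(loss))
```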
- AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario [50.62711489896909]
AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large gap.
AnyFit's impressive performance on high-fidelity virtual try-ons in any scenario from any image, paves a new path for future research within the fashion community.
arXiv Detail & Related papers (2024-05-28T13:33:08Z)
- VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation [79.99551055245071]
We propose VividPose, an end-to-end pipeline that ensures superior temporal stability.
An identity-aware appearance controller integrates additional facial information without compromising other appearance details.
A geometry-aware pose controller utilizes both dense rendering maps from SMPL-X and sparse skeleton maps.
VividPose exhibits superior generalization capabilities on our proposed in-the-wild dataset.
arXiv Detail & Related papers (2024-05-28T13:18:32Z)
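The geometry-aware pose controller summarized above combines dense SMPL-X rendering maps with sparse skeleton maps. A minimal sketch of such a two-stream controller might look as follows; every layer size and the fusion scheme are assumptions rather than VividPose's actual design.

```python
# Rough two-stream pose controller: one encoder for dense rendered SMPL-X maps,
# one for sparse skeleton maps, fused into conditioning features. Illustrative only.
import torch
import torch.nn as nn

class PoseController(nn.Module):
    def __init__(self, out_ch: int = 64):
        super().__init__()
        def encoder(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.SiLU(),
                nn.Conv2d(32, out_ch, 3, stride=2, padding=1), nn.SiLU(),
            )
        self.dense_enc = encoder(3)    # rendered SMPL-X map (dense, shape-aware)
        self.sparse_enc = encoder(3)   # skeleton/keypoint map (sparse, motion-aware)
        self.fuse = nn.Conv2d(out_ch * 2, out_ch, 1)

    def forward(self, dense_map, skeleton_map):
        feats = torch.cat([self.dense_enc(dense_map), self.sparse_enc(skeleton_map)], dim=1)
        return self.fuse(feats)        # conditioning features for a denoiser


ctrl = PoseController()
cond = ctrl(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256))  # (1, 64, 64, 64)
```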
- PFDM: Parser-Free Virtual Try-on via Diffusion Model [28.202996582963184]
We propose a parser-free virtual try-on method based on a diffusion model (PFDM).
Given two images, PFDM can "wear" garments on the target person seamlessly by implicit warping without any other information.
Experiments demonstrate that our proposed PFDM can successfully handle complex images, and outperform both state-of-the-art parser-free and parser-based models.
arXiv Detail & Related papers (2024-02-05T14:32:57Z)
- C-VTON: Context-Driven Image-Based Virtual Try-On Network [1.0832844764942349]
We propose a Context-Driven Virtual Try-On Network (C-VTON) that convincingly transfers selected clothing items to the target subjects.
At the core of the C-VTON pipeline are: (i) a geometric matching procedure that efficiently aligns the target clothing with the pose of the person in the input images, and (ii) a powerful image generator that utilizes various types of contextual information when synthesizing the final try-on result.
arXiv Detail & Related papers (2022-12-08T17:56:34Z)
- Single Stage Multi-Pose Virtual Try-On [119.95115739956661]
Multi-pose virtual try-on (MPVTON) aims to fit a target garment onto a person at a target pose.
MPVTON provides a better try-on experience, but is also more challenging due to the dual garment and pose editing objectives.
Existing methods adopt a pipeline comprising three disjoint modules including a target semantic layout prediction module, a coarse try-on image generator and a refinement try-on image generator.
In this paper, we propose a novel single stage model for MPVTON. Key to our model is a parallel flow estimation module that predicts the flow fields for both person and garment images conditioned on the target pose.
arXiv Detail & Related papers (2022-11-19T15:02:11Z)
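Flow-based warping of the kind used by the parallel flow estimation module above (and by many try-on pipelines) predicts a per-pixel offset field and resamples the garment image with it. The sketch below shows that generic operation with `grid_sample`; the network, inputs, and sizes are illustrative assumptions.

```python
# Generic flow-field warping: a small network predicts (dx, dy) offsets and
# grid_sample warps the garment accordingly. All sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlowWarper(nn.Module):
    def __init__(self, in_ch: int = 6):
        super().__init__()
        # Predict a 2-channel flow field from concatenated conditions,
        # e.g. the garment image and a target-pose map.
        self.flow_net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 2, 3, padding=1),
        )

    def forward(self, garment, condition):
        b, _, h, w = garment.shape
        flow = self.flow_net(torch.cat([garment, condition], dim=1))  # (B, 2, H, W)
        # Build a normalized identity sampling grid in [-1, 1] and add the offsets.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        grid = torch.stack([xs, ys], dim=-1).expand(b, h, w, 2)
        grid = grid + flow.permute(0, 2, 3, 1)
        return F.grid_sample(garment, grid, align_corners=True)


warper = FlowWarper()
garment = torch.randn(1, 3, 256, 192)      # garment image
pose_map = torch.randn(1, 3, 256, 192)     # stand-in for a target-pose condition
warped = warper(garment, pose_map)         # (1, 3, 256, 192)
```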
- Drivable Volumetric Avatars using Texel-Aligned Features [52.89305658071045]
Photorealistic telepresence requires both high-fidelity body modeling and faithful driving to enable dynamically synthesized appearance.
We propose an end-to-end framework that addresses two core challenges in modeling and driving full-body avatars of real people.
arXiv Detail & Related papers (2022-07-20T09:28:16Z)
- Style and Pose Control for Image Synthesis of Humans from a Single Monocular View [78.6284090004218]
StylePoseGAN extends a non-controllable generator to accept conditioning of pose and appearance separately.
Our network can be trained in a fully supervised way with human images to disentangle pose, appearance and body parts.
StylePoseGAN achieves state-of-the-art image generation fidelity on common perceptual metrics.
arXiv Detail & Related papers (2021-02-22T18:50:47Z)