Cloth Interactive Transformer for Virtual Try-On
- URL: http://arxiv.org/abs/2104.05519v2
- Date: Sun, 20 Aug 2023 18:53:52 GMT
- Title: Cloth Interactive Transformer for Virtual Try-On
- Authors: Bin Ren, Hao Tang, Fanyang Meng, Runwei Ding, Philip H.S. Torr, Nicu Sebe
- Abstract summary: We propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task.
In the first stage, we design a CIT matching block, aiming to precisely capture the long-range correlations between the cloth-agnostic person information and the in-shop cloth information.
In the second stage, we put forth a CIT reasoning block for establishing global mutual interactive dependencies among person representation, the warped clothing item, and the corresponding warped cloth mask.
- Score: 106.21605249649957
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The 2D image-based virtual try-on task has attracted increasing
interest from the multimedia and computer vision communities due to its
enormous commercial value. Nevertheless, most existing image-based virtual
try-on approaches directly combine the person-identity representation and the
in-shop clothing items without taking their mutual correlations into
consideration. Moreover, these methods are commonly built on pure
convolutional neural network (CNN) architectures, which struggle to capture
the long-range correlations among the input pixels and therefore tend to
produce inconsistent results. To alleviate these issues, in this paper we
propose a novel two-stage cloth interactive transformer (CIT) method for the
virtual try-on task. In the first stage, we design a CIT matching block that
precisely captures the long-range correlations between the cloth-agnostic
person information and the in-shop cloth information, making the warped
in-shop clothing items look more natural in appearance. In the second stage,
we put forth a CIT reasoning block that establishes global mutual interactive
dependencies among the person representation, the warped clothing item, and
the corresponding warped cloth mask; based on these mutual dependencies, the
final try-on results are more realistic. Extensive experiments on a public
fashion dataset show that the proposed CIT attains competitive virtual try-on
performance.
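As an illustration of the two-stage design described above, the minimal
PyTorch sketch below reads the CIT matching block as cross-attention from
cloth-agnostic person features to in-shop cloth features, and the CIT
reasoning block as self-attention over the concatenated person, warped-cloth,
and warped-mask features. The module names, feature dimensions, and the use
of nn.MultiheadAttention are assumptions made for exposition, not the
authors' released implementation.

```python
# Minimal sketch of one plausible reading of the two-stage CIT design.
# All names, dims, and the use of nn.MultiheadAttention are assumptions,
# not the authors' released implementation.
import torch
import torch.nn as nn


class CITMatchingBlock(nn.Module):
    """Stage 1 (assumed): person features attend to in-shop cloth features,
    capturing long-range person-cloth correlations before warping."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, person_feat, cloth_feat):
        # person_feat, cloth_feat: (B, N, dim) flattened spatial features
        attended, _ = self.cross_attn(person_feat, cloth_feat, cloth_feat)
        return self.norm(person_feat + attended)


class CITReasoningBlock(nn.Module):
    """Stage 2 (assumed): self-attention over the concatenated person,
    warped-cloth, and warped-mask features models their global mutual
    interactive dependencies."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, person_feat, warped_cloth_feat, warped_mask_feat):
        # Concatenate the three modalities along the token axis: (B, 3N, dim)
        tokens = torch.cat(
            [person_feat, warped_cloth_feat, warped_mask_feat], dim=1)
        attended, _ = self.self_attn(tokens, tokens, tokens)
        return self.norm(tokens + attended)


if __name__ == "__main__":
    B, N, dim = 2, 64, 256  # toy sizes: batch, tokens (8x8 grid), channels
    person = torch.randn(B, N, dim)
    cloth = torch.randn(B, N, dim)
    matched = CITMatchingBlock(dim)(person, cloth)  # (B, N, dim)
    # Placeholder tensors stand in for warped-cloth / mask features here.
    fused = CITReasoningBlock(dim)(matched, cloth, cloth)  # (B, 3N, dim)
    print(matched.shape, fused.shape)
```

In the actual pipeline, CNN feature maps would be flattened into such token
sequences and the fused tokens fed to the warping and try-on generators; the
toy shapes above only show how the two blocks compose.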
Related papers
- Hierarchical Cross-Attention Network for Virtual Try-On [59.50297858307268]
We present an innovative solution to the challenges of the virtual try-on task: the novel Hierarchical Cross-Attention Network (HCANet).
HCANet is crafted with two primary stages: geometric matching and try-on, each playing a crucial role in delivering realistic virtual try-on outcomes.
A key feature of HCANet is the incorporation of a novel Hierarchical Cross-Attention (HCA) block into both stages, enabling the effective capture of long-range correlations between individual and clothing modalities.
arXiv Detail & Related papers (2024-11-23T12:39:58Z)
- Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models [4.038493506169702]
This study emphasizes the challenges of preserving intricate texture details and distinctive features of the target person and the clothes in various scenarios.
Various existing approaches are explored, highlighting the limitations and unresolved aspects.
It then proposes a novel diffusion-based solution that addresses garment texture preservation and user identity retention during virtual try-on.
arXiv Detail & Related papers (2024-03-12T07:15:29Z)
- Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z)
- PFDM: Parser-Free Virtual Try-on via Diffusion Model [28.202996582963184]
We propose a parser-free virtual try-on method based on the diffusion model (PFDM).
Given two images, PFDM can "wear" garments on the target person seamlessly by warping them implicitly, without requiring any other information.
Experiments demonstrate that our proposed PFDM can successfully handle complex images, and outperform both state-of-the-art parser-free and parser-based models.
arXiv Detail & Related papers (2024-02-05T14:32:57Z)
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects.
In this paper, we propose to alleviate the impact of such an unbalanced distribution via Virtual Image Learning (VIL).
A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z)
- VIRT: Improving Representation-based Models for Text Matching through Virtual Interaction [50.986371459817256]
We propose a novel Virtual InteRacTion mechanism, termed VIRT, to enable full and deep interaction modeling in representation-based models.
VIRT asks representation-based encoders to conduct virtual interactions that mimic the behaviors of interaction-based models (see the sketch after this entry).
arXiv Detail & Related papers (2021-12-08T09:49:28Z)
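To make the virtual-interaction idea concrete, here is a heavily hedged
sketch: a bi-encoder student simulates a cross-sequence attention map from
its two independent encodings and is trained to match the attention of an
interaction-based (cross-encoder) teacher. The function names, the
dot-product attention simulation, and the KL distillation loss are
assumptions for illustration, not VIRT's published implementation.

```python
# Hypothetical sketch in the spirit of VIRT's virtual interaction.
# Names, the dot-product simulation, and the KL loss are all assumptions.
import torch
import torch.nn.functional as F


def virtual_interaction(x, y):
    """Simulate cross-sequence attention from independently encoded
    sequences x: (B, Lx, d) and y: (B, Ly, d); returns (B, Lx, Ly)."""
    return torch.softmax(x @ y.transpose(1, 2) / x.size(-1) ** 0.5, dim=-1)


def virt_distill_loss(student_x, student_y, teacher_attn):
    """KL distillation (assumed loss): align the student's simulated
    attention with the teacher cross-encoder's attention map."""
    sim = virtual_interaction(student_x, student_y)
    return F.kl_div(sim.clamp_min(1e-9).log(), teacher_attn,
                    reduction="batchmean")


if __name__ == "__main__":
    B, Lx, Ly, d = 4, 16, 24, 128  # toy sizes
    sx, sy = torch.randn(B, Lx, d), torch.randn(B, Ly, d)
    teacher = torch.softmax(torch.randn(B, Lx, Ly), dim=-1)
    print(virt_distill_loss(sx, sy, teacher))  # scalar loss
```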