C-VTON: Context-Driven Image-Based Virtual Try-On Network
- URL: http://arxiv.org/abs/2212.04437v1
- Date: Thu, 8 Dec 2022 17:56:34 GMT
- Title: C-VTON: Context-Driven Image-Based Virtual Try-On Network
- Authors: Benjamin Fele and Ajda Lampe and Peter Peer and Vitomir Štruc
- Abstract summary: We propose a Context-Driven Virtual Try-On Network (C-VTON) that convincingly transfers selected clothing items to the target subjects.
At the core of the C-VTON pipeline are: (i) a geometric matching procedure that efficiently aligns the target clothing with the pose of the person in the input images, and (ii) a powerful image generator that utilizes various types of contextual information when synthesizing the final try-on result.
- Score: 1.0832844764942349
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-based virtual try-on techniques have shown great promise for enhancing
the user experience and improving customer satisfaction on fashion-oriented
e-commerce platforms. However, existing techniques are currently still limited
in the quality of the try-on results they are able to produce from input images
of diverse characteristics. In this work, we propose a Context-Driven Virtual
Try-On Network (C-VTON) that addresses these limitations and convincingly
transfers selected clothing items to the target subjects even under challenging
pose configurations and in the presence of self-occlusions. At the core of the
C-VTON pipeline are: (i) a geometric matching procedure that efficiently aligns
the target clothing with the pose of the person in the input images, and (ii) a
powerful image generator that utilizes various types of contextual information
when synthesizing the final try-on result. C-VTON is evaluated in rigorous
experiments on the VITON and MPV datasets and in comparison to state-of-the-art
techniques from the literature. Experimental results show that the proposed
approach is able to produce photo-realistic and visually convincing results and
significantly improves on the existing state-of-the-art.
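The two-stage design described in the abstract lends itself to a compact illustration. Below is a minimal PyTorch sketch of such a pipeline; the affine warp (standing in for the paper's geometric matching procedure), the toy CNN generator, and all module names and sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a two-stage try-on pipeline in the spirit of C-VTON:
# (1) a geometric matcher regresses warp parameters to align the garment with
# the person, and (2) a generator conditioned on contextual inputs synthesizes
# the result. An affine warp stands in for the paper's matching procedure.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeometricMatcher(nn.Module):
    """Regresses a 2x3 affine matrix that warps the garment onto the person."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, 6)
        # Start from the identity transform so training begins with "no warp".
        self.fc.weight.data.zero_()
        self.fc.bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, person, cloth):
        theta = self.fc(self.encoder(torch.cat([person, cloth], 1)).flatten(1))
        grid = F.affine_grid(theta.view(-1, 2, 3), cloth.shape, align_corners=False)
        return F.grid_sample(cloth, grid, align_corners=False)

class ContextGenerator(nn.Module):
    """Synthesizes the try-on image from the person, warped cloth, and context
    (here a single-channel map standing in for e.g. body segmentation)."""
    def __init__(self, context_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6 + context_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, person, warped_cloth, context):
        return self.net(torch.cat([person, warped_cloth, context], 1))

person = torch.randn(1, 3, 256, 192)   # target subject
cloth = torch.randn(1, 3, 256, 192)    # in-shop garment
context = torch.randn(1, 1, 256, 192)  # contextual input (assumed)
warped = GeometricMatcher()(person, cloth)
result = ContextGenerator()(person, warped, context)
print(result.shape)  # torch.Size([1, 3, 256, 192])
```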
Related papers
- Beyond Imperfections: A Conditional Inpainting Approach for End-to-End Artifact Removal in VTON and Pose Transfer [2.990411348977783]
Artifacts often degrade the visual quality of virtual try-on (VTON) and pose transfer applications.
This study introduces a novel conditional inpainting technique designed to detect and remove such distortions.
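A minimal sketch of the general recipe, conditional inpainting with a masked reconstruction loss, is shown below; the tiny CNN, the synthetic artifact mask, and the loss weighting are illustrative assumptions rather than the paper's models.

```python
# Sketch: a network receives the corrupted image plus an artifact mask and is
# trained to reconstruct the clean image, weighting the masked region heavily.
import torch
import torch.nn as nn

inpainter = nn.Sequential(  # input: RGB image (3) + artifact mask (1)
    nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1),
)
opt = torch.optim.Adam(inpainter.parameters(), lr=2e-4)

clean = torch.rand(8, 3, 128, 128)                 # ground-truth images
mask = (torch.rand(8, 1, 128, 128) > 0.9).float()  # 1 = artifact region (assumed given)
corrupted = clean * (1 - mask) + torch.rand_like(clean) * mask

pred = inpainter(torch.cat([corrupted, mask], dim=1))
# L1 reconstruction, emphasized inside the artifact mask.
loss = ((pred - clean).abs() * mask).mean() + 0.1 * ((pred - clean).abs() * (1 - mask)).mean()
opt.zero_grad()
loss.backward()
opt.step()
```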
arXiv Detail & Related papers (2024-10-05T06:18:26Z)
- IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 addresses a virtual dressing task: generating freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from a VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
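The hybrid attention module described above can be illustrated with a short PyTorch sketch; the dimensions, the residual fusion, and the stand-in for a pretrained denoising-UNet layer are assumptions for illustration only.

```python
# Sketch of hybrid attention: a frozen self-attention branch (standing in for
# a pretrained denoising-UNet layer) plus a trainable cross-attention branch
# that injects garment features into the hidden states.
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    def __init__(self, dim=320, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        for p in self.self_attn.parameters():  # frozen, as in a pretrained UNet
            p.requires_grad = False
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)  # trainable

    def forward(self, hidden, garment_feats):
        # hidden: (B, N, dim) UNet tokens; garment_feats: (B, M, dim) garment tokens
        h, _ = self.self_attn(hidden, hidden, hidden)
        g, _ = self.cross_attn(hidden, garment_feats, garment_feats)
        return hidden + h + g  # residual fusion of both branches (assumed)

hidden = torch.randn(2, 64, 320)
garment = torch.randn(2, 77, 320)
out = HybridAttention()(hidden, garment)
print(out.shape)  # torch.Size([2, 64, 320])
```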
arXiv Detail & Related papers (2024-07-17T16:26:30Z)
- DiCTI: Diffusion-based Clothing Designer via Text-guided Input [5.275658744475251]
DiCTI (Diffusion-based Clothing Designer via Text-guided Input) allows designers to quickly visualize fashion-related ideas using text inputs only.
By leveraging a powerful diffusion-based inpainting model conditioned on text inputs, DiCTI is able to synthesize convincing, high-quality images with varied clothing designs.
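The underlying recipe, text-conditioned diffusion inpainting, can be sketched with the Hugging Face diffusers library as below; the checkpoint, file paths, and prompt are placeholders, and this is not the authors' released pipeline.

```python
# Sketch: redesign the clothing region of a photo from a text prompt using a
# generic text-conditioned inpainting model.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

person = Image.open("person.jpg").resize((512, 512))       # input photo (placeholder path)
clothing_mask = Image.open("mask.png").resize((512, 512))  # region to redesign

result = pipe(
    prompt="a red floral summer dress",  # the text-guided clothing description
    image=person,
    mask_image=clothing_mask,
).images[0]
result.save("design.png")
```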
arXiv Detail & Related papers (2024-07-04T12:48:36Z)
- Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
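The state-space idea can be illustrated with a toy linear recurrence over flattened image tokens; the diagonal parameterization, single scan direction, and Python-level loop are heavy simplifications of models like EVSSM.

```python
# Toy state-space scan: h_t = a * h_{t-1} + B x_t, y_t = C h_t, giving each
# token access to all preceding context at linear cost in sequence length.
import torch
import torch.nn as nn

class TinySSMScan(nn.Module):
    def __init__(self, dim=64, state=16):
        super().__init__()
        self.A = nn.Parameter(torch.rand(state) * -1.0)  # log-decay per state channel
        self.B = nn.Linear(dim, state, bias=False)
        self.C = nn.Linear(state, dim, bias=False)

    def forward(self, x):                  # x: (B, N, dim) flattened image tokens
        batch, n, _ = x.shape
        h = x.new_zeros(batch, self.A.shape[0])
        a = torch.exp(self.A)              # keeps the recurrence stable, 0 < a <= 1
        ys = []
        for t in range(n):                 # sequential scan (real models fuse this)
            h = a * h + self.B(x[:, t])
            ys.append(self.C(h))
        return torch.stack(ys, dim=1)

feats = torch.randn(2, 16 * 16, 64)        # 16x16 feature map flattened to a sequence
out = TinySSMScan()(feats)
print(out.shape)  # torch.Size([2, 256, 64])
```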
arXiv Detail & Related papers (2024-05-23T09:13:36Z)
- Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models [4.038493506169702]
This study emphasizes the challenges of preserving intricate texture details and distinctive features of the target person and the clothes in various scenarios.
The study reviews various existing approaches, highlighting their limitations and unresolved aspects.
It then proposes a novel diffusion-based solution that addresses garment texture preservation and user identity retention during virtual try-on.
arXiv Detail & Related papers (2024-03-12T07:15:29Z)
- Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects.
In this paper, we propose to alleviate the impact of such an unbalanced distribution via Virtual Image Learning (VIL).
A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z)
- PT-VTON: an Image-Based Virtual Try-On Network with Progressive Pose Attention Transfer [11.96427084717743]
PT-VTON is a pose-transfer-based framework for cloth transfer that enables virtual try-on with arbitrary poses.
PT-VTON can be applied to the fashion industry with minimal modification of existing systems.
arXiv Detail & Related papers (2021-11-23T21:51:08Z)
- Data Augmentation using Random Image Cropping for High-resolution Virtual Try-On (VITON-CROP) [18.347532903864597]
By integrating random crop augmentation, VITON-CROP synthesizes images more robustly than existing state-of-the-art virtual try-on models.
In the experiments, we demonstrate that VITON-CROP is superior to VITON-HD both qualitatively and quantitatively.
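The augmentation itself is simple to sketch: sample one crop window and apply it consistently to the person image and its paired conditioning inputs so the pair stays aligned. The sizes and the specific inputs cropped below are assumptions for illustration.

```python
# Sketch of paired random-crop augmentation for try-on training data.
import torch
import torchvision.transforms.functional as TF
from torchvision.transforms import RandomCrop

person = torch.rand(3, 1024, 768)  # high-resolution person image
parse = torch.rand(1, 1024, 768)   # paired conditioning input (e.g., a parsing map)

# Sample one crop window, then apply the same window to both tensors.
i, j, h, w = RandomCrop.get_params(person, output_size=(512, 384))
person_crop = TF.crop(person, i, j, h, w)
parse_crop = TF.crop(parse, i, j, h, w)  # identical window keeps the pair aligned
print(person_crop.shape, parse_crop.shape)
```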
arXiv Detail & Related papers (2021-11-16T07:40:16Z)
- Intriguing Properties of Vision Transformers [114.28522466830374]
Vision transformers (ViT) have demonstrated impressive performance across various machine vision problems.
We systematically study this question via an extensive set of experiments and comparisons with a high-performing convolutional neural network (CNN).
We show that the effective features of ViTs are due to the flexible and dynamic receptive fields made possible by the self-attention mechanism.
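The global, input-dependent receptive field comes from self-attention, where every patch token attends to every other token with weights computed from the input itself; a single-head illustration with arbitrary dimensions:

```python
# Sketch: single-head self-attention over image patches. The attention matrix
# spans all token pairs, so the receptive field is global and input-dependent.
import torch

tokens = torch.randn(1, 196, 64)  # 14x14 image patches as a token sequence
q = k = v = tokens                # single head, projections omitted for brevity
attn = torch.softmax(q @ k.transpose(-2, -1) / 64 ** 0.5, dim=-1)
out = attn @ v
print(attn.shape)  # (1, 196, 196): each patch can attend to all 196 patches
```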
arXiv Detail & Related papers (2021-05-21T17:59:18Z)
- IMAGINE: Image Synthesis by Image-Guided Model Inversion [79.4691654458141]
We introduce an inversion-based method, denoted as IMAge-Guided model INvErsion (IMAGINE), to generate high-quality and diverse images.
We leverage the knowledge of image semantics from a pre-trained classifier to achieve plausible generations.
IMAGINE enables the synthesis procedure to simultaneously 1) enforce semantic specificity constraints during the synthesis, 2) produce realistic images without generator training, and 3) give users intuitive control over the generation process.
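A minimal sketch of image-guided model inversion in this spirit: optimize the pixels of a synthetic image so that a frozen pretrained classifier's features match those of a guide image, with no generator training. The backbone choice, regularizer, and hyperparameters are illustrative assumptions.

```python
# Sketch: gradient-based inversion against a frozen classifier's features.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18, ResNet18_Weights

backbone = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
features = nn.Sequential(*list(backbone.children())[:-1])  # drop the final fc
for p in features.parameters():
    p.requires_grad = False

guide = torch.rand(1, 3, 224, 224)  # the guidance image (placeholder data)
with torch.no_grad():
    guide_feats = features(guide).flatten(1)

image = torch.randn(1, 3, 224, 224, requires_grad=True)  # pixels being optimized
opt = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    loss = F.mse_loss(features(image).flatten(1), guide_feats)
    # Total-variation term discourages high-frequency noise in the result.
    loss = loss + 1e-2 * ((image[..., 1:, :] - image[..., :-1, :]).abs().mean()
                          + (image[..., :, 1:] - image[..., :, :-1]).abs().mean())
    opt.zero_grad()
    loss.backward()
    opt.step()
```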
arXiv Detail & Related papers (2021-04-13T02:00:24Z)
- Cloth Interactive Transformer for Virtual Try-On [106.21605249649957]
We propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task.
In the first stage, we design a CIT matching block, aiming to precisely capture the long-range correlations between the cloth-agnostic person information and the in-shop cloth information.
In the second stage, we put forth a CIT reasoning block for establishing global mutual interactive dependencies among person representation, the warped clothing item, and the corresponding warped cloth mask.
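The first-stage matching idea can be sketched as mutual cross-attention between the two feature streams; the two-stream block below and its dimensions are illustrative assumptions, not the authors' exact CIT blocks.

```python
# Sketch: cross-attention lets cloth-agnostic person features and in-shop
# cloth features exchange long-range information before warping.
import torch
import torch.nn as nn

class InteractiveMatchingBlock(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.person_to_cloth = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cloth_to_person = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, person_feats, cloth_feats):
        # Each stream queries the other, capturing long-range correlations.
        p, _ = self.person_to_cloth(person_feats, cloth_feats, cloth_feats)
        c, _ = self.cloth_to_person(cloth_feats, person_feats, person_feats)
        return person_feats + p, cloth_feats + c

person_feats = torch.randn(2, 192, 256)  # flattened cloth-agnostic person tokens
cloth_feats = torch.randn(2, 192, 256)   # flattened in-shop cloth tokens
p_out, c_out = InteractiveMatchingBlock()(person_feats, cloth_feats)
print(p_out.shape, c_out.shape)
```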
arXiv Detail & Related papers (2021-04-12T14:45:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.