C-VTON: Context-Driven Image-Based Virtual Try-On Network
- URL: http://arxiv.org/abs/2212.04437v1
- Date: Thu, 8 Dec 2022 17:56:34 GMT
- Title: C-VTON: Context-Driven Image-Based Virtual Try-On Network
- Authors: Benjamin Fele and Ajda Lampe and Peter Peer and Vitomir Štruc
- Abstract summary: We propose a Context-Driven Virtual Try-On Network (C-VTON) that convincingly transfers selected clothing items to the target subjects.
At the core of the C-VTON pipeline are: (i) a geometric matching procedure that efficiently aligns the target clothing with the pose of the person in the input images, and (ii) a powerful image generator that utilizes various types of contextual information when synthesizing the final try-on result.
- Score: 1.0832844764942349
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-based virtual try-on techniques have shown great promise for enhancing
the user-experience and improving customer satisfaction on fashion-oriented
e-commerce platforms. However, existing techniques are currently still limited
in the quality of the try-on results they are able to produce from input images
of diverse characteristics. In this work, we propose a Context-Driven Virtual
Try-On Network (C-VTON) that addresses these limitations and convincingly
transfers selected clothing items to the target subjects even under challenging
pose configurations and in the presence of self-occlusions. At the core of the
C-VTON pipeline are: (i) a geometric matching procedure that efficiently aligns
the target clothing with the pose of the person in the input images, and (ii) a
powerful image generator that utilizes various types of contextual information
when synthesizing the final try-on result. C-VTON is evaluated in rigorous
experiments on the VITON and MPV datasets and in comparison to state-of-the-art
techniques from the literature. Experimental results show that the proposed
approach is able to produce photo-realistic and visually convincing results and
significantly improves on the existing state-of-the-art.
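To make the abstract's two-component design concrete, the following is a minimal, illustrative PyTorch-style sketch of such a pipeline: a geometric matcher that warps the in-shop clothing toward the person's pose, followed by a generator that fuses the warped clothing with contextual cues. All module names, channel counts, and tensor shapes here are assumptions for illustration and do not reproduce the authors' implementation.
```python
# Illustrative two-stage try-on sketch (assumed design, not the authors' code):
# (i) geometric matching that warps clothing to the person, (ii) a generator
# that combines the warped clothing with contextual information.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GeometricMatcher(nn.Module):
    """Regresses a dense offset field used to warp the in-shop clothing."""
    def __init__(self, in_ch=6):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 2, 3, padding=1),  # 2-channel (x, y) offset field
        )

    def forward(self, cloth, person_repr):
        offsets = self.encoder(torch.cat([cloth, person_repr], dim=1))
        offsets = F.interpolate(offsets, size=cloth.shape[-2:],
                                mode="bilinear", align_corners=False)
        # Sampling grid = identity grid + predicted offsets.
        n = cloth.size(0)
        theta = torch.eye(2, 3).unsqueeze(0).repeat(n, 1, 1).to(cloth.device)
        base = F.affine_grid(theta, cloth.size(), align_corners=False)
        grid = base + offsets.permute(0, 2, 3, 1)
        return F.grid_sample(cloth, grid, align_corners=False)


class ContextGenerator(nn.Module):
    """Synthesizes the try-on image from warped clothing plus context."""
    def __init__(self, in_ch=9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, warped_cloth, masked_person, context):
        return self.net(torch.cat([warped_cloth, masked_person, context], dim=1))


# Usage with dummy 256x192 inputs (shapes are assumptions).
cloth = torch.randn(1, 3, 256, 192)          # in-shop clothing image
person_repr = torch.randn(1, 3, 256, 192)    # cloth-agnostic person representation
masked_person = torch.randn(1, 3, 256, 192)  # person with clothing region removed
context = torch.randn(1, 3, 256, 192)        # e.g. pose / segmentation context

warped = GeometricMatcher()(cloth, person_repr)
try_on = ContextGenerator()(warped, masked_person, context)
```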
Related papers
- ITVTON: Virtual Try-On Diffusion Transformer Model Based on Integrated Image and Text [0.0]
We introduce ITVTON, a method that enhances clothing-character interactions by combining clothing and character images along spatial channels as inputs.
We incorporate integrated textual descriptions from multiple images to boost the realism of the generated visual effects.
In experiments, ITVTON outperforms baseline methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2025-01-28T07:24:15Z) - ODPG: Outfitting Diffusion with Pose Guided Condition [2.5602836891933074]
VTON technology allows users to visualize how clothes would look on them without physically trying them on.
Traditional VTON methods, often using Generative Adversarial Networks (GANs) and diffusion models, face challenges in achieving high realism and handling dynamic poses.
This paper introduces Outfitting Diffusion with Pose Guided Condition (ODPG), a novel approach that leverages a latent diffusion model with multiple conditioning inputs during the denoising process.
arXiv Detail & Related papers (2025-01-12T10:30:27Z) - TryOffAnyone: Tiled Cloth Generation from a Dressed Person [1.4732811715354452]
High-fidelity tiled garment images are essential for personalized recommendations, outfit composition, and virtual try-on systems.
We propose a novel approach utilizing a fine-tuned StableDiffusion model.
Our method features a streamlined single-stage network design, which integrates garment-specific masks to isolate and process target clothing items effectively.
arXiv Detail & Related papers (2024-12-11T17:41:53Z) - Beyond Imperfections: A Conditional Inpainting Approach for End-to-End Artifact Removal in VTON and Pose Transfer [2.990411348977783]
Artifacts often degrade the visual quality of virtual try-on (VTON) and pose transfer applications.
This study introduces a novel conditional inpainting technique designed to detect and remove such distortions.
arXiv Detail & Related papers (2024-10-05T06:18:26Z) - IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 addresses the virtual dressing task of generating freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
arXiv Detail & Related papers (2024-07-17T16:26:30Z) - Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z) - Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects.
In this paper, we propose to alleviate the impact of such an unbalanced distribution via Virtual Image Learning (VIL).
A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z) - PT-VTON: an Image-Based Virtual Try-On Network with Progressive Pose Attention Transfer [11.96427084717743]
PT-VTON is a pose-transfer-based framework for cloth transfer that enables virtual try-on with arbitrary poses.
PT-VTON can be applied in the fashion industry with minimal modification of existing systems.
arXiv Detail & Related papers (2021-11-23T21:51:08Z) - Intriguing Properties of Vision Transformers [114.28522466830374]
Vision transformers (ViT) have demonstrated impressive performance across various machine vision problems.
We systematically study this question via an extensive set of experiments and comparisons with a high-performing convolutional neural network (CNN).
We show that the effective features of ViTs are due to flexible and dynamic receptive fields made possible by the self-attention mechanism.
arXiv Detail & Related papers (2021-05-21T17:59:18Z) - IMAGINE: Image Synthesis by Image-Guided Model Inversion [79.4691654458141]
We introduce an inversion-based method, denoted as IMAge-Guided model INvErsion (IMAGINE), to generate high-quality and diverse images.
We leverage the knowledge of image semantics from a pre-trained classifier to achieve plausible generations.
IMAGINE enables the synthesis procedure to simultaneously 1) enforce semantic specificity constraints during the synthesis, 2) produce realistic images without generator training, and 3) give users intuitive control over the generation process.
arXiv Detail & Related papers (2021-04-13T02:00:24Z) - Cloth Interactive Transformer for Virtual Try-On [106.21605249649957]
We propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task.
In the first stage, we design a CIT matching block, aiming to precisely capture the long-range correlations between the cloth-agnostic person information and the in-shop cloth information.
In the second stage, we put forth a CIT reasoning block for establishing global mutual interactive dependencies among person representation, the warped clothing item, and the corresponding warped cloth mask.
arXiv Detail & Related papers (2021-04-12T14:45:32Z)
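For the cloth interactive transformer (CIT) summarized in the entry directly above, the sketch below illustrates, under stated assumptions, how long-range correlations between a cloth-agnostic person representation and in-shop clothing features can be captured with cross-attention. The module name and shapes are hypothetical; this is not the authors' CIT implementation, which additionally includes a second reasoning stage over the person representation, the warped clothing item, and its mask.
```python
# Minimal cross-attention sketch in the spirit of the CIT matching block:
# person features attend to clothing features to capture long-range
# person-cloth correlations. Names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class CrossModalBlock(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, person_feat, cloth_feat):
        # person_feat, cloth_feat: (N, C, H, W) feature maps from any backbone.
        n, c, h, w = person_feat.shape
        q = person_feat.flatten(2).transpose(1, 2)   # (N, H*W, C) queries
        kv = cloth_feat.flatten(2).transpose(1, 2)   # (N, H*W, C) keys/values
        fused, _ = self.attn(q, kv, kv)              # person attends to cloth
        fused = self.norm(fused + q)                 # residual connection
        return fused.transpose(1, 2).reshape(n, c, h, w)


# Usage with dummy 16x12 feature maps of 128 channels.
person_feat = torch.randn(2, 128, 16, 12)
cloth_feat = torch.randn(2, 128, 16, 12)
correlated = CrossModalBlock()(person_feat, cloth_feat)  # (2, 128, 16, 12)
```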
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.