Related papers: Single Stage Warped Cloth Learning and Semantic-Contextual Attention Feature Fusion for Virtual TryOn

Single Stage Warped Cloth Learning and Semantic-Contextual Attention Feature Fusion for Virtual TryOn

URL: http://arxiv.org/abs/2310.05024v2
Date: Sat, 25 May 2024 14:22:51 GMT
Title: Single Stage Warped Cloth Learning and Semantic-Contextual Attention Feature Fusion for Virtual TryOn
Authors: Sanhita Pathak, Vinay Kaushik, Brejesh Lall,
Abstract summary: Image-based virtual try-on aims to fit an in-shop garment onto a clothed person image. Garment warping, which aligns the target garment with the corresponding body parts in the person image, is a crucial step in achieving this goal. We propose a novel single-stage framework that implicitly learns the same without explicit multi-stage learning.
Score: 5.790630195329777
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Image-based virtual try-on aims to fit an in-shop garment onto a clothed person image. Garment warping, which aligns the target garment with the corresponding body parts in the person image, is a crucial step in achieving this goal. Existing methods often use multi-stage frameworks to handle clothes warping, person body synthesis and tryon generation separately or rely on noisy intermediate parser-based labels. We propose a novel single-stage framework that implicitly learns the same without explicit multi-stage learning. Our approach utilizes a novel semantic-contextual fusion attention module for garment-person feature fusion, enabling efficient and realistic cloth warping and body synthesis from target pose keypoints. By introducing a lightweight linear attention framework that attends to garment regions and fuses multiple sampled flow fields, we also address misalignment and artifacts present in previous methods. To achieve simultaneous learning of warped garment and try-on results, we introduce a Warped Cloth Learning Module. Our proposed approach significantly improves the quality and efficiency of virtual try-on methods, providing users with a more reliable and realistic virtual try-on experience.

Related papers

One Model For All: Partial Diffusion for Unified Try-On and Try-Off in Any Pose [99.056324701764]
We introduce textbfOMFA (emphOne Model For All), a unified diffusion framework for both virtual try-on and try-off.<n>The framework is entirely mask-free and requires only a single portrait and a target pose as input.<n>It achieves state-of-the-art results on both try-on and try-off tasks, providing a practical and generalizable solution for virtual garment synthesis.
arXiv Detail & Related papers (2025-08-06T15:46:01Z)
Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment. We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images. We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z)
StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On [35.227896906556026]
Given a clothing image and a person image, an image-based virtual try-on aims to generate a customized image that appears natural and accurately reflects the characteristics of the clothing image. In this work, we aim to expand the applicability of the pre-trained diffusion model so that it can be utilized independently for the virtual try-on task. Our proposed zero cross-attention blocks not only preserve the clothing details by learning the semantic correspondence but also generate high-fidelity images by utilizing the inherent knowledge of the pre-trained model in the warping process.
arXiv Detail & Related papers (2023-12-04T08:27:59Z)
Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects. In this paper, we propose to alleviate the impact of such an unbalanced distribution via Virtual Image Leaning (VIL) A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z)
PG-VTON: A Novel Image-Based Virtual Try-On Method via Progressive Inference Paradigm [6.929743379017671]
We propose a novel virtual try-on method via progressive inference paradigm (PGVTON) We exploit the try-on parsing as the shape guidance and implement the garment try-on via warping-mapping-composition. Experiments demonstrate that our method has state-of-the-art performance under two challenging scenarios.
arXiv Detail & Related papers (2023-04-18T12:47:26Z)
OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup [79.3118064406151]
Image Virtual try-on aims at replacing the cloth on a personal image with a garment image (in-shop clothes) Prior methods successfully preserve the character of clothing images. Occlusion remains a pernicious effect for realistic virtual try-on.
arXiv Detail & Related papers (2023-01-03T06:29:11Z)
Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image. We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation. Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-07-19T10:01:31Z)
Toward Accurate and Realistic Outfits Visualization with Attention to Details [10.655149697873716]
We propose Outfit Visualization Net to capture important visual details necessary for commercial applications. OVNet consists of 1) a semantic layout generator and 2) an image generation pipeline using multiple coordinated warps. An interactive interface powered by this method has been deployed on fashion e-commerce websites and received overwhelmingly positive feedback.
arXiv Detail & Related papers (2021-06-11T19:53:34Z)
Cloth Interactive Transformer for Virtual Try-On [106.21605249649957]
We propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task. In the first stage, we design a CIT matching block, aiming to precisely capture the long-range correlations between the cloth-agnostic person information and the in-shop cloth information. In the second stage, we put forth a CIT reasoning block for establishing global mutual interactive dependencies among person representation, the warped clothing item, and the corresponding warped cloth mask.
arXiv Detail & Related papers (2021-04-12T14:45:32Z)
Style and Pose Control for Image Synthesis of Humans from a Single Monocular View [78.6284090004218]
StylePoseGAN is a non-controllable generator to accept conditioning of pose and appearance separately. Our network can be trained in a fully supervised way with human images to disentangle pose, appearance and body parts. StylePoseGAN achieves state-of-the-art image generation fidelity on common perceptual metrics.
arXiv Detail & Related papers (2021-02-22T18:50:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.