AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario
- URL: http://arxiv.org/abs/2405.18172v1
- Date: Tue, 28 May 2024 13:33:08 GMT
- Title: AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario
- Authors: Yuhan Li, Hao Zhou, Wenxiang Shang, Ran Lin, Xuanhong Chen, Bingbing Ni
- Abstract summary: AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large margin.
AnyFit's impressive performance on high-fidelity virtual try-on in any scenario, from any image, paves a new path for future research within the fashion community.
- Score: 50.62711489896909
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While image-based virtual try-on has made significant strides, emerging approaches still fall short of delivering high-fidelity, robust fitting images across various scenarios: their models suffer from ill-fitting garment styles and quality degradation during training, and lack support for various combinations of attire. We therefore first propose a lightweight, scalable operator, the Hydra Block, for attire combinations. It uses a parallel attention mechanism that injects the features of multiple garments from conditionally encoded branches into the main network. Second, to significantly enhance the model's robustness and expressiveness in real-world scenarios, we evolve its potential across diverse settings by synthesizing the residuals of multiple models and by applying a mask region boost strategy that overcomes the instability caused by information leakage in existing models. Equipped with these designs, AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large margin, excelling at producing well-fitting garments replete with photorealistic, rich details. Furthermore, AnyFit's impressive performance on high-fidelity virtual try-on in any scenario, from any image, paves a new path for future research within the fashion community.
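The paper page carries no code, so the following is only a minimal PyTorch sketch of the parallel-attention idea the abstract ascribes to the Hydra Block: each garment comes from its own conditionally encoded branch, cross-attention against each branch runs in parallel, and the results are injected residually into the main network. Every name and shape below (`HydraAttention`, token layouts, head count) is an illustrative assumption, not the authors' implementation.

```python
# Illustrative sketch only -- NOT the AnyFit code. Assumes a diffusion-style
# UNet whose garment conditions arrive as per-garment token sequences from
# conditional encoder branches.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HydraAttention(nn.Module):
    """Parallel cross-attention over multiple garment branches (hypothetical)."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)  # shared K/V projection for all branches
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, garments: list[torch.Tensor]) -> torch.Tensor:
        # x:        (B, L, dim)  main-branch tokens (e.g. UNet latents)
        # garments: one (B, Lg, dim) token sequence per garment branch
        B, L, dim = x.shape
        h, d = self.num_heads, dim // self.num_heads
        q = self.q(x).view(B, L, h, d).transpose(1, 2)            # (B, h, L, d)
        injected = torch.zeros_like(x)
        for g in garments:                                        # one parallel "head of the hydra" per garment
            k, v = self.kv(g).chunk(2, dim=-1)
            k = k.view(B, -1, h, d).transpose(1, 2)
            v = v.view(B, -1, h, d).transpose(1, 2)
            attn = F.scaled_dot_product_attention(q, k, v)        # (B, h, L, d)
            injected = injected + attn.transpose(1, 2).reshape(B, L, dim)
        return x + self.out(injected)                             # residual feature injection
```

Because each garment contributes an independent attention pass, adding another attire item means adding another branch rather than changing the main network, which matches the abstract's "lightweight, scalable" framing. The abstract's "synthesizing the residuals of multiple models" is likewise unspecified; one common reading is task-arithmetic-style checkpoint merging, sketched below under that assumption.

```python
# Equally hypothetical sketch: merge several fine-tuned checkpoints by adding
# their weight residuals relative to a shared base model. The abstract does
# not specify the exact merging rule; this is one plausible reading.
from collections import OrderedDict

def merge_residuals(base: OrderedDict, finetuned: list[OrderedDict],
                    alphas: list[float]) -> OrderedDict:
    merged = OrderedDict()
    for name, w in base.items():
        residual = sum(a * (ft[name] - w) for a, ft in zip(alphas, finetuned))
        merged[name] = w + residual
    return merged
```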
Related papers
- High-Fidelity Virtual Try-on with Large-Scale Unpaired Learning [36.7085107012134]
Virtual try-on (VTON) transfers a target clothing image to a reference person, where clothing fidelity is a key requirement for downstream e-commerce applications.
We propose a novel framework, Boosted Virtual Try-on (BVTON), that leverages large-scale unpaired learning for high-fidelity try-on.
arXiv Detail & Related papers (2024-11-03T15:00:26Z)
- IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 targets the virtual dressing task: generating freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from a VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
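The summary above describes a hybrid attention module; the following is a minimal PyTorch sketch of that idea, not the IMAGDressing-v1 code. The class name, the wrapping of an `nn.MultiheadAttention` layer, and all shapes are assumptions for illustration.

```python
# Illustrative sketch only -- not the IMAGDressing-v1 implementation.
# A frozen self-attention (reused from the pretrained denoising UNet,
# assumed batch_first=True) is paired with a trainable cross-attention
# that injects garment-UNet features.
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    def __init__(self, frozen_self_attn: nn.MultiheadAttention, dim: int):
        super().__init__()
        self.self_attn = frozen_self_attn
        for p in self.self_attn.parameters():  # keep pretrained weights fixed
            p.requires_grad_(False)
        # trainable cross-attention over garment features
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, x: torch.Tensor, garment_feats: torch.Tensor) -> torch.Tensor:
        # x: (B, L, dim) denoising-UNet tokens; garment_feats: (B, Lg, dim)
        h, _ = self.self_attn(x, x, x)                       # frozen self-attention
        g, _ = self.cross_attn(h, garment_feats, garment_feats)
        return h + g                                         # add garment information residually
```

In this reading, only the cross-attention path trains, so the pretrained denoising UNet stays frozen as the summary states.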
arXiv Detail & Related papers (2024-07-17T16:26:30Z)
- GenS: Generalizable Neural Surface Reconstruction from Multi-View Images [20.184657468900852]
GenS is an end-to-end generalizable neural surface reconstruction model.
Our representation is more powerful, recovering high-frequency details while maintaining global smoothness.
Experiments on popular benchmarks show that our model can generalize well to new scenes.
arXiv Detail & Related papers (2024-06-04T17:13:10Z)
- Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models [4.038493506169702]
This study emphasizes the challenges of preserving intricate texture details and distinctive features of the target person and the clothes in various scenarios.
Various existing approaches are explored, highlighting their limitations and unresolved aspects.
It then proposes a novel diffusion-based solution that addresses garment texture preservation and user identity retention during virtual try-on.
arXiv Detail & Related papers (2024-03-12T07:15:29Z)
- PFDM: Parser-Free Virtual Try-on via Diffusion Model [28.202996582963184]
We propose a parser-free virtual try-on method based on the diffusion model (PFDM).
Given two images, PFDM can "wear" garments on the target person seamlessly by implicitly warping without any other information.
Experiments demonstrate that our proposed PFDM can successfully handle complex images, and outperform both state-of-the-art parser-free and high-fidelity parser-based models.
arXiv Detail & Related papers (2024-02-05T14:32:57Z)
- Towards Garment Sewing Pattern Reconstruction from a Single Image [76.97825595711444]
A garment sewing pattern represents the intrinsic rest shape of a garment and is the core of many applications like fashion design, virtual try-on, and digital avatars.
We first synthesize a versatile dataset, named SewFactory, which consists of around 1M images and ground-truth sewing patterns.
We then propose a two-level Transformer network called Sewformer, which significantly improves the sewing pattern prediction performance.
arXiv Detail & Related papers (2023-11-07T18:59:51Z)
- SwinGar: Spectrum-Inspired Neural Dynamic Deformation for Free-Swinging Garments [6.821050909555717]
We present a spectrum-inspired learning-based approach for generating clothing deformations with dynamic effects and personalized details.
Our proposed method overcomes the limitations of existing approaches by providing a unified framework that predicts dynamic behavior for different garments.
We develop a dynamic clothing deformation estimator that integrates frequency-controllable attention mechanisms with long short-term memory.
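The "frequency-controllable attention with long short-term memory" above is described only at this level of detail, so the following PyTorch sketch is a loose illustration of one way to couple an LSTM (temporal dynamics) with attention weights indexed by frequency band, letting low- and high-frequency deformation components be emphasized separately. All names and shapes are assumptions, not SwinGar's architecture.

```python
# Loose illustration only -- not the SwinGar implementation.
import torch
import torch.nn as nn

class FrequencyAttnLSTM(nn.Module):
    def __init__(self, feat_dim: int, hidden: int, num_bands: int):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.band_scores = nn.Linear(hidden, num_bands)  # one logit per frequency band
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, feat_dim) for _ in range(num_bands)]
        )

    def forward(self, motion_seq: torch.Tensor) -> torch.Tensor:
        # motion_seq: (B, T, feat_dim) body-motion features over time
        h, _ = self.lstm(motion_seq)                       # (B, T, hidden)
        w = torch.softmax(self.band_scores(h), dim=-1)     # (B, T, num_bands): attention over bands
        bands = torch.stack([head(h) for head in self.heads], dim=-2)  # (B, T, num_bands, feat_dim)
        return (w.unsqueeze(-1) * bands).sum(dim=-2)       # frequency-weighted deformation
```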
arXiv Detail & Related papers (2023-08-05T09:09:50Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- Drivable Volumetric Avatars using Texel-Aligned Features [52.89305658071045]
Photorealistic telepresence requires both high-fidelity body modeling and faithful driving to enable dynamically synthesized appearance.
We propose an end-to-end framework that addresses two core challenges in modeling and driving full-body avatars of real people.
arXiv Detail & Related papers (2022-07-20T09:28:16Z)
- Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
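The Deformable Attention Flow described above can be read as "predict several flow fields, warp the garment features with each, and fuse the warped results with attention weights". The PyTorch sketch below illustrates that reading; it is a conceptual approximation, not the official DAFlow code, and every name and shape is an assumption.

```python
# Conceptual sketch only -- not the official DAFlow implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableAttentionFlow(nn.Module):
    def __init__(self, dim: int, num_flows: int = 4):
        super().__init__()
        self.num_flows = num_flows
        self.flow_head = nn.Conv2d(dim * 2, num_flows * 2, kernel_size=3, padding=1)
        self.attn_head = nn.Conv2d(dim * 2, num_flows, kernel_size=3, padding=1)

    def forward(self, person: torch.Tensor, garment: torch.Tensor) -> torch.Tensor:
        # person, garment: (B, dim, H, W) feature maps
        B, _, H, W = person.shape
        ctx = torch.cat([person, garment], dim=1)
        flows = self.flow_head(ctx).view(B, self.num_flows, 2, H, W)  # K flow fields
        attn = torch.softmax(self.attn_head(ctx), dim=1)              # (B, K, H, W)
        # base sampling grid in [-1, 1]
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, H, device=person.device),
            torch.linspace(-1, 1, W, device=person.device),
            indexing="ij",
        )
        grid = torch.stack([xs, ys], dim=-1).expand(B, H, W, 2)
        warped = 0.0
        for k in range(self.num_flows):                    # sample garment at K offset grids
            offset = flows[:, k].permute(0, 2, 3, 1)       # (B, H, W, 2)
            sample = F.grid_sample(garment, grid + offset, align_corners=True)
            warped = warped + attn[:, k : k + 1] * sample  # attention-weighted fusion
        return warped
```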
arXiv Detail & Related papers (2022-07-19T10:01:31Z)