Learning Implicit Features with Flow Infused Attention for Realistic Virtual Try-On
- URL: http://arxiv.org/abs/2412.11435v1
- Date: Mon, 16 Dec 2024 04:23:33 GMT
- Title: Learning Implicit Features with Flow Infused Attention for Realistic Virtual Try-On
- Authors: Delong Zhang, Qiwei Huang, Yuanliu Liu, Yang Sun, Wei-Shi Zheng, Pengfei Xiong, Wei Zhang,
- Abstract summary: We propose FIA-VTON, which leverages the implicit warp feature by adopting a Flow Infused Attention module on virtual try-on.
Experimental results on the VTON-HD and DressCode dataset significantly outperform state-of-the-art methods.
- Score: 32.87047008072487
- License:
- Abstract: Image-based virtual try-on is challenging since the generated image should fit the garment to model images in various poses and keep the characteristics and details of the garment simultaneously. A popular research stream warps the garment image firstly to reduce the burden of the generation stage, which relies highly on the performance of the warping module. Other methods without explicit warping often lack sufficient guidance to fit the garment to the model images. In this paper, we propose FIA-VTON, which leverages the implicit warp feature by adopting a Flow Infused Attention module on virtual try-on. The dense warp flow map is projected as indirect guidance attention to enhance the feature map warping in the generation process implicitly, which is less sensitive to the warping estimation accuracy than an explicit warp of the garment image. To further enhance implicit warp guidance, we incorporate high-level spatial attention to complement the dense warp. Experimental results on the VTON-HD and DressCode dataset significantly outperform state-of-the-art methods, demonstrating that FIA-VTON is effective and robust for virtual try-on.
Related papers
- ODPG: Outfitting Diffusion with Pose Guided Condition [2.5602836891933074]
VTON technology allows users to visualize how clothes would look on them without physically trying them on.
Traditional VTON methods, often using Geneversarative Adrial Networks (GANs) and Diffusion models, face challenges in achieving high realism and handling dynamic poses.
This paper introduces Outfitting Diffusion with Pose Guided Condition (ODPG), a novel approach that leverages a latent diffusion model with multiple conditioning inputs during the denoising process.
arXiv Detail & Related papers (2025-01-12T10:30:27Z) - Learning Flow Fields in Attention for Controllable Person Image Generation [59.10843756343987]
Controllable person image generation aims to generate a person image conditioned on reference images.
We propose learning flow fields in attention (Leffa), which explicitly guides the target query to attend to the correct reference key.
Leffa achieves state-of-the-art performance in controlling appearance (virtual try-on) and pose (pose transfer), significantly reducing fine-grained detail distortion.
arXiv Detail & Related papers (2024-12-11T15:51:14Z) - Improving Virtual Try-On with Garment-focused Diffusion Models [91.95830983115474]
Diffusion models have led to the revolutionizing of generative modeling in numerous image synthesis tasks.
We shape a new Diffusion model, namely GarDiff, which triggers the garment-focused diffusion process.
Experiments on VITON-HD and DressCode datasets demonstrate the superiority of our GarDiff when compared to state-of-the-art VTON approaches.
arXiv Detail & Related papers (2024-09-12T17:55:11Z) - GraVITON: Graph based garment warping with attention guided inversion for Virtual-tryon [5.790630195329777]
We introduce a novel graph based warping technique which emphasizes the value of context in garment flow.
Our method, validated on VITON-HD and Dresscode datasets, showcases substantial improvement in garment warping, texture preservation, and overall realism.
arXiv Detail & Related papers (2024-06-04T10:29:18Z) - Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z) - OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable
Virtual Try-on [7.46772222515689]
OOTDiffusion is a novel network architecture for realistic and controllable image-based virtual try-on.
We leverage the power of pretrained latent diffusion models, designing an outfitting UNet to learn the garment detail features.
Our experiments on the VITON-HD and Dress Code datasets demonstrate that OOTDiffusion efficiently generates high-quality try-on results.
arXiv Detail & Related papers (2024-03-04T07:17:44Z) - WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual
Try-on [81.15988741258683]
Image-based Virtual Try-On (VITON) aims to transfer an in-shop garment image onto a target person.
Current methods often overlook the synthesis quality around the garment-skin boundary and realistic effects like wrinkles and shadows on the warped garments.
We propose WarpDiffusion, which bridges the warping-based and diffusion-based paradigms via a novel informative and local garment feature attention mechanism.
arXiv Detail & Related papers (2023-12-06T18:34:32Z) - Taming the Power of Diffusion Models for High-Quality Virtual Try-On
with Appearance Flow [24.187109053871833]
Virtual try-on is a critical image synthesis task that aims to transfer clothes from one image to another while preserving the details of both humans and clothes.
We propose an exemplar-based inpainting approach that leverages a warping module to guide the diffusion model's generation effectively.
Our approach, namely Diffusion-based Conditional Inpainting for Virtual Try-ON (DCI-VTON), effectively utilizes the power of the diffusion model.
arXiv Detail & Related papers (2023-08-11T12:23:09Z) - Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-07-19T10:01:31Z) - Style-Based Global Appearance Flow for Virtual Try-On [119.95115739956661]
A novel global appearance flow estimation model is proposed in this work.
Experiment results on a popular virtual try-on benchmark show that our method achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-04-03T10:58:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.