Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks
- URL: http://arxiv.org/abs/2501.15891v1
- Date: Mon, 27 Jan 2025 09:33:23 GMT
- Title: Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks
- Authors: Hailong Guo, Bohan Zeng, Yiren Song, Wentao Zhang, Chuang Zhang, Jiaming Liu,
- Abstract summary: Image-based virtual try-on (VTON) aims to generate a virtual try-on result by transferring an input garment onto a target person's image.
The scarcity of paired garment-model data makes it challenging for existing methods to achieve high generalization and quality in VTON.
We propose Any2AnyTryon, which can generate try-on results based on different textual instructions and model garment images.
- Score: 31.461116368933165
- License:
- Abstract: Image-based virtual try-on (VTON) aims to generate a virtual try-on result by transferring an input garment onto a target person's image. However, the scarcity of paired garment-model data makes it challenging for existing methods to achieve high generalization and quality in VTON. Also, it limits the ability to generate mask-free try-ons. To tackle the data scarcity problem, approaches such as Stable Garment and MMTryon use a synthetic data strategy, effectively increasing the amount of paired data on the model side. However, existing methods are typically limited to performing specific try-on tasks and lack user-friendliness. To enhance the generalization and controllability of VTON generation, we propose Any2AnyTryon, which can generate try-on results based on different textual instructions and model garment images to meet various needs, eliminating the reliance on masks, poses, or other conditions. Specifically, we first construct the virtual try-on dataset LAION-Garment, the largest known open-source garment try-on dataset. Then, we introduce adaptive position embedding, which enables the model to generate satisfactory outfitted model images or garment images based on input images of different sizes and categories, significantly enhancing the generalization and controllability of VTON generation. In our experiments, we demonstrate the effectiveness of our Any2AnyTryon and compare it with existing methods. The results show that Any2AnyTryon enables flexible, controllable, and high-quality image-based virtual try-on generation.https://logn-2024.github.io/Any2anyTryonProjectPage/
Related papers
- Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling [20.072689146353348]
We introduce a garment extraction model that generates (human, synthetic garment) pairs from a single image of a clothed individual.
We also propose an Error-Aware Refinement-based Schr"odinger Bridge (EARSB) that surgically targets localized generation errors.
In user studies, our model is preferred by the users in an average of 59% of cases.
arXiv Detail & Related papers (2025-01-08T18:25:50Z) - Try-On-Adapter: A Simple and Flexible Try-On Paradigm [42.2724473500475]
Image-based virtual try-on, widely used in online shopping, aims to generate images of a naturally dressed person conditioned on certain garments.
Previous methods focus on masking certain parts of the original model's standing image, and then inpainting on masked areas to generate realistic images of the model wearing corresponding reference garments.
We propose Try-On-Adapter (TOA), an outpainting paradigm that differs from the existing inpainting paradigm.
arXiv Detail & Related papers (2024-11-15T13:35:58Z) - IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 is a virtual dressing task that generates freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
arXiv Detail & Related papers (2024-07-17T16:26:30Z) - WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models [132.77237314239025]
Video virtual try-on aims to generate realistic sequences that maintain garment identity and adapt to a person's pose and body shape in source videos.
Traditional image-based methods, relying on warping and blending, struggle with complex human movements and occlusions.
We reconceptualize video try-on as a process of generating videos conditioned on garment descriptions and human motion.
Our solution, WildVidFit, employs image-based controlled diffusion models for a streamlined, one-stage approach.
arXiv Detail & Related papers (2024-07-15T11:21:03Z) - AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario [50.62711489896909]
AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large gap.
AnyFit's impressive performance on high-fidelity virtual try-ons in any scenario from any image, paves a new path for future research within the fashion community.
arXiv Detail & Related papers (2024-05-28T13:33:08Z) - MV-VTON: Multi-View Virtual Try-On with Diffusion Models [91.71150387151042]
The goal of image-based virtual try-on is to generate an image of the target person naturally wearing the given clothing.
Existing methods solely focus on the frontal try-on using the frontal clothing.
We introduce Multi-View Virtual Try-ON (MV-VTON), which aims to reconstruct the dressing results from multiple views using the given clothes.
arXiv Detail & Related papers (2024-04-26T12:27:57Z) - Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On [29.217423805933727]
Diffusion model-based approaches have recently become popular, as they are excellent at image synthesis tasks.
We propose an Texture-Preserving Diffusion (TPD) model for virtual try-on, which enhances the fidelity of the results.
Second, we propose a novel diffusion-based method that predicts a precise inpainting mask based on the person and reference garment images.
arXiv Detail & Related papers (2024-04-01T12:43:22Z) - Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z) - Single Stage Multi-Pose Virtual Try-On [119.95115739956661]
Multi-pose virtual try-on (MPVTON) aims to fit a target garment onto a person at a target pose.
MPVTON provides a better try-on experience, but is also more challenging due to the dual garment and pose editing objectives.
Existing methods adopt a pipeline comprising three disjoint modules including a target semantic layout prediction module, a coarse try-on image generator and a refinement try-on image generator.
In this paper, we propose a novel single stage model forTON. Key to our model is a parallel flow estimation module that predicts the flow fields for both person and garment images conditioned on
arXiv Detail & Related papers (2022-11-19T15:02:11Z) - Shape Controllable Virtual Try-on for Underwear Models [0.0]
We propose a Shape Controllable Virtual Try-On Network (SC-VTON) to dress clothing for underwear models.
SC-VTON integrates information of model and clothing to generate warped clothing image.
Our method can generate high-resolution results with detailed textures.
arXiv Detail & Related papers (2021-07-28T04:01:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.