MGT: Extending Virtual Try-Off to Multi-Garment Scenarios
- URL: http://arxiv.org/abs/2504.13078v2
- Date: Fri, 11 Jul 2025 08:51:16 GMT
- Title: MGT: Extending Virtual Try-Off to Multi-Garment Scenarios
- Authors: Riza Velioglu, Petra Bevandic, Robin Chan, Barbara Hammer
- Abstract summary: We introduce Multi-Garment TryOffDiff (MGT), a diffusion-based VTOFF model capable of handling diverse garment types. MGT incorporates class-specific embeddings, achieving state-of-the-art VTOFF results on VITON-HD and competitive performance on DressCode.
- Score: 8.158200403139196
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computer vision is transforming the fashion industry through Virtual Try-On (VTON) and Virtual Try-Off (VTOFF). VTON generates images of a person in a specified garment using a target photo and a standardized garment image, while a more challenging variant, Person-to-Person Virtual Try-On (p2p-VTON), uses a photo of another person wearing the garment. VTOFF, in contrast, extracts standardized garment images from photos of clothed individuals. We introduce Multi-Garment TryOffDiff (MGT), a diffusion-based VTOFF model capable of handling diverse garment types, including upper-body, lower-body, and dresses. MGT builds on a latent diffusion architecture with SigLIP-based image conditioning to capture garment characteristics such as shape, texture, and pattern. To address garment diversity, MGT incorporates class-specific embeddings, achieving state-of-the-art VTOFF results on VITON-HD and competitive performance on DressCode. When paired with VTON models, it further enhances p2p-VTON by reducing unwanted attribute transfer, such as skin tone, ensuring preservation of person-specific characteristics. Demo, code, and models are available at: https://rizavelioglu.github.io/tryoffdiff/
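The abstract describes conditioning a latent diffusion denoiser on SigLIP image features plus a class-specific embedding per garment type. A minimal sketch of how such conditioning could be assembled is below; the token count, embedding width, and the prepend-one-class-token design are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 256 SigLIP tokens of width 768.
NUM_TOKENS, EMBED_DIM = 256, 768
GARMENT_CLASSES = ["upper-body", "lower-body", "dress"]

# One learnable embedding row per garment class (randomly initialized here).
class_table = rng.normal(size=(len(GARMENT_CLASSES), EMBED_DIM)).astype(np.float32)

def build_conditioning(siglip_tokens: np.ndarray, garment_class: str) -> np.ndarray:
    """Prepend the garment-class embedding to the SigLIP image tokens,
    forming the context sequence a cross-attention denoiser would attend to."""
    cls_vec = class_table[GARMENT_CLASSES.index(garment_class)]
    return np.concatenate([cls_vec[None, :], siglip_tokens], axis=0)

tokens = rng.normal(size=(NUM_TOKENS, EMBED_DIM)).astype(np.float32)
ctx = build_conditioning(tokens, "dress")
print(ctx.shape)  # (257, 768): one class token + 256 image tokens
```

In a real model the class token would let the same denoiser specialize its behavior per garment category while sharing all other weights; how MGT actually injects the class embedding is not specified in the abstract.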
Related papers
- OmniVTON: Training-Free Universal Virtual Try-On [53.31945401098557]
Image-based Virtual Try-On (VTON) techniques rely on either supervised in-shop approaches, or unsupervised in-the-wild methods, which improve adaptability but remain constrained by data biases and limited universality. We propose OmniVTON, the first training-free universal VTON framework that decouples garment and pose conditioning to achieve both texture fidelity and pose consistency across diverse settings.
arXiv Detail & Related papers (2025-07-20T16:37:53Z) - Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals [76.96387718150542]
We present Text-Enhanced MUlti-category Virtual Try-Off (TEMU-VTOFF). Our architecture is designed to receive garment information from multiple modalities like images, text, and masks to work in a multi-category setting. Experiments on VITON-HD and Dress Code datasets show that TEMU-VTOFF sets a new state-of-the-art on the VTOFF task.
arXiv Detail & Related papers (2025-05-27T11:47:51Z) - Limb-Aware Virtual Try-On Network with Progressive Clothing Warping [64.84181064722084]
Image-based virtual try-on aims to transfer an in-shop clothing image to a person image.
Most existing methods adopt a single global deformation to perform clothing warping directly.
We propose Limb-aware Virtual Try-on Network named PL-VTON, which performs fine-grained clothing warping progressively.
arXiv Detail & Related papers (2025-03-18T09:52:41Z) - MFP-VTON: Enhancing Mask-Free Person-to-Person Virtual Try-On via Diffusion Transformer [5.844515709826269]
Garment-to-person virtual try-on (VTON) aims to generate fitting images of a person wearing a reference garment.
To improve ease of use, we propose a Mask-Free framework for Person-to-Person VTON.
Our model excels in both person-to-person and garment-to-person VTON tasks, generating high-fidelity fitting images.
arXiv Detail & Related papers (2025-02-03T18:56:24Z) - Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks [31.461116368933165]
Image-based virtual try-on (VTON) aims to generate a virtual try-on result by transferring an input garment onto a target person's image.
The scarcity of paired garment-model data makes it challenging for existing methods to achieve high generalization and quality in VTON.
We propose Any2AnyTryon, which can generate try-on results based on different textual instructions and model garment images.
arXiv Detail & Related papers (2025-01-27T09:33:23Z) - TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models [8.158200403139196]
This paper introduces Virtual Try-Off (VTOFF), a novel task focused on generating standardized garment images from single photos of clothed individuals. We present TryOffDiff, a model that adapts Stable Diffusion with SigLIP-based visual conditioning to ensure high fidelity and detail retention. Our results highlight the potential of VTOFF to enhance product imagery in e-commerce applications, advance generative model evaluation, and inspire future work on high-fidelity reconstruction.
arXiv Detail & Related papers (2024-11-27T13:53:09Z) - Improving Virtual Try-On with Garment-focused Diffusion Models [91.95830983115474]
Diffusion models have led to the revolutionizing of generative modeling in numerous image synthesis tasks.
We shape a new Diffusion model, namely GarDiff, which triggers the garment-focused diffusion process.
Experiments on VITON-HD and DressCode datasets demonstrate the superiority of our GarDiff when compared to state-of-the-art VTON approaches.
arXiv Detail & Related papers (2024-09-12T17:55:11Z) - OutfitAnyone: Ultra-high Quality Virtual Try-On for Any Clothing and Any Person [38.69239957207417]
OutfitAnyone generates high-fidelity and detail-consistent images for virtual clothing trials.
It distinguishes itself with scalability, modulating factors such as pose and body shape, and broad applicability.
OutfitAnyone's performance in diverse scenarios underscores its utility and readiness for real-world deployment.
arXiv Detail & Related papers (2024-07-23T07:04:42Z) - IMAGDressing-v1: Customizable Virtual Dressing [58.44155202253754]
IMAGDressing-v1 is a virtual dressing task that generates freely editable human images with fixed garments and optional conditions.
IMAGDressing-v1 incorporates a garment UNet that captures semantic features from CLIP and texture features from VAE.
We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet.
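The hybrid attention module described above combines a frozen self-attention branch with a trainable cross-attention branch that injects garment features into the denoising UNet. A minimal sketch of that combination follows; the dimensions, the additive fusion, and the zero-initialized output projection are illustrative assumptions, not details confirmed by the abstract.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 64  # hypothetical channel width

def attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Plain scaled dot-product attention."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def hybrid_attention(latent: np.ndarray, garment: np.ndarray,
                     w_cross: np.ndarray) -> np.ndarray:
    """Frozen self-attention over denoiser latents, plus a trainable
    cross-attention branch attending to garment-UNet features."""
    self_out = attention(latent, latent, latent)               # frozen branch
    cross_out = attention(latent, garment, garment) @ w_cross  # trainable branch
    return self_out + cross_out

latent = rng.normal(size=(16, D)).astype(np.float32)   # denoising-UNet tokens
garment = rng.normal(size=(32, D)).astype(np.float32)  # garment-UNet features
# Zero-init of the trainable projection preserves the pretrained self-attention
# behavior at the start of training (a common trick; an assumption here).
w_cross = np.zeros((D, D), dtype=np.float32)

out = hybrid_attention(latent, garment, w_cross)
print(out.shape)  # (16, 64)
```

With `w_cross` at zero the module reduces exactly to the frozen self-attention path, so training can only add garment information on top of the pretrained behavior.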
arXiv Detail & Related papers (2024-07-17T16:26:30Z) - M&M VTO: Multi-Garment Virtual Try-On and Editing [31.45715245587691]
M&M VTO is a mix-and-match virtual try-on method that takes as input multiple garment images, a text description of the garment layout, and an image of a person.
An example input includes: an image of a shirt, an image of a pair of pants, "rolled sleeves, shirt tucked in", and an image of a person.
The output is a visualization of how those garments, in the desired layout, would look on the given person.
arXiv Detail & Related papers (2024-06-06T22:46:37Z) - MV-VTON: Multi-View Virtual Try-On with Diffusion Models [91.71150387151042]
The goal of image-based virtual try-on is to generate an image of the target person naturally wearing the given clothing. Existing methods solely focus on the frontal try-on using the frontal clothing. We introduce Multi-View Virtual Try-On (MV-VTON), which aims to reconstruct the dressing results from multiple views using the given clothes.
arXiv Detail & Related papers (2024-04-26T12:27:57Z) - Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z) - Learning Garment DensePose for Robust Warping in Virtual Try-On [72.13052519560462]
We propose a robust warping method for virtual try-on based on a learned garment DensePose.
Our method achieves performance equivalent to the state of the art on virtual try-on benchmarks.
arXiv Detail & Related papers (2023-03-30T20:02:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.