Toward Accurate and Realistic Virtual Try-on Through Shape Matching and
Multiple Warps
- URL: http://arxiv.org/abs/2003.10817v2
- Date: Fri, 27 Mar 2020 01:15:54 GMT
- Title: Toward Accurate and Realistic Virtual Try-on Through Shape Matching and
Multiple Warps
- Authors: Kedan Li, Min Jin Chong, Jingen Liu, David Forsyth
- Abstract summary: A virtual try-on method takes a product image and an image of a model and produces an image of the model wearing the product.
Most methods essentially compute warps from the product image to the model image and combine them using image generation methods.
This paper uses quantitative evaluation on a challenging, novel dataset to demonstrate that (a) for any warping method, one can choose target models automatically to improve results, and (b) learning multiple coordinated specialized warpers yields further improvements.
- Score: 25.157142707318304
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A virtual try-on method takes a product image and an image of a model and
produces an image of the model wearing the product. Most methods essentially
compute warps from the product image to the model image and combine them using
image generation methods. However, obtaining a realistic image is challenging because
the kinematics of garments is complex and because outline, texture, and shading
cues in the image reveal errors to human viewers. The garment must have
appropriate drapes; texture must be warped to be consistent with the shape of a
draped garment; small details (buttons, collars, lapels, pockets, etc.) must be
placed appropriately on the garment, and so on. Evaluation is particularly
difficult and is usually qualitative.
This paper uses quantitative evaluation on a challenging, novel dataset to
demonstrate that (a) for any warping method, one can choose target models
automatically to improve results, and (b) learning multiple coordinated
specialized warpers yields further improvements. Target models are
chosen by a learned embedding procedure that predicts a representation of the
products the model is wearing. This prediction is used to match products to
models. Specialized warpers are trained by a method that encourages a second
warper to perform well in locations where the first works poorly. The warps are
then combined using a U-Net. Qualitative evaluation confirms that these
improvements are wholesale over outline, texture, shading, and garment details.
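The abstract describes two concrete mechanisms: matching products to models via a learned embedding, and coordinating multiple warpers whose outputs a U-Net merges. The PyTorch sketch below is a minimal illustration of these ideas under assumptions of our own; the `FlowWarper` and `Combiner` modules, the L1 losses, the error-weighted `cascade_loss`, and the cosine-similarity matching in `match_products_to_models` are illustrative stand-ins, not the paper's actual architecture or objectives.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FlowWarper(nn.Module):
    """Predicts a dense flow field that warps the product image toward the model pose."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),            # 2-channel (dx, dy) flow
        )

    def forward(self, product, model_img):
        flow = self.net(torch.cat([product, model_img], dim=1))
        b, _, h, w = flow.shape
        # Base sampling grid in [-1, 1], offset by the predicted flow.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack([xs, ys], dim=-1).to(flow.device).expand(b, h, w, 2)
        return F.grid_sample(product, base + flow.permute(0, 2, 3, 1), align_corners=True)


class Combiner(nn.Module):
    """Tiny encoder-decoder standing in for the U-Net that merges the warps with the model image."""
    def __init__(self, n_warps=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * (n_warps + 1), 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, model_img, warps):
        return self.net(torch.cat([model_img] + warps, dim=1))


def cascade_loss(warp1, warp2, target):
    """Encourage warper 2 to do well where warper 1 does poorly (simple L1 variant)."""
    err1 = (warp1 - target).abs().mean(dim=1, keepdim=True)   # per-pixel error of warper 1
    weight = (err1 / (err1.mean() + 1e-6)).detach()           # emphasise warper-1 failure regions
    loss2 = (weight * (warp2 - target).abs().mean(dim=1, keepdim=True)).mean()
    return err1.mean() + loss2


def match_products_to_models(product_emb, model_emb, k=1):
    """Match each product to the k models whose predicted product embedding is closest.
    Cosine similarity is an assumed metric; the paper learns the embedding itself."""
    sim = F.normalize(product_emb, dim=1) @ F.normalize(model_emb, dim=1).T
    return sim.topk(k, dim=1).indices


# Toy usage with random tensors standing in for product / model / ground-truth images.
product, model_img, target = (torch.rand(1, 3, 64, 48) for _ in range(3))
w1, w2, comb = FlowWarper(), FlowWarper(), Combiner()
warp1, warp2 = w1(product, model_img), w2(product, model_img)
final = comb(model_img, [warp1, warp2])
loss = cascade_loss(warp1, warp2, target) + (final - target).abs().mean()
loss.backward()
```

The `detach()` on the weight map is one simple way to keep gradients from pushing the first warper to fail so that the second looks better; the paper's coordination scheme may differ.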
Related papers
- PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework.
We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences.
Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z)
- Information Theoretic Text-to-Image Alignment [49.396917351264655]
We present a novel method that relies on an information-theoretic alignment measure to steer image generation.
Our method is on par with or superior to the state of the art, yet requires nothing but a pre-trained denoising network to estimate mutual information (MI).
arXiv Detail & Related papers (2024-05-31T12:20:02Z)
- Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z)
- A Two-stage Personalized Virtual Try-on Framework with Shape Control and Texture Guidance [7.302929117437442]
This paper proposes a new personalized virtual try-on model (PE-VITON) that uses two stages (shape control and texture guidance) to decouple clothing attributes.
The proposed model effectively addresses problems of traditional try-on methods: poorly reproduced clothing folds, degraded generation under complex human poses, blurred clothing edges, and unclear texture styles.
arXiv Detail & Related papers (2023-12-24T13:32:55Z)
- StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On [35.227896906556026]
Given a clothing image and a person image, an image-based virtual try-on aims to generate a customized image that appears natural and accurately reflects the characteristics of the clothing image.
In this work, we aim to expand the applicability of the pre-trained diffusion model so that it can be utilized independently for the virtual try-on task.
Our proposed zero cross-attention blocks not only preserve the clothing details by learning the semantic correspondence but also generate high-fidelity images by utilizing the inherent knowledge of the pre-trained model in the warping process.
arXiv Detail & Related papers (2023-12-04T08:27:59Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- Weakly Supervised High-Fidelity Clothing Model Generation [67.32235668920192]
We propose a cheap yet scalable weakly supervised method, Deep Generative Projection (DGP), for high-fidelity clothing model generation.
We show that projecting the rough alignment of clothing and body onto the StyleGAN space can yield photo-realistic wearing results.
arXiv Detail & Related papers (2021-12-14T07:15:15Z)
- Arbitrary Virtual Try-On Network: Characteristics Preservation and Trade-off between Body and Clothing [85.74977256940855]
We propose an Arbitrary Virtual Try-On Network (AVTON) for all types of clothes.
AVTON can synthesize realistic try-on images by preserving and trading off characteristics of the target clothes and the reference person.
Our approach can achieve better performance compared with the state-of-the-art virtual try-on methods.
arXiv Detail & Related papers (2021-11-24T08:59:56Z)
- Face sketch to photo translation using generative adversarial networks [1.0312968200748118]
We use a pre-trained face photo generating model to synthesize high-quality natural face photos.
We train a network to map the facial features extracted from the input sketch to a vector in the latent space of the face generating model.
The proposed model achieves an SSIM of 0.655 and a rank-1 face recognition rate of 97.59%.
arXiv Detail & Related papers (2021-10-23T20:01:20Z)
- Toward Accurate and Realistic Outfits Visualization with Attention to Details [10.655149697873716]
We propose Outfit Visualization Net to capture important visual details necessary for commercial applications.
OVNet consists of 1) a semantic layout generator and 2) an image generation pipeline using multiple coordinated warps.
An interactive interface powered by this method has been deployed on fashion e-commerce websites and received overwhelmingly positive feedback.
arXiv Detail & Related papers (2021-06-11T19:53:34Z)
- SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On [14.198545992098309]
SieveNet is a framework for robust image-based virtual try-on.
We introduce a multi-stage coarse-to-fine warping network to better model fine-grained intricacies.
We also introduce a try-on-cloth-conditioned segmentation mask prior to improve the texture transfer network.
arXiv Detail & Related papers (2020-01-17T12:33:54Z)