High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions
- URL: http://arxiv.org/abs/2206.14180v1
- Date: Tue, 28 Jun 2022 17:47:53 GMT
- Title: High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions
- Authors: Sangyun Lee, Gyojung Gu, Sunghyun Park, Seunghwan Choi, Jaegul Choo
- Abstract summary: Image-based virtual try-on aims to synthesize an image of a person wearing a given clothing item.
We propose a novel try-on condition generator as a unified module of the two stages (i.e., the warping and segmentation generation stages).
A newly proposed feature fusion block in the condition generator implements the information exchange, and the condition generator does not create any misalignment or pixel-squeezing artifacts.
- Score: 29.236895355922496
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Image-based virtual try-on aims to synthesize an image of a person wearing a
given clothing item. To solve the task, the existing methods warp the clothing
item to fit the person's body and generate the segmentation map of the person
wearing the item, before fusing the item with the person. However, when the
warping and the segmentation generation stages operate individually without
information exchange, the misalignment between the warped clothes and the
segmentation map occurs, which leads to the artifacts in the final image. The
information disconnection also causes excessive warping near the clothing
regions occluded by the body parts, so called pixel-squeezing artifacts. To
settle the issues, we propose a novel try-on condition generator as a unified
module of the two stages (i.e., warping and segmentation generation stages). A
newly proposed feature fusion block in the condition generator implements the
information exchange, and the condition generator does not create any
misalignment or pixel-squeezing artifacts. We also introduce discriminator
rejection that filters out the incorrect segmentation map predictions and
assures the performance of virtual try-on frameworks. Experiments on a
high-resolution dataset demonstrate that our model successfully handles the
misalignment and the occlusion, and significantly outperforms the baselines.
Code is available at https://github.com/sangyun884/HR-VITON.
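The discriminator rejection described in the abstract can be illustrated with a minimal sketch. This is a hedged illustration, not the authors' implementation: the function name, score range, and default threshold are assumptions for demonstration. The idea is simply to drop segmentation-map predictions whose discriminator realism score falls below a cutoff before they reach the try-on synthesis stage.

```python
# Hypothetical sketch of discriminator rejection: keep only segmentation
# predictions whose discriminator score clears a threshold. Names and the
# 0-1 score convention are illustrative assumptions.
from typing import List, Tuple


def discriminator_rejection(
    predictions: List[str],
    scores: List[float],
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Split predictions into accepted and rejected sets by score.

    `predictions` are identifiers (or tensors, in a real pipeline) of
    candidate segmentation maps; `scores` are the discriminator's
    realism scores for each, assumed to lie in [0, 1].
    """
    accepted: List[str] = []
    rejected: List[str] = []
    for pred, score in zip(predictions, scores):
        # A prediction the discriminator rates as realistic enough is
        # passed on; the rest are filtered out of the pipeline.
        (accepted if score >= threshold else rejected).append(pred)
    return accepted, rejected
```

In a full framework, a rejected sample would typically be re-predicted or excluded from evaluation rather than passed to the image generator.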
Related papers
- GraVITON: Graph based garment warping with attention guided inversion for Virtual-tryon [5.790630195329777]
We introduce a novel graph-based warping technique that emphasizes the value of context in garment flow.
Our method, validated on VITON-HD and Dresscode datasets, showcases substantial improvement in garment warping, texture preservation, and overall realism.
arXiv Detail & Related papers (2024-06-04T10:29:18Z)
- OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup [79.3118064406151]
Image virtual try-on aims at replacing the clothes in a person image with a garment image (in-shop clothes).
Prior methods successfully preserve the character of clothing images.
Occlusion remains a pernicious effect for realistic virtual try-on.
arXiv Detail & Related papers (2023-01-03T06:29:11Z)
- Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models [6.408114351192012]
We present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions.
We show results on the task of segmenting four different objects (humans, dogs, cars, birds) and a use case scenario in medical image analysis.
arXiv Detail & Related papers (2022-12-29T13:51:54Z)
- DisPositioNet: Disentangled Pose and Identity in Semantic Image Manipulation [83.51882381294357]
DisPositioNet is a model that learns a disentangled representation for each object for the task of image manipulation using scene graphs.
Our framework enables the disentanglement of the variational latent embeddings as well as the feature representation in the graph.
arXiv Detail & Related papers (2022-11-10T11:47:37Z)
- Self-Supervised Video Object Segmentation via Cutout Prediction and Tagging [117.73967303377381]
We propose a novel self-supervised Video Object Segmentation (VOS) approach that strives to achieve better object-background discriminability.
Our approach is based on a discriminative learning loss formulation that takes into account both object and background information.
Our proposed approach, CT-VOS, achieves state-of-the-art results on two challenging benchmarks: DAVIS-2017 and Youtube-VOS.
arXiv Detail & Related papers (2022-04-22T17:53:27Z)
- Arbitrary Virtual Try-On Network: Characteristics Preservation and Trade-off between Body and Clothing [85.74977256940855]
We propose an Arbitrary Virtual Try-On Network (AVTON) for all-type clothes.
AVTON can synthesize realistic try-on images by preserving and trading off characteristics of the target clothes and the reference person.
Our approach can achieve better performance compared with the state-of-the-art virtual try-on methods.
arXiv Detail & Related papers (2021-11-24T08:59:56Z)
- SPG-VTON: Semantic Prediction Guidance for Multi-pose Virtual Try-on [27.870740623131816]
Image-based virtual try-on is challenging in fitting target in-shop clothes onto a reference person under diverse human poses.
We propose an end-to-end Semantic Prediction Guidance multi-pose Virtual Try-On Network (SPG-VTON).
We evaluate the proposed method on the largest multi-pose dataset (MPV) and the DeepFashion dataset.
arXiv Detail & Related papers (2021-08-03T15:40:50Z)
- VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization [18.347532903864597]
We propose a novel virtual try-on method called VITON-HD that successfully synthesizes 1024x768 virtual try-on images.
We show that VITON-HD highly surpasses the baselines in terms of synthesized image quality both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-03-31T07:52:41Z)
- PISE: Person Image Synthesis and Editing with Decoupled GAN [64.70360318367943]
We propose PISE, a novel two-stage generative model for Person Image Synthesis and Editing.
For human pose transfer, we first synthesize a human parsing map aligned with the target pose to represent the shape of clothing.
To decouple the shape and style of clothing, we propose joint global and local per-region encoding and normalization.
arXiv Detail & Related papers (2021-03-06T04:32:06Z)
- Semantic Editing On Segmentation Map Via Multi-Expansion Loss [98.1131339357174]
This paper aims to improve the quality of an edited segmentation map conditioned on semantic inputs.
We propose MExGAN for semantic editing on segmentation map, which uses a novel Multi-Expansion (MEx) loss.
Experiments on semantic editing on segmentation map and natural image inpainting show competitive results on four datasets.
arXiv Detail & Related papers (2020-10-16T03:12:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.