Towards Photo-Realistic Virtual Try-On by Adaptively Generating$\leftrightarrow$Preserving Image Content
- URL: http://arxiv.org/abs/2003.05863v1
- Date: Thu, 12 Mar 2020 15:55:39 GMT
- Title: Towards Photo-Realistic Virtual Try-On by Adaptively Generating$\leftrightarrow$Preserving Image Content
- Authors: Han Yang, Ruimao Zhang, Xiaobao Guo, Wei Liu, Wangmeng Zuo, Ping Luo
- Abstract summary: We propose a novel visual try-on network, namely the Adaptive Content Generating and Preserving Network (ACGPN).
ACGPN first predicts the semantic layout of the reference image that will change after try-on.
Second, a clothes warping module warps the clothing image according to the generated semantic layout.
Third, an inpainting module for content fusion integrates all information (e.g. reference image, semantic layout, warped clothes) to adaptively produce each semantic part of the human body.
- Score: 85.24260811659094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image-based virtual try-on aims at transferring a target clothing image
onto a reference person, and has become a hot topic in recent years. Prior art
usually focuses on preserving the character of a clothing image (e.g. texture,
logo, embroidery) when warping it to an arbitrary human pose. However, it
remains a big challenge to generate photo-realistic try-on images when large
occlusions and complex human poses are present in the reference image. To
address this issue, we propose a novel visual try-on network, namely the
Adaptive Content Generating and Preserving Network (ACGPN). In particular,
ACGPN first predicts the semantic layout of the reference image that will
change after try-on (e.g. long sleeve shirt$\rightarrow$arm,
arm$\rightarrow$jacket), and then determines whether its image content needs
to be generated or preserved according to the predicted semantic layout,
leading to photo-realistic try-on results with rich clothing details. ACGPN
comprises three major modules. First, a semantic layout generation module uses
semantic segmentation of the reference image to progressively predict the
desired semantic layout after try-on. Second, a clothes warping module warps
the clothing image according to the generated semantic layout, where a
second-order difference constraint is introduced to stabilize the warping
process during training. Third, an inpainting module for content fusion
integrates all information (e.g. reference image, semantic layout, warped
clothes) to adaptively produce each semantic part of the human body. Compared
with state-of-the-art methods, ACGPN generates photo-realistic images with
much better perceptual quality and richer fine details.
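To make the warping stabilizer concrete, here is a minimal sketch of a second-order difference penalty, assuming the clothes warping module predicts TPS control points on a regular grid; the function name and the B x H x W x 2 tensor layout are illustrative assumptions, and the paper's exact constraint may differ.

```python
import torch

def second_order_smoothness(ctrl_pts: torch.Tensor) -> torch.Tensor:
    """Second-order difference penalty over a regular grid of warp control
    points (hypothetical layout: B x H x W x 2). Penalizing
    p[i-1] - 2*p[i] + p[i+1] along both grid axes pushes neighboring
    control-point displacements toward collinearity, keeping the warp smooth.
    """
    dxx = ctrl_pts[:, :, :-2] - 2 * ctrl_pts[:, :, 1:-1] + ctrl_pts[:, :, 2:]
    dyy = ctrl_pts[:, :-2] - 2 * ctrl_pts[:, 1:-1] + ctrl_pts[:, 2:]
    return dxx.pow(2).sum(-1).mean() + dyy.pow(2).sum(-1).mean()

# Usage: weight this term and add it to the warping objective.
pts = torch.randn(4, 5, 5, 2)        # a batch of 5x5 control-point grids
loss = second_order_smoothness(pts)  # scalar regularization term
```

Weighted into the warping loss, a term like this discourages individual control points from drifting independently and tearing the garment texture, which is one way to read "stabilize the warping process".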
Related papers
- Decoupled Textual Embeddings for Customized Image Generation [62.98933630971543]
Customized text-to-image generation aims to learn user-specified concepts with a few images.
Existing methods usually suffer from overfitting and entangle subject-unrelated information with the learned concept.
We propose DETEX, a novel approach that learns disentangled concept embeddings for flexible customized text-to-image generation.
arXiv Detail & Related papers (2023-12-19T03:32:10Z)
- StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On [35.227896906556026]
Given a clothing image and a person image, image-based virtual try-on aims to generate a customized image that appears natural and accurately reflects the characteristics of the clothing image.
In this work, we aim to expand the applicability of the pre-trained diffusion model so that it can be utilized independently for the virtual try-on task.
Our proposed zero cross-attention blocks not only preserve the clothing details by learning the semantic correspondence but also generate high-fidelity images by utilizing the inherent knowledge of the pre-trained model in the warping process.
arXiv Detail & Related papers (2023-12-04T08:27:59Z)
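As an aside on the StableVITON entry above: one plausible reading of a "zero cross-attention block" is a cross-attention layer from person features to clothing features whose output projection is zero-initialized (in the spirit of zero-convolutions), so the pre-trained diffusion UNet is unchanged when fine-tuning begins. The sketch below follows that assumption; the class name, shapes, and residual placement are illustrative, not the paper's verified design.

```python
import torch
import torch.nn as nn

class ZeroCrossAttention(nn.Module):
    """Hypothetical zero-initialized cross-attention: person tokens attend
    to clothing tokens; the zeroed output projection makes the block an
    identity mapping at initialization, preserving pre-trained behavior."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)
        nn.init.zeros_(self.proj.weight)  # zero init => no-op at start
        nn.init.zeros_(self.proj.bias)

    def forward(self, person: torch.Tensor, cloth: torch.Tensor) -> torch.Tensor:
        # person: (B, N, dim) UNet features; cloth: (B, M, dim) garment features
        out, _ = self.attn(person, cloth, cloth)
        return person + self.proj(out)    # residual; identity at init
```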
- DiffFashion: Reference-based Fashion Design with Structure-aware Transfer by Diffusion Models [4.918209527904503]
We focus on a new fashion design task, where we aim to transfer a reference appearance image onto a clothing image.
It is a challenging task since there are no reference images available for the newly designed output fashion images.
We present a novel diffusion model-based unsupervised structure-aware transfer method to semantically generate new clothes.
arXiv Detail & Related papers (2023-02-14T04:45:44Z)
- OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup [79.3118064406151]
Image-based virtual try-on aims at replacing the clothes on a person image with a garment image (in-shop clothes).
Prior methods successfully preserve the character of clothing images.
Occlusion, however, remains a pernicious obstacle to realistic virtual try-on.
arXiv Detail & Related papers (2023-01-03T06:29:11Z)
- Style-Based Global Appearance Flow for Virtual Try-On [119.95115739956661]
A novel global appearance flow estimation model is proposed in this work.
Experimental results on a popular virtual try-on benchmark show that our method achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-04-03T10:58:04Z)
- Arbitrary Virtual Try-On Network: Characteristics Preservation and Trade-off between Body and Clothing [85.74977256940855]
We propose an Arbitrary Virtual Try-On Network (AVTON) for all types of clothes.
AVTON can synthesize realistic try-on images by preserving and trading off characteristics of the target clothes and the reference person.
Our approach achieves better performance than state-of-the-art virtual try-on methods.
arXiv Detail & Related papers (2021-11-24T08:59:56Z)
- SPG-VTON: Semantic Prediction Guidance for Multi-pose Virtual Try-on [27.870740623131816]
Image-based virtual try-on is challenging because it must fit target in-shop clothes onto a reference person under diverse human poses.
We propose an end-to-end Semantic Prediction Guidance multi-pose Virtual Try-On Network (SPG-VTON).
We evaluate the proposed method on the largest multi-pose dataset (MPV) and the DeepFashion dataset.
arXiv Detail & Related papers (2021-08-03T15:40:50Z)
- Spatial Content Alignment For Pose Transfer [13.018067816407923]
We propose a novel framework to enhance the content consistency of garment textures and the details of human characteristics.
We first alleviate spatial misalignment by transferring edge content to the target pose in advance.
Second, we introduce a new Content-Style DeBlk that progressively synthesizes photo-realistic person images.
arXiv Detail & Related papers (2021-03-31T06:10:29Z)
- PISE: Person Image Synthesis and Editing with Decoupled GAN [64.70360318367943]
We propose PISE, a novel two-stage generative model for Person Image Synthesis and Editing.
For human pose transfer, we first synthesize a human parsing map aligned with the target pose to represent the shape of clothing.
To decouple the shape and style of clothing, we propose joint global and local per-region encoding and normalization.
arXiv Detail & Related papers (2021-03-06T04:32:06Z)
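On the PISE entry above: "joint global and local per-region encoding and normalization" suggests a SEAN-style layer that normalizes features and then modulates each semantic region with its own style code. The sketch below is a minimal, hypothetical version under that assumption; the class name, tensor shapes, and modulation details are our illustration, not PISE's actual layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerRegionNorm(nn.Module):
    """Hypothetical SEAN-style per-region normalization: instance-normalize
    features, then apply a scale/shift predicted from one style code per
    semantic region, broadcast through the one-hot parsing map."""
    def __init__(self, channels: int, style_dim: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.gamma = nn.Linear(style_dim, channels)
        self.beta = nn.Linear(style_dim, channels)

    def forward(self, feat, parsing, styles):
        # feat: (B, C, H, W); parsing: (B, K, H, W) one-hot region masks;
        # styles: (B, K, style_dim), one code per region.
        x = self.norm(feat)
        mask = F.interpolate(parsing, size=feat.shape[2:], mode="nearest")
        gamma = torch.einsum("bkc,bkhw->bchw", self.gamma(styles), mask)
        beta = torch.einsum("bkc,bkhw->bchw", self.beta(styles), mask)
        return x * (1 + gamma) + beta
```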
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.