Learning Semantic Person Image Generation by Region-Adaptive
Normalization
- URL: http://arxiv.org/abs/2104.06650v1
- Date: Wed, 14 Apr 2021 06:51:37 GMT
- Title: Learning Semantic Person Image Generation by Region-Adaptive
Normalization
- Authors: Zhengyao Lv, Xiaoming Li, Xin Li, Fu Li, Tianwei Lin, Dongliang He and
Wangmeng Zuo
- Abstract summary: We propose a new two-stage framework to handle the pose and appearance translation.
In the first stage, we predict the target semantic parsing maps to eliminate the difficulties of pose transfer.
In the second stage, we suggest a new person image generation method by incorporating the region-adaptive normalization.
- Score: 81.52223606284443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose transfer has received great attention due to its wide
applications, yet it remains a challenging task that is not fully solved.
Recent works have achieved great success in transferring a person image from
the source to the target pose. However, most of them cannot capture the
semantic appearance well, resulting in inconsistent and less realistic
textures in the reconstructed results. To address this issue, we propose a new two-stage
framework to handle the pose and appearance translation. In the first stage, we
predict the target semantic parsing maps to eliminate the difficulties of pose
transfer and further benefit the latter translation of per-region appearance
style. In the second one, with the predicted target semantic maps, we suggest a
new person image generation method by incorporating the region-adaptive
normalization, in which it takes the per-region styles to guide the target
appearance generation. Extensive experiments show that our proposed SPGNet can
generate more semantic, consistent, and photo-realistic results and perform
favorably against the state of the art methods in terms of quantitative and
qualitative evaluation. The source code and model are available at
https://github.com/cszy98/SPGNet.git.
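The core idea of region-adaptive normalization described above — normalize the feature map, then modulate each position with the style of the semantic region it belongs to — can be illustrated with a minimal sketch. This is a simplified 1-D, pure-Python illustration under stated assumptions: the function name, the flat list-of-pixels layout, and the dict of per-region `(gamma, beta)` styles are all hypothetical; the actual SPGNet operates on convolutional feature maps with learned per-region style vectors.

```python
import math

def region_adaptive_norm(features, mask, styles, eps=1e-5):
    """Normalize a 1-D feature map, then modulate each pixel with the
    (gamma, beta) style of its semantic region.

    features: list of floats, one activation per pixel
    mask:     list of ints, mask[i] = semantic region id of pixel i
    styles:   dict mapping region id -> (gamma, beta)

    Illustrative names only; not the paper's actual API.
    """
    n = len(features)
    mean = sum(features) / n
    var = sum((x - mean) ** 2 for x in features) / n
    normalized = [(x - mean) / math.sqrt(var + eps) for x in features]
    # Per-region modulation: each semantic region injects its own
    # appearance style into the normalized activations.
    return [styles[r][0] * x + styles[r][1] for x, r in zip(normalized, mask)]

# Example: two regions (say "skin" = 0, "clothing" = 1) with distinct styles.
feats = [1.0, 2.0, 3.0, 4.0]
mask = [0, 0, 1, 1]
styles = {0: (1.0, 0.0), 1: (2.0, 1.0)}
out = region_adaptive_norm(feats, mask, styles)
```

Because modulation parameters are selected by the parsing map rather than shared globally, each semantic region (hair, skin, clothing) can receive a distinct appearance style, which is what lets the predicted semantic maps from the first stage guide per-region texture generation in the second.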
Related papers
- Conditional Score Guidance for Text-Driven Image-to-Image Translation [52.73564644268749]
We present a novel algorithm for text-driven image-to-image translation based on a pretrained text-to-image diffusion model.
Our method aims to generate a target image by selectively editing the regions of interest in a source image.
arXiv Detail & Related papers (2023-05-29T10:48:34Z)
- Facial Expression Translation using Landmark Guided GANs [84.64650795005649]
We propose a powerful Landmark guided Generative Adversarial Network (LandmarkGAN) for the facial expression-to-expression translation.
The proposed LandmarkGAN achieves better results compared with state-of-the-art approaches while using only a single image.
arXiv Detail & Related papers (2022-09-05T20:52:42Z)
- TIPS: Text-Induced Pose Synthesis [24.317541784957285]
In computer vision, human pose synthesis and transfer deal with probabilistic image generation of a person in a previously unseen pose.
We first present the shortcomings of current pose transfer algorithms and then propose a novel text-based pose transfer technique to address those issues.
The proposed method generates promising results with significant qualitative and quantitative scores in our experiments.
arXiv Detail & Related papers (2022-07-24T11:14:46Z)
- Boosting Image Outpainting with Semantic Layout Prediction [18.819765707811904]
We train a GAN to extend regions in the semantic segmentation domain instead of the image domain.
Another GAN model is trained to synthesize real images based on the extended semantic layouts.
Our approach can handle semantic clues more easily and hence works better in complex scenarios.
arXiv Detail & Related papers (2021-10-18T13:09:31Z)
- Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful for estimating the missing contents.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
arXiv Detail & Related papers (2021-06-14T08:09:43Z)
- Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
- PISE: Person Image Synthesis and Editing with Decoupled GAN [64.70360318367943]
We propose PISE, a novel two-stage generative model for Person Image Synthesis and Editing.
For human pose transfer, we first synthesize a human parsing map aligned with the target pose to represent the shape of clothing.
To decouple the shape and style of clothing, we propose joint global and local per-region encoding and normalization.
arXiv Detail & Related papers (2021-03-06T04:32:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.