Combining Attention with Flow for Person Image Synthesis
- URL: http://arxiv.org/abs/2108.01823v1
- Date: Wed, 4 Aug 2021 03:05:39 GMT
- Title: Combining Attention with Flow for Person Image Synthesis
- Authors: Yurui Ren, Yubo Wu, Thomas H. Li, Shan Liu, Ge Li
- Abstract summary: We propose a novel model by combining the attention operation with the flow-based operation.
Our model not only takes advantage of the attention operation to generate accurate target structures but also uses the flow-based operation to sample realistic source textures.
- Score: 55.670135403481275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pose-guided person image synthesis aims to synthesize person images by
transforming reference images into target poses. In this paper, we observe that
the commonly used spatial transformation blocks have complementary advantages.
We propose a novel model by combining the attention operation with the
flow-based operation. Our model not only takes advantage of the attention
operation to generate accurate target structures but also uses the flow-based
operation to sample realistic source textures. Both objective and subjective
experiments demonstrate the superiority of our model. Meanwhile, comprehensive
ablation studies verify our hypotheses and show the efficacy of the proposed
modules. In addition, experiments on the portrait image editing task
demonstrate the versatility of the proposed combination.
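To make the described combination concrete, below is a minimal PyTorch sketch of how an attention-based warp and a flow-based warp of source features could be fused. This is not the authors' architecture: the module names, the soft-mask fusion, and all tensor shapes are illustrative assumptions, intended only to show why attention (global matching, good for structure) and flow (local bilinear sampling, good for texture) are complementary.

```python
# Minimal sketch (not the paper's code): fuse an attention-based warp
# with a flow-based warp of source features via a predicted soft mask.
# All names and the fusion scheme are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionFlowBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, 1)              # from target-pose features
        self.key = nn.Conv2d(channels, channels, 1)                # from source features
        self.value = nn.Conv2d(channels, channels, 1)
        self.flow_head = nn.Conv2d(channels * 2, 2, 3, padding=1)  # dense 2-D offsets
        self.mask_head = nn.Conv2d(channels * 2, 1, 3, padding=1)  # fusion mask

    def forward(self, src_feat, tgt_feat):
        b, c, h, w = src_feat.shape

        # Attention branch: every target location attends over all source
        # locations; global matching tends to recover the target structure.
        q = self.query(tgt_feat).flatten(2).transpose(1, 2)        # (b, hw, c)
        k = self.key(src_feat).flatten(2)                          # (b, c, hw)
        v = self.value(src_feat).flatten(2).transpose(1, 2)        # (b, hw, c)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)             # (b, hw, hw)
        attn_out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)

        # Flow branch: predict an offset field and bilinearly sample the
        # source features; local sampling preserves source texture.
        flow = self.flow_head(torch.cat([src_feat, tgt_feat], dim=1))  # (b, 2, h, w)
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=src_feat.device),
            torch.linspace(-1, 1, w, device=src_feat.device),
            indexing="ij",
        )
        base_grid = torch.stack([xs, ys], dim=-1).expand(b, h, w, 2)   # (b, h, w, 2), xy order
        grid = base_grid + flow.permute(0, 2, 3, 1)
        flow_out = F.grid_sample(src_feat, grid, align_corners=True)

        # Fuse the two branches with a predicted per-pixel soft mask.
        mask = torch.sigmoid(self.mask_head(torch.cat([src_feat, tgt_feat], dim=1)))
        return mask * attn_out + (1 - mask) * flow_out


# Example usage with random feature maps:
# block = AttentionFlowBlock(64)
# out = block(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```

The soft mask here is only one plausible way to let the network pick, per location, between the structurally accurate attention output and the texture-preserving flow output; the paper should be consulted for the actual fusion design.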
Related papers
- Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On [29.217423805933727]
Diffusion model-based approaches have recently become popular, as they are excellent at image synthesis tasks.
We propose a Texture-Preserving Diffusion (TPD) model for virtual try-on, which enhances the fidelity of the results.
We also propose a novel diffusion-based method that predicts a precise inpainting mask based on the person and reference garment images.
arXiv Detail & Related papers (2024-04-01T12:43:22Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-07-19T10:01:31Z)
- Optical Flow Estimation from a Single Motion-blurred Image [66.2061278123057]
Motion blur in an image can be of practical interest for fundamental computer vision problems.
We propose a novel framework to estimate optical flow from a single motion-blurred image in an end-to-end manner.
arXiv Detail & Related papers (2021-03-04T12:45:18Z)
- Region-adaptive Texture Enhancement for Detailed Person Image Synthesis [86.69934638569815]
RATE-Net is a novel framework for synthesizing person images with sharp texture details.
The proposed framework leverages an additional texture enhancing module to extract appearance information from the source image.
Experiments conducted on the DeepFashion benchmark dataset demonstrate the superiority of our framework compared with existing networks.
arXiv Detail & Related papers (2020-05-26T02:33:21Z)
- Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision [83.33283892171562]
Example-guided image synthesis aims to synthesize an image from a semantic label map and an exemplary image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
arXiv Detail & Related papers (2020-04-18T18:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.