Two-Stream Appearance Transfer Network for Person Image Generation
- URL: http://arxiv.org/abs/2011.04181v1
- Date: Mon, 9 Nov 2020 04:21:02 GMT
- Title: Two-Stream Appearance Transfer Network for Person Image Generation
- Authors: Chengkang Shen, Peiyan Wang and Wei Tang
- Abstract summary: Generative adversarial networks (GANs) widely used for image generation and translation rely on spatially local and translation-equivariant operators.
This paper introduces a novel two-stream appearance transfer network (2s-ATN) to address this challenge.
It is a multi-stage architecture consisting of a source stream and a target stream. Each stage features an appearance transfer module and several two-stream feature fusion modules.
- Score: 16.681839931864886
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pose-guided person image generation aims to generate a photo-realistic
person image conditioned on an input person image and a desired pose. This task
requires spatial manipulation of the source image according to the target pose.
However, the generative adversarial networks (GANs) widely used for image
generation and translation rely on spatially local and translation-equivariant
operators, i.e., convolution, pooling, and unpooling, which cannot handle large
image deformation. This paper introduces a novel two-stream appearance transfer
network (2s-ATN) to address this challenge. It is a multi-stage architecture
consisting of a source stream and a target stream. Each stage features an
appearance transfer module and several two-stream feature fusion modules. The
former finds the dense correspondence between the two-stream feature maps and
then transfers the appearance information from the source stream to the target
stream. The latter exchange local information between the two streams,
supplementing the non-local appearance transfer. Both quantitative and qualitative
results indicate that the proposed 2s-ATN can effectively handle large spatial
deformation and occlusion while preserving appearance details. It
outperforms prior state-of-the-art methods on two widely used benchmarks.
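The two components described above can be made concrete with a small sketch. The PyTorch module names, layer shapes, and the choice of dot-product attention as the dense-correspondence mechanism below are illustrative assumptions, not the paper's exact design: each target-stream location attends over all source-stream locations and aggregates source appearance (the non-local transfer), while a lightweight fusion module exchanges local information between the two streams with 3x3 convolutions.

```python
import torch
import torch.nn as nn


class AppearanceTransfer(nn.Module):
    """Sketch of non-local appearance transfer between two streams.

    Dense correspondence is modeled as dot-product attention: every target
    location attends over all source locations and aggregates source
    appearance features. Names and dimensions are illustrative.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 2, 1)  # from target features
        self.key = nn.Conv2d(channels, channels // 2, 1)    # from source features
        self.value = nn.Conv2d(channels, channels, 1)        # source appearance

    def forward(self, src_feat: torch.Tensor, tgt_feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = tgt_feat.shape
        q = self.query(tgt_feat).flatten(2).transpose(1, 2)         # (B, HW, C/2)
        k = self.key(src_feat).flatten(2)                           # (B, C/2, HW)
        v = self.value(src_feat).flatten(2).transpose(1, 2)         # (B, HW, C)
        attn = torch.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)  # dense correspondence
        transferred = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return tgt_feat + transferred  # inject source appearance into the target stream


class TwoStreamFusion(nn.Module):
    """Local information exchange between the two streams, supplementing the
    non-local transfer: concatenate both streams and update each with a 3x3 conv."""

    def __init__(self, channels: int):
        super().__init__()
        self.to_src = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.to_tgt = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, src_feat: torch.Tensor, tgt_feat: torch.Tensor):
        joint = torch.cat([src_feat, tgt_feat], dim=1)
        return src_feat + self.to_src(joint), tgt_feat + self.to_tgt(joint)
```

In a multi-stage generator in the spirit of 2s-ATN, each stage would apply one appearance transfer step followed by several fusion steps before passing both streams to the next stage; the exact stage layout is not specified here.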
Related papers
- Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation [41.341693150031546]
We present a new multi-modal face image generation method that converts a text prompt and a visual input, such as a semantic mask or map, into a photo-realistic face image.
We present a simple mapping and a style modulation network to link two models and convert meaningful representations in feature maps and attention maps into latent codes.
Our proposed network produces realistic 2D, multi-view, and stylized face images, which align well with inputs.
arXiv Detail & Related papers (2024-05-07T14:33:40Z) - S2ST: Image-to-Image Translation in the Seed Space of Latent Diffusion [23.142097481682306]
We introduce S2ST, a novel framework designed to accomplish global I2IT in complex images.
S2ST operates within the seed space of a Latent Diffusion Model, thereby leveraging the powerful image priors learned by the latter.
We show that S2ST surpasses state-of-the-art GAN-based I2IT methods, as well as diffusion-based approaches, for complex automotive scenes.
arXiv Detail & Related papers (2023-11-30T18:59:49Z) - SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial
Network for an end-to-end image translation [18.93434486338439]
SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images.
For more realistic and diverse image generation, we introduce a style reference image.
We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
arXiv Detail & Related papers (2023-11-07T10:29:16Z) - Guided Image-to-Image Translation by Discriminator-Generator
Communication [71.86347329356244]
The goal of image-to-image (I2I) translation is to transfer an image from a source domain to a target domain.
One major branch of this research formulates I2I translation based on Generative Adversarial Networks (GANs).
arXiv Detail & Related papers (2023-03-07T02:29:36Z) - StyTr^2: Unbiased Image Style Transfer with Transformers [59.34108877969477]
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.
Traditional neural style transfer methods are usually biased, and content leakage can be observed by running the style transfer process several times with the same reference image.
We propose a transformer-based approach, namely StyTr2, to address this critical issue.
arXiv Detail & Related papers (2021-05-30T15:57:09Z) - Diverse Image Inpainting with Bidirectional and Autoregressive
Transformers [55.21000775547243]
We propose BAT-Fill, an image inpainting framework with a novel bidirectional autoregressive transformer (BAT).
BAT-Fill inherits the merits of transformers and CNNs in a two-stage manner, which allows it to generate high-resolution contents without being constrained by the quadratic complexity of attention in transformers.
arXiv Detail & Related papers (2021-04-26T03:52:27Z) - Progressive and Aligned Pose Attention Transfer for Person Image
Generation [59.87492938953545]
This paper proposes a new generative adversarial network for pose transfer, i.e., transferring the pose of a given person to a target pose.
We use two types of blocks, namely the Pose-Attentional Transfer Block (PATB) and the Aligned Pose-Attentional Transfer Block (APATB).
We verify the efficacy of the model on the Market-1501 and DeepFashion datasets, using quantitative and qualitative measures.
arXiv Detail & Related papers (2021-03-22T07:24:57Z) - DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by
Transferring from GANs [43.33066765114446]
Image-to-image translation suffers from inferior performance when translations between classes require large shape changes.
We propose a novel deep hierarchical Image-to-Image Translation method, called DeepI2I.
We demonstrate that transfer learning significantly improves the performance of I2I systems, especially for small datasets.
arXiv Detail & Related papers (2020-11-11T16:03:03Z) - Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2
Network [73.5062435623908]
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations.
By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
arXiv Detail & Related papers (2020-10-12T13:51:40Z) - XingGAN for Person Image Generation [149.54517767056382]
We propose a novel Generative Adversarial Network (XingGAN) for person image generation tasks.
XingGAN consists of two generation branches that model the person's appearance and shape information.
We show that the proposed XingGAN advances the state-of-the-art performance in terms of objective quantitative scores and subjective visual realness.
arXiv Detail & Related papers (2020-07-17T23:40:22Z)