Deep Image Spatial Transformation for Person Image Generation
- URL: http://arxiv.org/abs/2003.00696v2
- Date: Wed, 18 Mar 2020 09:42:02 GMT
- Title: Deep Image Spatial Transformation for Person Image Generation
- Authors: Yurui Ren, Xiaoming Yu, Junming Chen, Thomas H. Li, Ge Li
- Abstract summary: We propose a differentiable global-flow local-attention framework to reassemble the inputs at the feature level.
Our model first calculates the global correlations between sources and targets to predict flow fields.
We warp the source features using a content-aware sampling method with the obtained local attention coefficients.
- Score: 31.966927317737873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pose-guided person image generation is to transform a source person image to
a target pose. This task requires spatial manipulations of source data.
However, Convolutional Neural Networks are limited by the lack of ability to
spatially transform the inputs. In this paper, we propose a differentiable
global-flow local-attention framework to reassemble the inputs at the feature
level. Specifically, our model first calculates the global correlations between
sources and targets to predict flow fields. Then, the flowed local patch pairs
are extracted from the feature maps to calculate the local attention
coefficients. Finally, we warp the source features using a content-aware
sampling method with the obtained local attention coefficients. The results of
both subjective and objective experiments demonstrate the superiority of our
model. Besides, additional results in video animation and view synthesis show
that our model is applicable to other tasks requiring spatial transformation.
Our source code is available at
https://github.com/RenYurui/Global-Flow-Local-Attention.
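The three steps in the abstract (predict a flow field, compute local attention coefficients over flowed patch pairs, then sample the source features with those coefficients) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function name `local_attention_warp`, the `(dy, dx)` flow layout, the nearest-neighbour rounding, and the border clamping are assumptions made for illustration. The flow field is taken as given rather than predicted from global correlations, and a plain dot-product softmax stands in for whatever learned module produces the attention coefficients in the paper.

```python
import numpy as np

def local_attention_warp(f_src, f_tgt, flow, n=3):
    """Warp source features toward the target using flow plus local attention.

    f_src, f_tgt: feature maps of shape (H, W, C).
    flow: shape (H, W, 2), hypothetical (dy, dx) offsets from each target
          location to its matching source location.
    n: odd side length of the local source patch.
    """
    f_src = np.asarray(f_src, dtype=float)
    f_tgt = np.asarray(f_tgt, dtype=float)
    H, W, _ = f_src.shape
    r = n // 2
    offsets = [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
    out = np.zeros_like(f_src)
    for y in range(H):
        for x in range(W):
            # Patch centre in the source: target location plus flow,
            # rounded to the nearest pixel (assumption: no bilinear sampling).
            cy = int(np.rint(y + flow[y, x, 0]))
            cx = int(np.rint(x + flow[y, x, 1]))
            # Gather the n*n source features around the flowed location,
            # clamping coordinates at the image border.
            patch = np.stack([
                f_src[np.clip(cy + dy, 0, H - 1), np.clip(cx + dx, 0, W - 1)]
                for dy, dx in offsets
            ])  # shape (n*n, C)
            # Stand-in for learned attention: softmax over dot-product
            # similarity between the target feature and each patch feature.
            scores = patch @ f_tgt[y, x]
            scores = scores - scores.max()  # numerical stability
            weights = np.exp(scores)
            weights /= weights.sum()
            # Content-aware sampling: attention-weighted sum of the patch.
            out[y, x] = weights @ patch
    return out

# Small demo with random features and a flow that points one pixel right.
rng = np.random.default_rng(0)
f_s = rng.normal(size=(4, 5, 3))
f_t = rng.normal(size=(4, 5, 3))
flow = np.zeros((4, 5, 2))
flow[..., 1] = 1.0
warped = local_attention_warp(f_s, f_t, flow, n=3)
print(warped.shape)
```

With `n=1` the attention collapses to a single weight of 1 and the function reduces to plain flow-based warping, which makes the role of the local patch easy to see: larger `n` lets each output feature blend nearby source features according to their similarity to the target.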
Related papers
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- TopNet: Transformer-based Object Placement Network for Image Compositing [43.14411954867784]
Local clues in background images are important to determine the compatibility of placing objects with certain locations/scales.
We propose to learn the correlation between object features and all local background features with a transformer module.
Our new formulation generates a 3D heatmap indicating the plausibility of all location/scale combinations in one network forward pass.
arXiv Detail & Related papers (2023-04-06T20:58:49Z)
- Global and Local Alignment Networks for Unpaired Image-to-Image Translation [170.08142745705575]
The goal of unpaired image-to-image translation is to produce an output image reflecting the target domain's style.
Due to the lack of attention to the content change in existing methods, semantic information from source images suffers from degradation during translation.
We introduce a novel approach, Global and Local Alignment Networks (GLA-Net).
Our method effectively generates sharper and more realistic images than existing approaches.
arXiv Detail & Related papers (2021-11-19T18:01:54Z) - Monocular Human Shape and Pose with Dense Mesh-borne Local Image
Features [8.422257363944295]
We propose to improve on graph convolution based approaches for human shape and pose estimation using pixel-aligned local image features.
Our results on standard benchmarks show that using local features improves on global ones and leads to competitive performances with respect to the state-of-the-art.
arXiv Detail & Related papers (2021-11-09T18:43:18Z) - Liquid Warping GAN with Attention: A Unified Framework for Human Image
Synthesis [58.05389586712485]
We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis.
In this paper, we propose a 3D body mesh recovery module to disentangle the pose and shape.
We also build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.
arXiv Detail & Related papers (2020-11-18T02:57:47Z) - Deep Spatial Transformation for Pose-Guided Person Image Generation and
Animation [50.10989443332995]
Pose-guided person image generation and animation aim to transform a source person image to target poses.
Convolutional Neural Networks are limited by the lack of ability to spatially transform the inputs.
We propose a differentiable global-flow local-attention framework to reassemble the inputs at the feature level.
arXiv Detail & Related papers (2020-08-27T08:59:44Z) - Set Based Stochastic Subsampling [85.5331107565578]
We propose a set-based two-stage end-to-end neural subsampling model that is jointly optimized with an arbitrary downstream task network.
We show that it outperforms the relevant baselines under low subsampling rates on a variety of tasks including image classification, image reconstruction, function reconstruction and few-shot classification.
arXiv Detail & Related papers (2020-06-25T07:36:47Z) - Neural Pose Transfer by Spatially Adaptive Instance Normalization [73.04483812364127]
We propose the first neural pose transfer model that solves the pose transfer via the latest technique for image style transfer.
Our model does not require any correspondences between the source and target meshes.
Experiments show that the proposed model can effectively transfer deformation from source to target meshes, and has good generalization ability to deal with unseen identities or poses of meshes.
arXiv Detail & Related papers (2020-03-16T14:33:59Z) - Unifying Deep Local and Global Features for Image Search [9.614694312155798]
We unify global and local image features into a single deep model, enabling accurate retrieval with efficient feature extraction.
Our model achieves state-of-the-art image retrieval on the Revisited Oxford and Paris datasets, and state-of-the-art single-model instance-level recognition on the Google Landmarks dataset v2.
arXiv Detail & Related papers (2020-01-14T19:59:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.