Deep Spatial Transformation for Pose-Guided Person Image Generation and
Animation
- URL: http://arxiv.org/abs/2008.12606v1
- Date: Thu, 27 Aug 2020 08:59:44 GMT
- Title: Deep Spatial Transformation for Pose-Guided Person Image Generation and
Animation
- Authors: Yurui Ren and Ge Li and Shan Liu and Thomas H. Li
- Abstract summary: Pose-guided person image generation and animation aim to transform a source person image to target poses.
Convolutional Neural Networks are limited by their lack of ability to spatially transform the inputs.
We propose a differentiable global-flow local-attention framework to reassemble the inputs at the feature level.
- Score: 50.10989443332995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pose-guided person image generation and animation aim to transform a source
person image to target poses. These tasks require spatial manipulation of
source data. However, Convolutional Neural Networks are limited by their lack
of ability to spatially transform the inputs. In this paper, we propose a
differentiable global-flow local-attention framework to reassemble the inputs
at the feature level. This framework first estimates global flow fields between
sources and targets. Then, corresponding local source feature patches are
sampled with content-aware local attention coefficients. We show that our
framework can spatially transform the inputs in an efficient manner. Meanwhile,
we further model the temporal consistency for the person image animation task
to generate coherent videos. Experimental results on both the image generation
and animation tasks demonstrate the superiority of our model. In addition,
results of novel view synthesis and face image animation show that
our model is applicable to other tasks requiring spatial transformation. The
source code of our project is available at
https://github.com/RenYurui/Global-Flow-Local-Attention.
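To make the pipeline described in the abstract concrete, here is a minimal sketch of the global-flow local-attention step at the feature level: a predicted flow field displaces each target location to a source location, a small patch around that location is sampled, and content-aware attention coefficients fuse the patch into a warped feature. This is written from the abstract alone and is not the released implementation (see the repository above for that); the function name, the 3x3 patch size, and the dot-product attention are assumptions.

```python
# Illustrative sketch only; not the authors' code.
import torch
import torch.nn.functional as F

def warp_with_local_attention(src_feat, tgt_feat, flow, k=3):
    """Warp source features toward the target pose.

    src_feat: (B, C, H, W) source feature map
    tgt_feat: (B, C, H, W) target (pose-conditioned) feature map
    flow:     (B, 2, H, W) pixel displacement from each target location
              to its corresponding source location
    k:        size of the local patch sampled around each flowed location
    """
    B, C, H, W = src_feat.shape

    # Base sampling grid in absolute pixel coordinates.
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(src_feat.device)   # (2, H, W)
    center = base.unsqueeze(0) + flow                                 # (B, 2, H, W)

    # Offsets of the k x k neighbourhood around each flowed location.
    r = k // 2
    dy, dx = torch.meshgrid(torch.arange(-r, r + 1),
                            torch.arange(-r, r + 1), indexing="ij")
    offsets = torch.stack((dx, dy), dim=0).reshape(2, -1).float().to(src_feat.device)

    patches = []
    for i in range(k * k):
        loc = center + offsets[:, i].view(1, 2, 1, 1)
        # Normalise pixel coordinates to [-1, 1] for grid_sample.
        grid = torch.stack((2 * loc[:, 0] / (W - 1) - 1,
                            2 * loc[:, 1] / (H - 1) - 1), dim=-1)     # (B, H, W, 2)
        patches.append(F.grid_sample(src_feat, grid, align_corners=True))
    patches = torch.stack(patches, dim=2)                             # (B, C, k*k, H, W)

    # Content-aware attention: score each sampled source feature against the
    # target feature, then softmax over the k*k candidates.
    scores = (patches * tgt_feat.unsqueeze(2)).sum(dim=1) / C ** 0.5  # (B, k*k, H, W)
    attn = F.softmax(scores, dim=1).unsqueeze(1)                      # (B, 1, k*k, H, W)
    return (attn * patches).sum(dim=2)                                # (B, C, H, W)
```

Restricting attention to a small flowed neighbourhood, rather than attending over the whole source feature map, is what keeps the spatial transformation efficient in the sense the abstract describes.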
Related papers
- Free-viewpoint Human Animation with Pose-correlated Reference Selection [31.429327964922184]
Diffusion-based human animation aims to animate a human character based on a source human image as well as driving signals such as a sequence of poses.
Existing approaches are able to generate high-fidelity poses, but struggle with significant viewpoint changes.
We propose a pose-correlated reference selection diffusion network, supporting substantial viewpoint variations in human animation.
arXiv Detail & Related papers (2024-12-23T05:22:44Z) - Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation [27.700371215886683]
Diffusion models have become the mainstream in visual generation research, owing to their robust generative capabilities.
In this paper, we propose a novel framework tailored for character animation.
By expanding the training data, our approach can animate arbitrary characters, yielding superior results in character animation compared to other image-to-video methods.
arXiv Detail & Related papers (2023-11-28T12:27:15Z) - Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias (a minimal sketch of this heatmap idea follows after this list).
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z) - Neural Novel Actor: Learning a Generalized Animatable Neural
Representation for Human Actors [98.24047528960406]
We propose a new method for learning a generalized animatable neural representation from a sparse set of multi-view imagery of multiple persons.
The learned representation can be used to synthesize novel view images of an arbitrary person from a sparse set of cameras, and further animate them with the user's pose control.
arXiv Detail & Related papers (2022-08-25T07:36:46Z) - Image Comes Dancing with Collaborative Parsing-Flow Video Synthesis [124.48519390371636]
Transferring human motion from a source to a target person holds great potential in computer vision and graphics applications.
Previous work has either relied on crafted 3D human models or trained a separate model specifically for each target person.
This work studies a more general setting, in which we aim to learn a single model to parsimoniously transfer motion from a source video to any target person.
arXiv Detail & Related papers (2021-10-27T03:42:41Z) - Liquid Warping GAN with Attention: A Unified Framework for Human Image
Synthesis [58.05389586712485]
We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis.
In this paper, we propose a 3D body mesh recovery module to disentangle the pose and shape.
We also build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.
arXiv Detail & Related papers (2020-11-18T02:57:47Z) - Deep Image Spatial Transformation for Person Image Generation [31.966927317737873]
We propose a differentiable global-flow local-attention framework to reassemble the inputs at the feature level.
Our model first calculates the global correlations between sources and targets to predict flow fields.
We warp the source features using a content-aware sampling method with the obtained local attention coefficients.
arXiv Detail & Related papers (2020-03-02T07:31:00Z) - First Order Motion Model for Image Animation [90.712718329677]
Image animation consists of generating a video sequence so that an object in a source image is animated according to the motion of a driving video.
Our framework addresses this problem without using any annotation or prior information about the specific object to animate.
arXiv Detail & Related papers (2020-02-29T07:08:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.