Liquid Warping GAN with Attention: A Unified Framework for Human Image
Synthesis
- URL: http://arxiv.org/abs/2011.09055v2
- Date: Mon, 23 Nov 2020 04:50:43 GMT
- Title: Liquid Warping GAN with Attention: A Unified Framework for Human Image
Synthesis
- Authors: Wen Liu, Zhixin Piao, Zhi Tu, Wenhan Luo, Lin Ma and Shenghua Gao
- Abstract summary: We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis.
In this paper, we propose a 3D body mesh recovery module to disentangle pose and shape.
We also build a new dataset, namely the iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.
- Score: 58.05389586712485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We tackle human image synthesis, including human motion imitation,
appearance transfer, and novel view synthesis, within a unified framework:
once trained, the model can handle all of these tasks. Existing task-specific
methods mainly use 2D keypoints to estimate the human body structure. However,
keypoints express only position; they can neither characterize the
personalized shape of the person nor model limb rotations. In this paper, we
propose to use a 3D body mesh recovery module to disentangle pose and shape.
It can model not only the joint locations and rotations but also the
personalized body shape. To preserve source information, such as texture,
style, color, and face identity, we propose an Attentional Liquid Warping GAN
with an Attentional Liquid Warping Block (AttLWB) that propagates the source
information to the synthesized reference in both image and feature spaces.
Specifically, the source features are extracted by a denoising convolutional
auto-encoder to characterize the source identity well. Furthermore, our method
supports more flexible warping from multiple sources. To further improve
generalization to unseen source images, one/few-shot adversarial learning is
applied: the model is first trained on an extensive training set and then
fine-tuned on the one or few unseen images in a self-supervised way to
generate high-resolution (512 x 512 and 1024 x 1024) results. We also build a
new dataset, the iPER dataset, for the evaluation of human motion imitation,
appearance transfer, and novel view synthesis. Extensive experiments
demonstrate the effectiveness of our method in preserving face identity, shape
consistency, and clothing details. All code and the dataset are available at
https://impersonator.org/work/impersonator-plus-plus.html.
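The mechanism the abstract describes can be pictured as flow-guided feature warping followed by attention-weighted fusion. Below is a minimal PyTorch sketch, not the authors' actual AttLWB implementation: all names are hypothetical, and the sampling grids are assumed to come from the 3D body-mesh correspondences. It warps encoder features from K source images into the reference pose and fuses them with a softmax attention map, which is what makes warping from multiple sources possible.

```python
import torch
import torch.nn.functional as F

def attentional_liquid_warping(src_feats, grids, attn_logits):
    """Warp per-source features into the reference frame and fuse them.

    src_feats:   list of K tensors (B, C, H, W), encoder features of each source.
    grids:       list of K tensors (B, H, W, 2), sampling grids in [-1, 1]
                 (assumed derived from 3D mesh correspondences).
    attn_logits: tensor (B, K, H, W), per-source attention scores.
    """
    warped = [
        F.grid_sample(f, g, mode="bilinear", padding_mode="zeros",
                      align_corners=False)
        for f, g in zip(src_feats, grids)
    ]
    warped = torch.stack(warped, dim=1)         # (B, K, C, H, W)
    attn = torch.softmax(attn_logits, dim=1)    # normalize over the K sources
    return (warped * attn.unsqueeze(2)).sum(1)  # (B, C, H, W)

# Toy usage: two sources, 32-channel features on a 64 x 64 grid.
B, K, C, H, W = 1, 2, 32, 64, 64
feats = [torch.randn(B, C, H, W) for _ in range(K)]
grids = [torch.rand(B, H, W, 2) * 2 - 1 for _ in range(K)]
out = attentional_liquid_warping(feats, grids, torch.randn(B, K, H, W))
assert out.shape == (B, C, H, W)
```

The one/few-shot step would then fine-tune such a generator on the unseen source image(s) with a self-supervised reconstruction loss before synthesizing the target poses.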
Related papers
- One-shot Implicit Animatable Avatars with Model-based Priors [31.385051428938585]
ELICIT is a novel method for learning human-specific neural radiance fields from a single image.
ELICIT outperforms strong baseline methods for avatar creation when only a single image is available.
arXiv Detail & Related papers (2022-12-05T18:24:06Z)
- Pose Guided Human Image Synthesis with Partially Decoupled GAN [25.800174118151638]
Pose Guided Human Image Synthesis (PGHIS) is the challenging task of transforming a human image from a reference pose to a target pose.
We propose a method that decouples the human body into several parts to guide the synthesis of a realistic image of the person.
In addition, we design a multi-head attention-based module for PGHIS.
arXiv Detail & Related papers (2022-10-07T15:31:37Z)
- NeuralReshaper: Single-image Human-body Retouching with Deep Neural Networks [50.40798258968408]
We present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.
Our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image.
To deal with the lack of paired training data, we introduce a novel self-supervised strategy to train our network.
arXiv Detail & Related papers (2022-03-20T09:02:13Z)
- Image Comes Dancing with Collaborative Parsing-Flow Video Synthesis [124.48519390371636]
Transferring human motion from a source to a target person has great potential in computer vision and graphics applications.
Previous work has either relied on crafted 3D human models or trained a separate model specifically for each target person.
This work studies a more general setting, in which we aim to learn a single model to parsimoniously transfer motion from a source video to any target person.
arXiv Detail & Related papers (2021-10-27T03:42:41Z)
- Creating and Reenacting Controllable 3D Humans with Differentiable Rendering [3.079885946230076]
This paper proposes a new end-to-end neural rendering architecture to transfer appearance and reenact human actors.
Our method leverages a carefully designed graph convolutional network (GCN) to model the human body manifold structure.
By taking advantage of both differentiable rendering and the 3D parametric model, our method is fully controllable.
arXiv Detail & Related papers (2021-10-22T12:40:09Z)
- Neural-GIF: Neural Generalized Implicit Functions for Animating People in Clothing [49.32522765356914]
We learn to animate people in clothing as a function of the body pose.
We learn to map every point in the space to a canonical space, where a learned deformation field is applied to model non-rigid effects.
Neural-GIF can be trained on raw 3D scans and reconstructs detailed complex surface geometry and deformations.
arXiv Detail & Related papers (2021-08-19T17:25:16Z)
- Detailed Avatar Recovery from Single Image [50.82102098057822]
This paper presents a novel framework to recover a detailed avatar from a single image.
We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation framework.
Our method can restore detailed human body shapes with complete textures beyond skinned models.
arXiv Detail & Related papers (2021-08-06T03:51:26Z)
- Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction [123.62341095156611]
Implicit functions represented as deep learning approximations are powerful for reconstructing 3D surfaces.
Such features are essential in building flexible models for both computer graphics and computer vision.
We present methodology that combines detail-rich implicit functions and parametric representations.
arXiv Detail & Related papers (2020-07-22T13:46:14Z)