High-Fidelity Neural Human Motion Transfer from Monocular Video
- URL: http://arxiv.org/abs/2012.10974v1
- Date: Sun, 20 Dec 2020 16:54:38 GMT
- Title: High-Fidelity Neural Human Motion Transfer from Monocular Video
- Authors: Moritz Kappel, Vladislav Golyanik, Mohamed Elgharib, Jann-Ole Henningson, Hans-Peter Seidel, Susana Castillo, Christian Theobalt, and Marcus Magnor
- Abstract summary: Video-based human motion transfer creates video animations of humans following a source motion.
We present a new framework which performs high-fidelity and temporally-consistent human motion transfer with natural pose-dependent non-rigid deformations.
In the experimental results, we significantly outperform the state-of-the-art in terms of video realism.
- Score: 71.75576402562247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video-based human motion transfer creates video animations of humans
following a source motion. Current methods show remarkable results for
tightly-clad subjects. However, the lack of temporally consistent handling of
plausible clothing dynamics, including fine and high-frequency details,
significantly limits the attainable visual quality. We address these
limitations for the first time in the literature and present a new framework
which performs high-fidelity and temporally-consistent human motion transfer
with natural pose-dependent non-rigid deformations, for several types of loose
garments. In contrast to the previous techniques, we perform image generation
in three subsequent stages, synthesizing human shape, structure, and
appearance. Given a monocular RGB video of an actor, we train a stack of
recurrent deep neural networks that generate these intermediate representations
from 2D poses and their temporal derivatives. Splitting the difficult motion
transfer problem into subtasks that are aware of the temporal motion context
helps us to synthesize results with plausible dynamics and pose-dependent
detail. It also allows artistic control of results by manipulation of
individual framework stages. In the experimental results, we significantly
outperform the state-of-the-art in terms of video realism. Our code and data
will be made publicly available.
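For intuition, the staged design can be sketched roughly as follows in PyTorch; the module sizes, the use of plain GRUs, and the flattened pose/derivative inputs are illustrative assumptions rather than the authors' architecture:

```python
import torch
import torch.nn as nn

class RecurrentStage(nn.Module):
    """One recurrent stage: consumes the previous stage's output plus the
    temporally-aware pose signal and emits the next intermediate representation."""
    def __init__(self, in_dim, out_dim, hidden=256):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x):                              # x: (B, T, in_dim)
        h, _ = self.gru(x)
        return self.head(h)                            # (B, T, out_dim)

class MotionTransferSketch(nn.Module):
    """Three stages in sequence: shape -> structure -> appearance (hypothetical sizes)."""
    def __init__(self, n_joints=25, shape_dim=64, struct_dim=64, app_dim=3 * 64 * 64):
        super().__init__()
        pose_dim = 2 * 2 * n_joints                    # 2D joints plus temporal derivatives
        self.shape_stage = RecurrentStage(pose_dim, shape_dim)
        self.structure_stage = RecurrentStage(shape_dim + pose_dim, struct_dim)
        self.appearance_stage = RecurrentStage(struct_dim + pose_dim, app_dim)

    def forward(self, poses):                          # poses: (B, T, 2*n_joints)
        vel = torch.cat([torch.zeros_like(poses[:, :1]),
                         poses[:, 1:] - poses[:, :-1]], dim=1)   # finite-difference motion
        x = torch.cat([poses, vel], dim=-1)
        shape = self.shape_stage(x)
        structure = self.structure_stage(torch.cat([shape, x], dim=-1))
        appearance = self.appearance_stage(torch.cat([structure, x], dim=-1))
        return shape, structure, appearance

shape, structure, rgb = MotionTransferSketch()(torch.randn(1, 30, 50))  # 30 frames, 25 joints
```

In this sketch each stage only sees the previous stage's output plus the pose signal, which is the property that makes per-stage manipulation (and hence the artistic control mentioned above) possible.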
Related papers
- Do As I Do: Pose Guided Human Motion Copy [39.40271266234068]
Motion copy is an intriguing yet challenging task in artificial intelligence and computer vision.
Existing approaches typically adopt a conventional GAN with an L1 or L2 loss to produce the target fake video.
We present an episodic memory module in the pose-to-appearance generation stage to propel continuous learning.
Our method significantly outperforms state-of-the-art approaches and gains 7.2% and 12.4% improvements in PSNR and FID respectively.
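As a rough illustration of an episodic memory lookup in pose-to-appearance generation (the slot count, feature dimensions, and fusion layer below are assumptions, not the paper's design):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EpisodicMemory(nn.Module):
    """Toy key-value memory: pose features query stored appearance entries,
    and the recalled entry is fused back into the generator feature."""
    def __init__(self, slots=128, key_dim=64, val_dim=256):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(slots, key_dim))
        self.values = nn.Parameter(torch.randn(slots, val_dim))
        self.fuse = nn.Linear(2 * val_dim, val_dim)

    def forward(self, pose_feat, app_feat):            # (B, key_dim), (B, val_dim)
        attn = F.softmax(pose_feat @ self.keys.t(), dim=-1)   # (B, slots)
        recalled = attn @ self.values                          # (B, val_dim)
        return self.fuse(torch.cat([app_feat, recalled], dim=-1))

mem = EpisodicMemory()
out = mem(torch.randn(4, 64), torch.randn(4, 256))     # fused feature, shape (4, 256)
```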
arXiv Detail & Related papers (2024-06-24T12:41:51Z)
- MotionBERT: A Unified Perspective on Learning Human Motion Representations [46.67364057245364]
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.
We propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations.
We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network.
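A minimal sketch of this pretraining objective, with a plain Transformer encoder standing in for the dual-stream DSTformer and synthetic corruption simulating noisy, partial 2D detections (all sizes below are assumptions):

```python
import torch
import torch.nn as nn

class Lifter2Dto3D(nn.Module):
    """Lifts a 2D pose sequence to 3D; a stand-in for the DSTformer encoder."""
    def __init__(self, n_joints=17, dim=128, layers=4):
        super().__init__()
        self.embed = nn.Linear(2 * n_joints, dim)
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.head = nn.Linear(dim, 3 * n_joints)

    def forward(self, pose2d):                          # (B, T, 2*n_joints)
        return self.head(self.encoder(self.embed(pose2d)))

def corrupt(pose2d, noise=0.02, drop=0.15):
    """Simulate noisy, partial 2D detections: coordinate jitter plus random dropout."""
    noisy = pose2d + noise * torch.randn_like(pose2d)
    mask = (torch.rand_like(pose2d) > drop).float()
    return noisy * mask

model = Lifter2Dto3D()
pose2d, pose3d = torch.randn(2, 81, 34), torch.randn(2, 81, 51)
loss = nn.functional.mse_loss(model(corrupt(pose2d)), pose3d)   # pretraining objective
```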
arXiv Detail & Related papers (2022-10-12T19:46:25Z)
- Learning Motion-Dependent Appearance for High-Fidelity Rendering of Dynamic Humans from a Single Camera [49.357174195542854]
A key challenge in learning appearance dynamics lies in the prohibitively large number of observations required.
We show that our method can generate a temporally coherent video of dynamic humans for unseen body poses and novel views given a single view video.
arXiv Detail & Related papers (2022-03-24T00:22:03Z)
- Dance In the Wild: Monocular Human Animation with Neural Dynamic Appearance Synthesis [56.550999933048075]
We propose a video-based synthesis method that tackles the challenges of in-the-wild footage and demonstrates high-quality results.
We introduce a novel motion signature that is used to modulate the generator weights to capture dynamic appearance changes.
We evaluate our method on a set of challenging videos and show that our approach achieves state-of-the art performance both qualitatively and quantitatively.
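One generic way such weight modulation can look, assuming a motion-signature vector has already been computed from recent poses (the layer sizes and per-channel scaling scheme are illustrative, not the paper's exact operator):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MotionModulatedConv(nn.Module):
    """Convolution whose weights are rescaled per input channel by a motion code,
    in the spirit of modulating generator weights with a motion signature."""
    def __init__(self, motion_dim, in_ch, out_ch, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
        self.to_scale = nn.Linear(motion_dim, in_ch)

    def forward(self, feat, motion_code):               # feat: (B, in_ch, H, W)
        scale = self.to_scale(motion_code)               # (B, in_ch)
        out = []
        for b in range(feat.shape[0]):                   # per-sample modulated weights
            w = self.weight * scale[b].view(1, -1, 1, 1)
            out.append(F.conv2d(feat[b:b + 1], w, padding=1))
        return torch.cat(out, dim=0)

layer = MotionModulatedConv(motion_dim=32, in_ch=16, out_ch=16)
y = layer(torch.randn(2, 16, 64, 64), torch.randn(2, 32))   # (2, 16, 64, 64)
```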
arXiv Detail & Related papers (2021-11-10T20:18:57Z)
- Render In-between: Motion Guided Video Synthesis for Action Interpolation [53.43607872972194]
We propose a motion-guided frame-upsampling framework that is capable of producing realistic human motion and appearance.
A novel motion model is trained to infer the non-linear skeletal motion between frames by leveraging a large-scale motion-capture dataset.
Our pipeline requires only low-frame-rate videos and unpaired human motion data for training; no high-frame-rate videos are needed.
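A small sketch of the in-betweening idea, predicting a learned non-linear residual on top of linear pose interpolation (the residual MLP and joint count are assumptions):

```python
import torch
import torch.nn as nn

class PoseInbetweener(nn.Module):
    """Predicts an intermediate pose between two key poses at time t,
    as a learned residual on top of naive linear interpolation."""
    def __init__(self, n_joints=17, hidden=256):
        super().__init__()
        d = 2 * n_joints
        self.net = nn.Sequential(
            nn.Linear(2 * d + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, d),
        )

    def forward(self, pose_a, pose_b, t):               # poses: (B, 2*n_joints), t: (B, 1)
        linear = (1 - t) * pose_a + t * pose_b           # naive in-between
        residual = self.net(torch.cat([pose_a, pose_b, t], dim=-1))
        return linear + residual                         # learned non-linear correction

model = PoseInbetweener()
mid = model(torch.randn(4, 34), torch.randn(4, 34), torch.full((4, 1), 0.5))
```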
arXiv Detail & Related papers (2021-11-01T15:32:51Z)
- Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes [27.443701512923177]
We propose to bridge human motion synthesis and scene affordance reasoning.
We present a hierarchical generative framework to synthesize long-term 3D human motion conditioned on the 3D scene structure.
Our experiments show significant improvements over previous approaches on generating natural and physically plausible human motion in a scene.
arXiv Detail & Related papers (2020-12-10T09:09:38Z)
- Pose-Guided Human Animation from a Single Image in the Wild [83.86903892201656]
We present a new pose transfer method for synthesizing a human animation from a single image of a person controlled by a sequence of body poses.
Existing pose transfer methods exhibit significant visual artifacts when applied to a novel scene.
We design a compositional neural network that predicts the silhouette, garment labels, and textures.
We are able to synthesize human animations that can preserve the identity and appearance of the person in a temporally coherent way without any fine-tuning of the network on the testing scene.
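A toy version of such a compositional predictor, with a shared encoder and separate heads for silhouette, garment labels, and texture (the channel counts and pose-map input format are assumptions):

```python
import torch
import torch.nn as nn

class CompositionalHeads(nn.Module):
    """Shared pose encoder with separate heads for silhouette, garment labels,
    and texture (RGB), composed downstream into the final frame."""
    def __init__(self, n_labels=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.silhouette = nn.Conv2d(64, 1, 1)            # foreground mask logits
        self.labels = nn.Conv2d(64, n_labels, 1)         # per-pixel garment classes
        self.texture = nn.Conv2d(64, 3, 1)                # RGB texture estimate

    def forward(self, pose_map):                          # (B, 3, H, W) rendered pose map
        f = self.encoder(pose_map)
        return self.silhouette(f), self.labels(f), self.texture(f)

sil, lab, tex = CompositionalHeads()(torch.randn(1, 3, 128, 128))
```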
arXiv Detail & Related papers (2020-12-07T15:38:29Z)
- Human Motion Transfer from Poses in the Wild [61.6016458288803]
We tackle the problem of human motion transfer, where we synthesize novel motion video for a target person that imitates the movement from a reference video.
It is a video-to-video translation task in which the estimated poses are used to bridge two domains.
We introduce a novel pose-to-video translation framework for generating high-quality videos that are temporally coherent even for in-the-wild pose sequences unseen during training.
arXiv Detail & Related papers (2020-04-07T05:59:53Z)
- Do As I Do: Transferring Human Motion and Appearance between Monocular Videos with Spatial and Temporal Constraints [8.784162652042959]
Marker-less human motion estimation and shape modeling from in-the-wild images bring the challenge of transferring motion and appearance between videos to the fore.
We propose a unifying formulation for transferring appearance and human motion from monocular videos.
Our method is able to transfer both human motion and appearance, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2020-01-08T16:39:16Z)