Bidirectionally Deformable Motion Modulation For Video-based Human Pose
Transfer
- URL: http://arxiv.org/abs/2307.07754v2
- Date: Tue, 18 Jul 2023 10:36:30 GMT
- Title: Bidirectionally Deformable Motion Modulation For Video-based Human Pose
Transfer
- Authors: Wing-Yin Yu, Lai-Man Po, Ray C.C. Cheung, Yuzhi Zhao, Yu Xue, Kun Li
- Abstract summary: Video-based human pose transfer is a video-to-video generation task that animates a plain source human image based on a series of target human poses.
We propose a novel Deformable Motion Modulation (DMM) that utilizes geometric kernel offset with adaptive weight modulation to simultaneously perform feature alignment and style transfer.
- Score: 19.5025303182983
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video-based human pose transfer is a video-to-video generation task that
animates a plain source human image based on a series of target human poses.
Considering the difficulties in transferring highly structural patterns on the
garments and discontinuous poses, existing methods often generate
unsatisfactory results such as distorted textures and flickering artifacts. To
address these issues, we propose a novel Deformable Motion Modulation (DMM)
that utilizes geometric kernel offset with adaptive weight modulation to
simultaneously perform feature alignment and style transfer. Different from
normal style modulation used in style transfer, the proposed modulation
mechanism adaptively reconstructs smoothed frames from style codes according to
the object shape through an irregular receptive field of view. To enhance the
spatio-temporal consistency, we leverage bidirectional propagation to extract
the hidden motion information from a warped image sequence generated by noisy
poses. The proposed feature propagation significantly enhances the motion
prediction ability by forward and backward propagation. Both quantitative and
qualitative experimental results demonstrate superiority over
state-of-the-art methods in terms of image fidelity and visual continuity. The source
code is publicly available at github.com/rocketappslab/bdmm.
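As a rough illustration of the two mechanisms described in the abstract (adaptive weight modulation combined with geometric kernel offsets, and forward/backward propagation over a warped frame sequence), the sketch below is written in PyTorch and relies on torchvision.ops.deform_conv2d. Every name in it (DeformableMotionModulation, style_dim, the cell callable) is an illustrative assumption rather than the authors' implementation, which is available at github.com/rocketappslab/bdmm.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d


class DeformableMotionModulation(nn.Module):
    """Style-modulated convolution sampled through learned kernel offsets (sketch)."""

    def __init__(self, in_ch, out_ch, style_dim, k=3):
        super().__init__()
        self.k = k
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
        # Style code -> per-input-channel scales (adaptive weight modulation).
        self.to_scale = nn.Linear(style_dim, in_ch)
        # Features -> per-location kernel offsets and masks (irregular receptive field).
        self.to_offset = nn.Conv2d(in_ch, 2 * k * k, 3, padding=1)
        self.to_mask = nn.Conv2d(in_ch, k * k, 3, padding=1)

    def forward(self, feat, style):
        b = feat.size(0)
        # Modulate the shared kernel with the style code (one kernel per sample).
        scale = self.to_scale(style).view(b, 1, -1, 1, 1) + 1.0
        weight = self.weight.unsqueeze(0) * scale            # (B, out, in, k, k)
        offset = self.to_offset(feat)                        # geometric kernel offsets
        mask = torch.sigmoid(self.to_mask(feat))
        out = []
        for i in range(b):  # per-sample kernels; a grouped-conv trick would be faster
            out.append(deform_conv2d(feat[i:i + 1], offset[i:i + 1], weight[i],
                                     padding=self.k // 2, mask=mask[i:i + 1]))
        return torch.cat(out, dim=0)


def bidirectional_propagate(frames, cell):
    """Run a recurrent cell forward and backward over a warped frame sequence and
    concatenate both hidden states per frame. `cell(frame, hidden_or_None)` is an
    assumed recurrent update, not a specific layer from the paper."""
    fwd, h = [], None
    for f in frames:                        # forward pass
        h = cell(f, h)
        fwd.append(h)
    bwd, h = [], None
    for f in reversed(frames):              # backward pass
        h = cell(f, h)
        bwd.append(h)
    bwd.reverse()
    return [torch.cat([a, b], dim=1) for a, b in zip(fwd, bwd)]
```

Under these assumptions, usage would look like dmm = DeformableMotionModulation(64, 64, style_dim=256) followed by dmm(warped_features, source_style_code); the per-location offsets provide the irregular receptive field the abstract refers to, while the style scale carries the appearance code.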
Related papers
- MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion [94.66090422753126]
MotionFollower is a lightweight score-guided diffusion model for video motion editing.
It delivers superior motion editing performance and uniquely supports large camera movements and actions.
Compared with MotionEditor, the most advanced motion editing model, MotionFollower achieves an approximately 80% reduction in GPU memory.
arXiv Detail & Related papers (2024-05-30T17:57:30Z) - Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing [46.56615725175025]
We introduce Edit-Your-Motion, a video motion editing method that tackles unseen challenges through one-shot fine-tuning.
To effectively decouple the motion and appearance of the source video, we design a temporal two-stage learning strategy.
With Edit-Your-Motion, users can edit the motion of humans in the source video, creating more engaging and diverse content.
arXiv Detail & Related papers (2024-05-07T17:06:59Z) - Motion Guidance: Diffusion-Based Image Editing with Differentiable
Motion Estimators [19.853978560075305]
Motion guidance is a technique that allows a user to specify dense, complex motion fields that indicate where each pixel in an image should move.
We demonstrate that our technique works on complex motions and produces high quality edits of real and generated images.
arXiv Detail & Related papers (2024-01-31T18:59:59Z) - MotionCrafter: One-Shot Motion Customization of Diffusion Models [66.44642854791807]
We introduce MotionCrafter, a one-shot instance-guided motion customization method.
MotionCrafter employs a parallel spatial-temporal architecture that injects the reference motion into the temporal component of the base model.
During training, a frozen base model provides appearance normalization, effectively separating appearance from motion.
arXiv Detail & Related papers (2023-12-08T16:31:04Z) - VMC: Video Motion Customization using Temporal Attention Adaption for
Text-to-Video Diffusion Models [58.93124686141781]
Video Motion Customization (VMC) is a novel one-shot tuning approach crafted to adapt temporal attention layers within video diffusion models.
Our approach introduces a novel motion distillation objective using residual vectors between consecutive frames as a motion reference (see the sketch after this list).
We validate our method against state-of-the-art video generative models across diverse real-world motions and contexts.
arXiv Detail & Related papers (2023-12-01T06:50:11Z) - MotionEditor: Editing Video Motion via Content-Aware Diffusion [96.825431998349]
MotionEditor is a diffusion model for video motion editing.
It incorporates a novel content-aware motion adapter into ControlNet to capture temporal motion correspondence.
arXiv Detail & Related papers (2023-11-30T18:59:33Z) - High-Fidelity and Freely Controllable Talking Head Video Generation [31.08828907637289]
We propose a novel model that produces high-fidelity talking head videos with free control over head pose and expression.
We introduce a novel motion-aware multi-scale feature alignment module to effectively transfer the motion without face distortion.
We evaluate our model on challenging datasets and demonstrate its state-of-the-art performance.
arXiv Detail & Related papers (2023-04-20T09:02:41Z) - Motion and Appearance Adaptation for Cross-Domain Motion Transfer [36.98500700394921]
Motion transfer aims to transfer the motion of a driving video to a source image.
Traditional single domain motion transfer approaches often produce notable artifacts.
We propose a Motion and Appearance Adaptation (MAA) approach for cross-domain motion transfer.
arXiv Detail & Related papers (2022-09-29T03:24:47Z) - Learning Motion-Dependent Appearance for High-Fidelity Rendering of
Dynamic Humans from a Single Camera [49.357174195542854]
A key challenge in learning the dynamics of appearance lies in the prohibitively large number of observations required.
We show that our method can generate a temporally coherent video of dynamic humans for unseen body poses and novel views given a single view video.
arXiv Detail & Related papers (2022-03-24T00:22:03Z) - Controllable Person Image Synthesis with Spatially-Adaptive Warped
Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z) - Dual-MTGAN: Stochastic and Deterministic Motion Transfer for
Image-to-Video Synthesis [38.41763708731513]
We propose Dual Motion Transfer GAN (Dual-MTGAN), which takes image and video data as inputs while learning disentangled content and motion representations.
Our Dual-MTGAN is able to perform deterministic motion transfer and motion generation.
The proposed model is trained in an end-to-end manner, without the need to utilize pre-defined motion features like pose or facial landmarks.
arXiv Detail & Related papers (2021-02-26T06:54:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.