Dual-MTGAN: Stochastic and Deterministic Motion Transfer for
Image-to-Video Synthesis
- URL: http://arxiv.org/abs/2102.13329v1
- Date: Fri, 26 Feb 2021 06:54:48 GMT
- Title: Dual-MTGAN: Stochastic and Deterministic Motion Transfer for
Image-to-Video Synthesis
- Authors: Fu-En Yang, Jing-Cheng Chang, Yuan-Hao Lee, Yu-Chiang Frank Wang
- Abstract summary: We propose Dual Motion Transfer GAN (Dual-MTGAN), which takes image and video data as inputs while learning disentangled content and motion representations.
Our Dual-MTGAN is able to perform deterministic motion transfer and stochastic motion generation.
The proposed model is trained in an end-to-end manner, without the need to utilize pre-defined motion features like pose or facial landmarks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating videos with content and motion variations is a challenging task in
computer vision. While the recent development of GANs allows video generation
from latent representations, it is not easy to produce videos with particular
content and motion patterns of interest. In this paper, we propose Dual Motion
Transfer GAN (Dual-MTGAN), which takes image and video data as inputs while
learning disentangled content and motion representations. Our Dual-MTGAN is
able to perform deterministic motion transfer and stochastic motion generation.
Based on a given image, the former preserves the input content and transfers
motion patterns observed from another video sequence, and the latter directly
produces videos with plausible yet diverse motion patterns based on the input
image. The proposed model is trained in an end-to-end manner, without the need
to utilize pre-defined motion features like pose or facial landmarks. Our
quantitative and qualitative results confirm the effectiveness and
robustness of our model in addressing such conditioned image-to-video tasks.
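To make the distinction between the two modes concrete, the following is a minimal PyTorch sketch of the deterministic and stochastic inference paths suggested by the abstract. Every module name, layer choice, tensor shape, and the Gaussian motion prior here are assumptions made for illustration; they are not taken from the paper's actual architecture.
```python
# Hedged sketch of the two inference modes described in the abstract. The
# module names, layer choices, shapes, and the Gaussian motion prior are
# illustrative assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Maps a single image to a content code (hypothetical architecture)."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, dim, 4, 4), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten())

    def forward(self, image):                        # (B, 3, H, W)
        return self.net(image)                       # (B, dim)

class MotionEncoder(nn.Module):
    """Maps a driving video to one motion code per frame (hypothetical)."""
    def __init__(self, dim=128):
        super().__init__()
        self.frame_enc = ContentEncoder(dim)

    def forward(self, video):                        # (B, T, 3, H, W)
        b, t = video.shape[:2]
        codes = self.frame_enc(video.flatten(0, 1))  # (B*T, dim)
        return codes.view(b, t, -1)                  # (B, T, dim)

class VideoDecoder(nn.Module):
    """Renders frames from the shared content code and per-frame motion codes."""
    def __init__(self, dim=128, size=64):
        super().__init__()
        self.size = size
        self.net = nn.Linear(2 * dim, 3 * size * size)

    def forward(self, content, motion):              # (B, dim), (B, T, dim)
        b, t, _ = motion.shape
        x = torch.cat([content.unsqueeze(1).expand(-1, t, -1), motion], dim=-1)
        frames = self.net(x).view(b, t, 3, self.size, self.size)
        return torch.tanh(frames)

content_enc, motion_enc, decoder = ContentEncoder(), MotionEncoder(), VideoDecoder()
image = torch.randn(1, 3, 64, 64)                    # content image
driving = torch.randn(1, 16, 3, 64, 64)              # driving video, 16 frames

# Deterministic motion transfer: content from the image, motion from the video.
transferred = decoder(content_enc(image), motion_enc(driving))

# Stochastic motion generation: sample per-frame motion codes from a simple
# Gaussian prior (an assumption here) instead of encoding a driving video.
sampled_motion = torch.randn(1, 16, 128)
generated = decoder(content_enc(image), sampled_motion)
print(transferred.shape, generated.shape)            # both (1, 16, 3, 64, 64)
```
The point of the sketch is that both paths share the same content encoder and decoder and differ only in where the per-frame motion codes come from: a driving video for transfer, or a sampled prior for generation.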
Related papers
- Animate Your Motion: Turning Still Images into Dynamic Videos [58.63109848837741]
We introduce Scene and Motion Conditional Diffusion (SMCD), a novel methodology for managing multimodal inputs.
SMCD incorporates a recognized motion conditioning module and investigates various approaches to integrate scene conditions.
Our design significantly enhances video quality, motion precision, and semantic coherence.
arXiv Detail & Related papers (2024-03-15T10:36:24Z) - VMC: Video Motion Customization using Temporal Attention Adaption for
Text-to-Video Diffusion Models [58.93124686141781]
Video Motion Customization (VMC) is a novel one-shot tuning approach crafted to adapt temporal attention layers within video diffusion models.
Our approach introduces a novel motion distillation objective using residual vectors between consecutive frames as a motion reference (a rough sketch of this residual idea appears after this list).
We validate our method against state-of-the-art video generative models across diverse real-world motions and contexts.
arXiv Detail & Related papers (2023-12-01T06:50:11Z) - Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning [50.60891619269651]
Control-A-Video is a controllable T2V diffusion model that can generate videos conditioned on text prompts and reference control maps like edge and depth maps.
We propose novel strategies to incorporate content prior and motion prior into the diffusion-based generation process.
Our framework generates higher-quality, more consistent videos compared to existing state-of-the-art methods in controllable text-to-video generation.
arXiv Detail & Related papers (2023-05-23T09:03:19Z) - LaMD: Latent Motion Diffusion for Video Generation [69.4111397077229]
The latent motion diffusion (LaMD) framework consists of a motion-decomposed video autoencoder and a diffusion-based motion generator.
Results show that LaMD generates high-quality videos with a wide range of motions, from dynamics to highly controllable movements.
arXiv Detail & Related papers (2023-04-23T10:32:32Z) - Learning Variational Motion Prior for Video-based Motion Capture [31.79649766268877]
We present a novel variational motion prior (VMP) learning approach for video-based motion capture.
Our framework can effectively reduce temporal jittering and failure modes in frame-wise pose estimation.
Experiments over both public datasets and in-the-wild videos have demonstrated the efficacy and generalization capability of our framework.
arXiv Detail & Related papers (2022-10-27T02:45:48Z) - Motion and Appearance Adaptation for Cross-Domain Motion Transfer [36.98500700394921]
Motion transfer aims to transfer the motion of a driving video to a source image.
Traditional single domain motion transfer approaches often produce notable artifacts.
We propose a Motion and Appearance Adaptation (MAA) approach for cross-domain motion transfer.
arXiv Detail & Related papers (2022-09-29T03:24:47Z) - Continuous-Time Video Generation via Learning Motion Dynamics with
Neural ODE [26.13198266911874]
We propose a novel video generation approach that learns separate distributions for motion and appearance.
We employ a two-stage approach where the first stage converts a noise vector to a sequence of keypoints at arbitrary frame rates, and the second stage synthesizes videos based on the given keypoint sequence and the appearance noise vector.
arXiv Detail & Related papers (2021-12-21T03:30:38Z) - Make It Move: Controllable Image-to-Video Generation with Text
Descriptions [69.52360725356601]
The Text-Image-to-Video (TI2V) task aims at generating videos from a static image and a text description.
To address these challenges, we propose a Motion Anchor-based video GEnerator (MAGE) with an innovative motion anchor structure.
Experiments conducted on the datasets verify the effectiveness of MAGE and show the appealing potential of the TI2V task.
arXiv Detail & Related papers (2021-12-06T07:00:36Z)
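Returning to the VMC entry above: as a rough illustration of using residual vectors between consecutive frames as a motion reference, here is a small, self-contained sketch of a motion-distillation-style loss. The tensor shapes and the choice of cosine alignment between residuals are assumptions made for this sketch, not details of the VMC implementation.
```python
# Hedged sketch: frame-to-frame residuals as a motion reference.
# Shapes and the cosine alignment are illustrative assumptions only.
import torch
import torch.nn.functional as F

def frame_residuals(frames: torch.Tensor) -> torch.Tensor:
    """Residual (difference) vectors between consecutive frames.

    frames: (B, T, C, H, W) -> residuals: (B, T-1, C, H, W)
    """
    return frames[:, 1:] - frames[:, :-1]

def motion_distillation_loss(pred_frames: torch.Tensor,
                             ref_frames: torch.Tensor) -> torch.Tensor:
    """Align the residual directions of a prediction with a reference clip.

    Uses 1 - cosine similarity between flattened residual vectors, so the loss
    compares how frames change over time rather than the frames themselves.
    """
    pred_res = frame_residuals(pred_frames).flatten(2)    # (B, T-1, C*H*W)
    ref_res = frame_residuals(ref_frames).flatten(2)
    cos = F.cosine_similarity(pred_res, ref_res, dim=-1)  # (B, T-1)
    return (1.0 - cos).mean()

# Toy usage with random tensors standing in for generated/reference frames.
pred = torch.randn(2, 8, 3, 32, 32, requires_grad=True)
ref = torch.randn(2, 8, 3, 32, 32)
loss = motion_distillation_loss(pred, ref)
loss.backward()
print(float(loss))
```
Because it compares frame-to-frame changes rather than the frames themselves, an objective of this kind targets motion rather than per-frame content, which is the intuition behind treating residuals as a motion reference.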