Dual-MTGAN: Stochastic and Deterministic Motion Transfer for
Image-to-Video Synthesis
- URL: http://arxiv.org/abs/2102.13329v1
- Date: Fri, 26 Feb 2021 06:54:48 GMT
- Title: Dual-MTGAN: Stochastic and Deterministic Motion Transfer for
Image-to-Video Synthesis
- Authors: Fu-En Yang, Jing-Cheng Chang, Yuan-Hao Lee, Yu-Chiang Frank Wang
- Abstract summary: We propose Dual Motion Transfer GAN (Dual-MTGAN), which takes image and video data as inputs while learning disentangled content and motion representations.
Our Dual-MTGAN is able to perform deterministic motion transfer and stochastic motion generation.
The proposed model is trained in an end-to-end manner, without the need to utilize pre-defined motion features like pose or facial landmarks.
- Score: 38.41763708731513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating videos with content and motion variations is a challenging task in
computer vision. While the recent development of GANs allows video generation
from latent representations, it is not easy to produce videos with particular
content and motion patterns of interest. In this paper, we propose Dual Motion
Transfer GAN (Dual-MTGAN), which takes image and video data as inputs while
learning disentangled content and motion representations. Our Dual-MTGAN is
able to perform deterministic motion transfer and stochastic motion generation.
Based on a given image, the former preserves the input content and transfers
motion patterns observed from another video sequence, and the latter directly
produces videos with plausible yet diverse motion patterns based on the input
image. The proposed model is trained in an end-to-end manner, without the need
to utilize pre-defined motion features like pose or facial landmarks. Our
quantitative and qualitative results confirm the effectiveness and
robustness of our model in addressing such conditioned image-to-video tasks.
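For a concrete picture of the dual-path design described in the abstract, the following is a minimal PyTorch sketch, assuming a shared content encoder, a recurrent motion encoder for reference videos (deterministic transfer), and a noise-driven recurrent motion prior (stochastic generation). All module names, layer sizes, and the GRU-based motion modeling are illustrative assumptions and do not reproduce the authors' architecture, losses, or adversarial training.

```python
# Minimal sketch of the dual-path content/motion idea (illustrative only).
import torch
import torch.nn as nn


class ContentEncoder(nn.Module):
    """Encodes a single input image into a content code."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, image):                      # (B, 3, H, W) -> (B, dim)
        return self.net(image)


class MotionEncoder(nn.Module):
    """Summarizes a reference video into per-frame motion codes."""
    def __init__(self, dim=128):
        super().__init__()
        self.frame_net = ContentEncoder(dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, video):                      # (B, T, 3, H, W) -> (B, T, dim)
        b, t = video.shape[:2]
        feats = self.frame_net(video.flatten(0, 1)).view(b, t, -1)
        motion, _ = self.rnn(feats)
        return motion


class FrameDecoder(nn.Module):
    """Decodes one frame from a (content, motion) pair."""
    def __init__(self, dim=128):
        super().__init__()
        self.fc = nn.Linear(2 * dim, 64 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, content, motion):            # (B, dim) x 2 -> (B, 3, 32, 32)
        x = self.fc(torch.cat([content, motion], dim=1)).view(-1, 64, 8, 8)
        return self.net(x)


class DualMotionTransferSketch(nn.Module):
    """Shares one content code and decoder across both generation modes."""
    def __init__(self, dim=128):
        super().__init__()
        self.content_enc = ContentEncoder(dim)
        self.motion_enc = MotionEncoder(dim)
        self.motion_prior = nn.GRU(dim, dim, batch_first=True)
        self.decoder = FrameDecoder(dim)

    def transfer(self, image, ref_video):
        """Deterministic path: content from `image`, motion from `ref_video`."""
        content = self.content_enc(image)
        motion = self.motion_enc(ref_video)
        frames = [self.decoder(content, motion[:, t]) for t in range(motion.size(1))]
        return torch.stack(frames, dim=1)          # (B, T, 3, 32, 32)

    def generate(self, image, num_frames=16):
        """Stochastic path: content from `image`, motion unrolled from noise."""
        content = self.content_enc(image)
        z = torch.randn(image.size(0), num_frames, content.size(1), device=image.device)
        motion, _ = self.motion_prior(z)
        frames = [self.decoder(content, motion[:, t]) for t in range(num_frames)]
        return torch.stack(frames, dim=1)          # (B, num_frames, 3, 32, 32)
```

Both modes reuse the same content code and decoder, which mirrors the content/motion disentanglement the abstract emphasizes; the paper enforces this with adversarial and reconstruction objectives that are not reproduced in this sketch.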
Related papers
- MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching [27.28898943916193]
Text-to-video (T2V) diffusion models have promising capabilities in synthesizing realistic videos from input text prompts.
In this work, we tackle the motion customization problem, where a reference video is provided as motion guidance.
We propose MotionMatcher, a motion customization framework that fine-tunes the pre-trained T2V diffusion model at the feature level.
arXiv Detail & Related papers (2025-02-18T19:12:51Z)
- MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation [65.74312406211213]
This paper presents a method that allows users to design cinematic video shots in the context of image-to-video generation.
By connecting insights from classical computer graphics and contemporary video generation techniques, we demonstrate the ability to achieve 3D-aware motion control in I2V synthesis.
arXiv Detail & Related papers (2025-02-06T18:41:04Z)
- VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models [71.9811050853964]
VideoJAM is a novel framework that instills an effective motion prior into video generators.
VideoJAM achieves state-of-the-art performance in motion coherence.
These findings emphasize that appearance and motion can be complementary and, when effectively integrated, enhance both the visual quality and the coherence of video generation.
arXiv Detail & Related papers (2025-02-04T17:07:10Z)
- MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models [59.10171699717122]
MoTrans is a customized motion transfer method that enables video generation with similar motion in new contexts.
Multimodal representations from the recaptioned prompt and video frames promote the modeling of appearance.
The method effectively learns specific motion patterns from single or multiple reference videos.
arXiv Detail & Related papers (2024-12-02T10:07:59Z)
- Animate Your Motion: Turning Still Images into Dynamic Videos [58.63109848837741]
We introduce Scene and Motion Conditional Diffusion (SMCD), a novel methodology for managing multimodal inputs.
SMCD incorporates a recognized motion conditioning module and investigates various approaches to integrate scene conditions.
Our design significantly enhances video quality, motion precision, and semantic coherence.
arXiv Detail & Related papers (2024-03-15T10:36:24Z)
- Continuous-Time Video Generation via Learning Motion Dynamics with Neural ODE [26.13198266911874]
We propose a novel video generation approach that learns separate distributions for motion and appearance.
We employ a two-stage approach where the first stage converts a noise vector to a sequence of keypoints at arbitrary frame rates, and the second stage synthesizes videos based on the given keypoint sequence and the appearance noise vector.
arXiv Detail & Related papers (2021-12-21T03:30:38Z)
- Make It Move: Controllable Image-to-Video Generation with Text Descriptions [69.52360725356601]
The TI2V task aims at generating videos from a static image and a text description.
To address the challenges of this task, we propose a Motion Anchor-based video GEnerator (MAGE) with an innovative motion anchor structure.
Experiments conducted on datasets verify the effectiveness of MAGE and show the appealing potential of the TI2V task.
arXiv Detail & Related papers (2021-12-06T07:00:36Z)