Learning Fine-Grained Motion Embedding for Landscape Animation
- URL: http://arxiv.org/abs/2109.02216v1
- Date: Mon, 6 Sep 2021 02:47:11 GMT
- Title: Learning Fine-Grained Motion Embedding for Landscape Animation
- Authors: Hongwei Xue, Bei Liu, Huan Yang, Jianlong Fu, Houqiang Li, Jiebo Luo
- Abstract summary: We propose a model named FGLA to generate high-quality and realistic videos by learning Fine-Grained motion embedding.
To train and evaluate on diverse time-lapse videos, we build the largest high-resolution Time-lapse video dataset with Diverse scenes.
Our method achieves relative improvements of 19% on LPIPS and 5.6% on FVD compared with state-of-the-art methods on our dataset.
- Score: 140.57889994591494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we focus on landscape animation, which aims to generate
time-lapse videos from a single landscape image. Motion is crucial for
landscape animation as it determines how objects move in videos. Existing
methods are able to generate appealing videos by learning motion from real
time-lapse videos. However, current methods suffer from inaccurate motion
generation, which leads to unrealistic video results. To tackle this problem,
we propose a model named FGLA to generate high-quality and realistic videos by
learning Fine-Grained motion embedding for Landscape Animation. Our model
consists of two parts: (1) a motion encoder that embeds time-lapse motion in a
fine-grained way, and (2) a motion generator that generates realistic motion to
animate input images. To train and evaluate on diverse time-lapse videos, we
build the largest high-resolution Time-lapse video dataset with Diverse scenes,
namely Time-lapse-D, which includes 16,874 video clips with over 10 million
frames. Quantitative and qualitative experimental results demonstrate the
superiority of our method. In particular, our method achieves relative
improvements of 19% on LPIPS and 5.6% on FVD compared with state-of-the-art
methods on our dataset. A user study carried out with 700 human subjects shows
that our approach visually outperforms existing methods by a large margin.
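The abstract describes a two-part design: a motion encoder that embeds time-lapse motion in a fine-grained way, and a motion generator that uses that embedding to animate a single input image. The sketch below is a minimal, hypothetical PyTorch illustration of such a two-part pipeline; the module structure, layer sizes, and flow-based warping step are assumptions made for illustration only and are not the authors' released implementation.

```python
# Hypothetical sketch of a two-part pipeline as described in the abstract:
# a motion encoder embeds motion from a time-lapse frame pair, and a motion
# generator decodes that embedding into a flow field used to animate an image.
# Layer choices and the warping step are assumptions, not FGLA's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MotionEncoder(nn.Module):
    """Embeds the motion between two consecutive frames into a compact vector."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, frame_t, frame_t1):
        x = torch.cat([frame_t, frame_t1], dim=1)   # (B, 6, H, W) frame pair
        return self.fc(self.net(x).flatten(1))      # (B, dim) motion embedding

class MotionGenerator(nn.Module):
    """Decodes a motion embedding into a dense flow field and warps the image."""
    def __init__(self, dim=128, flow_res=16):
        super().__init__()
        self.flow_res = flow_res
        self.fc = nn.Linear(dim, 2 * flow_res * flow_res)   # coarse 2-channel flow

    def forward(self, image, motion_emb):
        b, _, h, w = image.shape
        flow = self.fc(motion_emb).view(b, 2, self.flow_res, self.flow_res)
        flow = F.interpolate(flow, size=(h, w), mode='bilinear', align_corners=False)
        # Displace a base sampling grid (normalized coordinates) by the predicted flow.
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w),
                                indexing='ij')
        base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        grid = base + flow.permute(0, 2, 3, 1)
        return F.grid_sample(image, grid, align_corners=False)  # animated next frame

# Toy usage: embed motion from a real time-lapse frame pair, then use it to
# animate a new single image (all tensors are random placeholders).
encoder, generator = MotionEncoder(), MotionGenerator()
f0, f1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
still_image = torch.rand(1, 3, 64, 64)
next_frame = generator(still_image, encoder(f0, f1))   # (1, 3, 64, 64)
```

In a real training setup the motion embedding would presumably come from ground-truth time-lapse pairs, while at inference a sampled or predicted embedding would drive the animation of the single input image; those details are not stated in the abstract.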
Related papers
- Controllable Longer Image Animation with Diffusion Models [12.565739255499594]
We introduce an open-domain controllable image animation method using motion priors with video diffusion models.
Our method achieves precise control over the direction and speed of motion in the movable region by extracting the motion field information from videos.
We propose an efficient long-duration video generation method based on noise reschedule specifically tailored for image animation tasks.
arXiv Detail & Related papers (2024-05-27T16:08:00Z)
- AniClipart: Clipart Animation with Text-to-Video Priors [28.76809141136148]
We introduce AniClipart, a system that transforms static images into high-quality motion sequences guided by text-to-video priors.
Experimental results show that the proposed AniClipart consistently outperforms existing image-to-video generation models.
arXiv Detail & Related papers (2024-04-18T17:24:28Z)
- VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models [58.93124686141781]
Video Motion Customization (VMC) is a novel one-shot tuning approach crafted to adapt temporal attention layers within video diffusion models.
Our approach introduces a novel motion distillation objective using residual vectors between consecutive frames as a motion reference.
We validate our method against state-of-the-art video generative models across diverse real-world motions and contexts.
arXiv Detail & Related papers (2023-12-01T06:50:11Z)
- MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model [74.84435399451573]
This paper studies the human image animation task, which aims to generate a video of a certain reference identity following a particular motion sequence.
Existing animation works typically employ the frame-warping technique to animate the reference image towards the target motion.
We introduce MagicAnimate, a diffusion-based framework that aims at enhancing temporal consistency, preserving reference image faithfully, and improving animation fidelity.
arXiv Detail & Related papers (2023-11-27T18:32:31Z)
- AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance [13.416296247896042]
We introduce an open domain image animation method that leverages the motion prior of video diffusion model.
Our approach introduces targeted motion area guidance and motion strength guidance, enabling precise control of the movable area and its motion speed.
We validate the effectiveness of our method through rigorous experiments on an open-domain dataset.
arXiv Detail & Related papers (2023-11-21T03:47:54Z)
- Make-It-4D: Synthesizing a Consistent Long-Term Dynamic Scene Video from a Single Image [59.18564636990079]
We study the problem of synthesizing a long-term dynamic video from only a single image.
Existing methods either hallucinate inconsistent perpetual views or struggle with long camera trajectories.
We present Make-It-4D, a novel method that can generate a consistent long-term dynamic video from a single image.
arXiv Detail & Related papers (2023-08-20T12:53:50Z)
- Deep Animation Video Interpolation in the Wild [115.24454577119432]
In this work, we formally define and study the animation video interpolation problem for the first time.
We propose an effective framework, AnimeInterp, with two dedicated modules in a coarse-to-fine manner.
Notably, AnimeInterp shows favorable perceptual quality and robustness for animation scenarios in the wild.
arXiv Detail & Related papers (2021-04-06T13:26:49Z)
- Animating Pictures with Eulerian Motion Fields [90.30598913855216]
We show a fully automatic method for converting a still image into a realistic animated looping video.
We target scenes with continuous fluid motion, such as flowing water and billowing smoke.
We propose a novel video looping technique that flows features both forward and backward in time and then blends the results (a rough sketch of this idea follows this list).
arXiv Detail & Related papers (2020-11-30T18:59:06Z)
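The forward-and-backward blending idea in the last entry above (Animating Pictures with Eulerian Motion Fields) can be sketched roughly as follows. This is a minimal NumPy illustration under strong assumptions: a single static displacement field, a nearest-neighbor pixel warp, and a linear cross-fade; per the summary the paper itself flows features rather than raw pixels, so treat this only as a schematic of the looping idea, not the authors' method.

```python
# Hypothetical sketch of Eulerian-style looping: a static flow field is integrated
# forward from the first frame and backward from the loop end, and the two warped
# results are cross-faded so the clip wraps around smoothly. Illustration only.
import numpy as np

def warp(image, flow):
    """Nearest-neighbor warp of an HxWxC image by a per-pixel displacement field
    (assumed helper; a real system would use bilinear sampling or feature splatting)."""
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    src_y = np.clip(ys - flow[..., 1], 0, h - 1).astype(int)
    src_x = np.clip(xs - flow[..., 0], 0, w - 1).astype(int)
    return image[src_y, src_x]

def loop_animation(image, flow, num_frames=30):
    """Blend a forward-integrated warp with a backward-integrated warp so the
    sequence starts and ends on (approximately) the same frame."""
    frames = []
    for t in range(num_frames):
        fwd = warp(image, flow * t)                    # advected forward by t steps
        bwd = warp(image, -flow * (num_frames - t))    # advected backward from the loop end
        alpha = t / num_frames                         # time-dependent cross-fade weight
        frames.append((1 - alpha) * fwd + alpha * bwd)
    return frames

# Toy usage: a random 64x64 RGB image with a constant rightward drift of 0.5 px/frame.
img = np.random.rand(64, 64, 3)
flow = np.zeros((64, 64, 2))
flow[..., 0] = 0.5
video = loop_animation(img, flow)   # list of 30 frames that wraps back toward the start
```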