DreamLoop: Controllable Cinemagraph Generation from a Single Photograph
- URL: http://arxiv.org/abs/2601.02646v1
- Date: Tue, 06 Jan 2026 01:41:40 GMT
- Title: DreamLoop: Controllable Cinemagraph Generation from a Single Photograph
- Authors: Aniruddha Mahapatra, Long Mai, Cusuh Ham, Feng Liu
- Abstract summary: We present DreamLoop, a controllable video synthesis framework dedicated to generating cinemagraphs from a single photo. Our key idea is to adapt a general video diffusion model by training it on two objectives: temporal bridging and motion conditioning. We demonstrate that our method produces high-quality, complex cinemagraphs that align with user intent, outperforming existing approaches.
- Score: 15.908714882662823
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cinemagraphs, which combine static photographs with selective, looping motion, offer unique artistic appeal. Generating them from a single photograph in a controllable manner is particularly challenging. Existing image-animation techniques are restricted to simple, low-frequency motions and operate only in narrow domains with repetitive textures like water and smoke. In contrast, large-scale video diffusion models are not tailored for cinemagraph constraints and lack the specialized data required to generate seamless, controlled loops. We present DreamLoop, a controllable video synthesis framework dedicated to generating cinemagraphs from a single photo without requiring any cinemagraph training data. Our key idea is to adapt a general video diffusion model by training it on two objectives: temporal bridging and motion conditioning. This strategy enables flexible cinemagraph generation. During inference, by using the input image as both the first- and last-frame condition, we enforce a seamless loop. By conditioning on static tracks, we maintain a static background. Finally, by providing a user-specified motion path for a target object, our method provides intuitive control over the animation's trajectory and timing. To our knowledge, DreamLoop is the first method to enable cinemagraph generation for general scenes with flexible and intuitive controls. We demonstrate that our method produces high-quality, complex cinemagraphs that align with user intent, outperforming existing approaches.
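The abstract names three inference-time conditioning signals: the input photo reused as both the first- and last-frame condition (to close the loop), static point tracks that pin down the background, and a user-specified motion path for the object to animate. The sketch below is a minimal, hypothetical illustration of how such conditioning inputs could be assembled before being handed to a motion-conditioned video diffusion sampler; the array layouts, the frame count, and the out-and-back example path are assumptions, not the authors' actual interface.

```python
# Minimal sketch (assumptions, not the authors' code): assemble the three
# conditioning signals described in the DreamLoop abstract for a single photo.
import numpy as np

H, W, T = 256, 256, 49           # frame size and number of output frames (assumed)
photo = np.random.rand(H, W, 3)  # placeholder for the input photograph

# 1) Temporal bridging: the same photo conditions both the first and last frame,
#    which forces the generated clip to close into a seamless loop.
frame_conditions = {0: photo, T - 1: photo}

# 2) Static background: point tracks that stay at the same (x, y) location in
#    every frame, telling the model those regions must not move.
background_points = np.array([[40, 60], [200, 90], [120, 220]], dtype=float)
static_tracks = np.repeat(background_points[None, :, :], T, axis=0)  # (T, K, 2)

# 3) Motion control: a user-specified path for the target object, here a simple
#    out-and-back trajectory so the object returns to its start by the last frame.
start, end = np.array([128.0, 128.0]), np.array([170.0, 100.0])
phase = np.sin(np.linspace(0.0, np.pi, T))[:, None]                  # 0 -> 1 -> 0
object_track = start[None, :] * (1 - phase) + end[None, :] * phase   # (T, 2)

motion_tracks = np.concatenate([static_tracks,
                                object_track[:, None, :]], axis=1)   # (T, K+1, 2)

print(sorted(frame_conditions.keys()), static_tracks.shape, motion_tracks.shape)
# A motion-conditioned video diffusion sampler (not shown here) would then take
# `photo`, `frame_conditions`, and `motion_tracks` as its conditioning inputs.
```

The point of the sketch is only to make the conditioning scheme concrete: the loop constraint lives in the repeated frame condition, while background stillness and object motion are both expressed as point tracks of the same shape.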
Related papers
- Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising [23.044483059783143]
Diffusion-based video generation can create realistic videos, yet existing image- and text-based conditioning fails to offer precise motion control. We introduce Time-to-Move (TTM), a training-free, plug-and-play framework for motion- and appearance-controlled video generation.
arXiv Detail & Related papers (2025-11-09T22:47:50Z) - DreamJourney: Perpetual View Generation with Video Diffusion Models [91.88716097573206]
Perpetual view generation aims to synthesize a long-term video corresponding to an arbitrary camera trajectory solely from a single input image. Recent methods commonly utilize a pre-trained text-to-image diffusion model to synthesize new content of previously unseen regions along camera movement. We present DreamJourney, a two-stage framework that leverages the world simulation capacity of video diffusion models to trigger a new perpetual scene view generation task.
arXiv Detail & Related papers (2025-06-21T12:51:34Z) - Mobius: Text to Seamless Looping Video Generation via Latent Shift [50.04534295458244]
We present Mobius, a novel method to generate seamlessly looping videos from text descriptions directly without any user annotations. Our method repurposes the pre-trained video latent diffusion model for generating looping videos from text prompts without any training.
arXiv Detail & Related papers (2025-02-27T17:33:51Z) - FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors [64.54220123913154]
We introduce FramePainter as an efficient instantiation of the image-to-video generation problem. It only uses a lightweight sparse control encoder to inject editing signals. It dominantly outperforms previous state-of-the-art methods with far less training data.
arXiv Detail & Related papers (2025-01-14T16:09:16Z) - Fleximo: Towards Flexible Text-to-Human Motion Video Generation [17.579663311741072]
We introduce a novel task aimed at generating human motion videos solely from reference images and natural language. We propose a new framework called Fleximo, which leverages large-scale pre-trained text-to-3D motion models. To assess the performance of Fleximo, we introduce a new benchmark called MotionBench, which includes 400 videos across 20 identities and 20 motions.
arXiv Detail & Related papers (2024-11-29T04:09:13Z) - Controllable Longer Image Animation with Diffusion Models [12.565739255499594]
We introduce an open-domain controllable image animation method using motion priors with video diffusion models.
Our method achieves precise control over the direction and speed of motion in the movable region by extracting the motion field information from videos.
We propose an efficient long-duration video generation method based on noise reschedule specifically tailored for image animation tasks.
arXiv Detail & Related papers (2024-05-27T16:08:00Z) - VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models [58.93124686141781]
Video Motion Customization (VMC) is a novel one-shot tuning approach crafted to adapt temporal attention layers within video diffusion models.
Our approach introduces a novel motion distillation objective using residual vectors between consecutive frames as a motion reference.
We validate our method against state-of-the-art video generative models across diverse real-world motions and contexts.
arXiv Detail & Related papers (2023-12-01T06:50:11Z) - WAIT: Feature Warping for Animation to Illustration video Translation using GANs [11.968412857420192]
We introduce a new problem for video stylizing where an unordered set of images is used. Most video-to-video translation methods are built on an image-to-image translation model. We propose a new generator network with feature warping layers which overcomes the limitations of the previous methods.
arXiv Detail & Related papers (2023-10-07T19:45:24Z) - Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images [58.67263739579952]
We present an automatic method that allows generating human cinemagraphs from single RGB images.
At the core of our method is a novel cyclic neural network that produces looping cinemagraphs for the target loop duration.
We evaluate our method on both synthetic and real data and demonstrate that it is possible to create compelling and plausible cinemagraphs from single RGB images.
arXiv Detail & Related papers (2023-03-15T14:09:35Z) - Animating Pictures with Eulerian Motion Fields [90.30598913855216]
We show a fully automatic method for converting a still image into a realistic animated looping video.
We target scenes with continuous fluid motion, such as flowing water and billowing smoke.
We propose a novel video looping technique that flows features both forward and backward in time and then blends the results.
arXiv Detail & Related papers (2020-11-30T18:59:06Z)
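The last entry's summary describes flowing features both forward and backward in time and blending the results so the clip closes into a loop. The toy sketch below illustrates only that blending idea, under strong simplifying assumptions: a uniform, integer-pixel Eulerian motion field so that warping reduces to `np.roll`, and whole-image shifting instead of the paper's learned per-pixel motion fields and deep feature splatting.

```python
# Toy illustration (not the paper's implementation) of symmetric forward/backward
# warping and blending for a seamless loop, assuming a uniform integer-pixel
# Eulerian motion field so that warping reduces to np.roll.
import numpy as np

H, W, N = 64, 64, 16
image = np.random.rand(H, W, 3)       # stand-in for the still photograph
velocity = (1, 0)                     # constant per-frame shift (dy, dx), assumed

def warp(img, steps):
    """Shift the whole image by `steps` velocity increments (toy warp)."""
    return np.roll(img, shift=(steps * velocity[0], steps * velocity[1]),
                   axis=(0, 1))

frames = []
for t in range(N + 1):
    forward = warp(image, t)          # features advected forward from frame 0
    backward = warp(image, t - N)     # features advected backward from frame N
    alpha = t / N                     # blend weight ramps from forward to backward
    frames.append((1 - alpha) * forward + alpha * backward)

# Frame 0 and frame N both reproduce the input, so the sequence closes into a loop.
print(np.allclose(frames[0], image), np.allclose(frames[N], image))
```

The forward branch dominates early frames and the backward branch dominates late frames, which is what lets the sequence end exactly where it started; the actual method applies the same principle to learned per-pixel motion fields and feature-space splatting rather than whole-image shifts.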