FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models
- URL: http://arxiv.org/abs/2406.16863v1
- Date: Mon, 24 Jun 2024 17:59:56 GMT
- Title: FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models
- Authors: Haonan Qiu, Zhaoxi Chen, Zhouxia Wang, Yingqing He, Menghan Xia, Ziwei Liu
- Abstract summary: We argue that the diffusion model itself allows decent control over the generated content without requiring any training.
We introduce a tuning-free framework that achieves trajectory-controllable video generation by imposing guidance on both noise construction and attention computation.
- Score: 41.006754386910686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have demonstrated remarkable capability in video generation, which has further sparked interest in introducing trajectory control into the generation process. While existing works mainly focus on training-based methods (e.g., conditional adapters), we argue that the diffusion model itself allows decent control over the generated content without requiring any training. In this study, we introduce a tuning-free framework to achieve trajectory-controllable video generation by imposing guidance on both noise construction and attention computation. Specifically, 1) we first show several instructive phenomena and analyze how initial noise influences the motion trajectory of generated content. 2) Subsequently, we propose FreeTraj, a tuning-free approach that enables trajectory control by modifying noise sampling and attention mechanisms. 3) Furthermore, we extend FreeTraj to facilitate longer and larger video generation with controllable trajectories. Equipped with these designs, users have the flexibility to provide trajectories manually or to opt for trajectories automatically generated by the LLM trajectory planner. Extensive experiments validate the efficacy of our approach in enhancing the trajectory controllability of video diffusion models.
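As a rough illustration of the two ingredients the abstract names (noise construction and attention guidance), the sketch below builds initial noise whose low-frequency content follows a per-frame bounding-box trajectory, and additively biases attention logits toward the box region. The shared-patch heuristic, function names, and the `boost` parameter are illustrative assumptions, not the authors' exact formulation.

```python
import torch

def trajectory_noise(shape, boxes, generator=None):
    """Hedged sketch of trajectory-aware noise construction: paste one
    shared noise patch at a moving location so the initial latent's
    low-frequency content follows the requested trajectory.

    shape: (frames, channels, height, width) of the video latent.
    boxes: per-frame (y0, x0, y1, x1) latent-space boxes, all the same size.
    """
    f, c, h, w = shape
    noise = torch.randn(shape, generator=generator)
    y0, x0, y1, x1 = boxes[0]
    patch = torch.randn((c, y1 - y0, x1 - x0), generator=generator)
    for t, (y0, x0, y1, x1) in enumerate(boxes):
        noise[t, :, y0:y1, x0:x1] = patch  # same patch, moving position
    return noise

def boxed_attention_bias(attn_logits, box_mask, boost=2.0):
    """Additively bias pre-softmax attention logits toward tokens inside
    the trajectory box, so the subject is attended to along the path.

    attn_logits: (..., key_tokens) pre-softmax scores.
    box_mask:    (key_tokens,) bool, True inside the trajectory box.
    """
    bias = box_mask.float() * torch.log(torch.tensor(boost))
    return attn_logits + bias
```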
Related papers
- TraDiffusion: Trajectory-Based Training-Free Image Generation [85.39724878576584]
We propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion.
This novel method allows users to effortlessly guide image generation via mouse trajectories.
arXiv Detail & Related papers (2024-08-19T07:01:43Z)
- Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation [10.951376101606357]
We propose a novel zero-shot moving object trajectory control framework, Motion-Zero, to enable a bounding-box-trajectories-controlled text-to-video diffusion model.
Our method can be flexibly applied to various state-of-the-art video diffusion models without any training process.
arXiv Detail & Related papers (2024-01-18T17:22:37Z)
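A bounding-box trajectory, as used by Motion-Zero above, is just a per-frame sequence of boxes. As a minimal assumed helper (not from the Motion-Zero paper), one can linearly interpolate a few user-provided keyframe boxes into a full per-frame trajectory before handing it to the control mechanism:

```python
def interpolate_boxes(keyframes, num_frames):
    """Linearly interpolate sparse keyframe boxes into one box per frame.

    keyframes: dict {frame_index: (x0, y0, x1, y1)}; the first and last
    frames must be specified. Purely illustrative.
    """
    idxs = sorted(keyframes)
    boxes = []
    for t in range(num_frames):
        lo = max(i for i in idxs if i <= t)  # nearest keyframe at or before t
        hi = min(i for i in idxs if i >= t)  # nearest keyframe at or after t
        if lo == hi:
            boxes.append(keyframes[lo])
            continue
        a = (t - lo) / (hi - lo)
        boxes.append(tuple((1 - a) * p + a * q
                           for p, q in zip(keyframes[lo], keyframes[hi])))
    return boxes

# e.g., a box sliding left to right over 16 frames
traj = interpolate_boxes({0: (0, 16, 16, 32), 15: (48, 16, 64, 32)}, 16)
```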
- A Tricycle Model to Accurately Control an Autonomous Racecar with Locked Differential [71.53284767149685]
We present a novel formulation to model the effects of a locked differential on the lateral dynamics of an autonomous open-wheel racecar.
We include a micro-steps discretization approach to accurately linearize the dynamics and produce a prediction suitable for real-time implementation.
arXiv Detail & Related papers (2023-12-22T16:29:55Z)
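The micro-steps idea above is generic: instead of one Euler step per control interval, take many small sub-steps so the discretized prediction tracks the continuous dynamics closely. A minimal sketch on a kinematic single-track model; the wheelbase value, sub-step count, and state layout are assumptions, not the paper's exact formulation:

```python
import math

def microstep_predict(x, y, heading, v, delta, dt, wheelbase=1.6, n_micro=10):
    """Integrate kinematic single-track dynamics with n_micro small Euler
    sub-steps per control interval; the fine discretization keeps the
    prediction (and any linearization around it) accurate at high speeds."""
    h = dt / n_micro
    for _ in range(n_micro):
        x += h * v * math.cos(heading)
        y += h * v * math.sin(heading)
        heading += h * v * math.tan(delta) / wheelbase
    return x, y, heading
```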
- TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models [75.20168902300166]
We propose TrackDiffusion, a novel video generation framework affording fine-grained trajectory-conditioned motion control.
A pivotal component of TrackDiffusion is the instance enhancer, which explicitly ensures inter-frame consistency of multiple objects.
The video sequences generated by TrackDiffusion can be used as training data for visual perception models.
arXiv Detail & Related papers (2023-12-01T15:24:38Z)
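A tracklet condition, as used by TrackDiffusion above, is essentially a per-frame set of (instance id, box) pairs. A hedged sketch of rasterizing such tracklets into per-frame conditioning tensors follows; this simple encoding is an assumption for illustration, not the paper's instance enhancer:

```python
import torch

def encode_tracklets(tracklets, num_frames, num_ids, h, w):
    """Rasterize tracklets into a (frames, num_ids, h, w) conditioning
    tensor: channel i is 1 inside instance i's box on each frame.

    tracklets: dict {instance_id: {frame: (y0, x0, y1, x1)}} in latent
    coordinates. An assumed encoding, not TrackDiffusion's own.
    """
    cond = torch.zeros(num_frames, num_ids, h, w)
    for inst, frames in tracklets.items():
        for t, (y0, x0, y1, x1) in frames.items():
            cond[t, inst, y0:y1, x0:x1] = 1.0
    return cond
```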
- DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory [126.4597063554213]
DragNUWA is an open-domain diffusion-based video generation model.
It provides fine-grained control over video content from semantic, spatial, and temporal perspectives.
Our experiments validate the effectiveness of DragNUWA, demonstrating its superior performance in fine-grained control in video generation.
arXiv Detail & Related papers (2023-08-16T01:43:41Z)
- Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion [83.88829943619656]
We introduce a method for generating realistic pedestrian trajectories and full-body animations that can be controlled to meet user-defined goals.
Our guided diffusion model allows users to constrain trajectories through target waypoints, speed, and specified social groups.
We propose utilizing the value function learned during RL training of the animation controller to guide diffusion to produce trajectories better suited for particular scenarios.
arXiv Detail & Related papers (2023-04-04T15:46:42Z)
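The value-function guidance described above follows the familiar classifier-guidance pattern: at each denoising step, nudge the sample along the gradient of the learned value. A minimal sketch with assumed callables `denoise` (the model's clean-trajectory estimate) and `value` (the RL critic); these are not the paper's API:

```python
import torch

def value_guided_update(x_t, denoise, value, scale=0.5):
    """One guidance step: differentiate the critic's score through the
    denoiser and push x_t toward higher-value pedestrian trajectories."""
    x_t = x_t.detach().requires_grad_(True)
    x0_hat = denoise(x_t)          # predicted clean trajectory
    score = value(x0_hat).sum()    # higher = better suited to the scenario
    (grad,) = torch.autograd.grad(score, x_t)
    return x_t + scale * grad      # guided sample for the next step
```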