PrevizWhiz: Combining Rough 3D Scenes and 2D Video to Guide Generative Video Previsualization
- URL: http://arxiv.org/abs/2602.03838v1
- Date: Tue, 03 Feb 2026 18:56:40 GMT
- Title: PrevizWhiz: Combining Rough 3D Scenes and 2D Video to Guide Generative Video Previsualization
- Authors: Erzhen Hu, Frederik Brudy, David Ledo, George Fitzmaurice, Fraser Anderson,
- Abstract summary: We present PrevizWhiz, a system that leverages rough 3D scenes in combination with generative image and video models to create stylized video previews. The system integrates frame-level image restyling with adjustable resemblance, time-based editing through motion paths or external video inputs, and refinement into high-fidelity video clips.
- Score: 10.681930120546438
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In pre-production, filmmakers and 3D animation experts must rapidly prototype ideas to explore a film's possibilities before full-scale production, yet conventional approaches involve trade-offs in efficiency and expressiveness. Hand-drawn storyboards often lack the spatial precision needed for complex cinematography, while 3D previsualization demands expertise and high-quality rigged assets. To address this gap, we present PrevizWhiz, a system that leverages rough 3D scenes in combination with generative image and video models to create stylized video previews. The workflow integrates frame-level image restyling with adjustable resemblance, time-based editing through motion paths or external video inputs, and refinement into high-fidelity video clips. A study with filmmakers demonstrates that our system lowers technical barriers, accelerates creative iteration, and effectively bridges the communication gap, while also surfacing challenges of continuity, authorship, and ethical considerations in AI-assisted filmmaking.
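To make the three-stage workflow concrete, here is a minimal, hypothetical Python sketch of its structure. All names (`PrevizShot`, `restyle_frame`, and so on) are invented for illustration, not taken from the paper, and the generative calls are stubbed so the skeleton runs as-is:

```python
# Hypothetical sketch of the three-stage workflow described in the abstract.
# The generative calls are stubbed with identity operations.
from dataclasses import dataclass, field

import numpy as np


@dataclass
class PrevizShot:
    frames: list                                      # raw renders of the rough 3D scene
    motion_path: list = field(default_factory=list)   # optional (x, y, z) keyframes


def restyle_frame(frame: np.ndarray, prompt: str, resemblance: float) -> np.ndarray:
    """Stage 1: frame-level restyling. `resemblance` in [0, 1] would control how
    closely the generated image follows the rough render (here a no-op stub)."""
    assert 0.0 <= resemblance <= 1.0
    return frame  # a real system would call an image model conditioned on `frame`


def apply_motion(shot: PrevizShot, path: list) -> PrevizShot:
    """Stage 2: time-based editing via a motion path (or an external video)."""
    shot.motion_path = path
    return shot


def refine(shot: PrevizShot, prompt: str) -> list:
    """Stage 3: refinement into a higher-fidelity clip via a video model (stubbed)."""
    return [restyle_frame(f, prompt, resemblance=0.8) for f in shot.frames]


if __name__ == "__main__":
    rough = PrevizShot(frames=[np.zeros((270, 480, 3), dtype=np.uint8)] * 24)
    rough = apply_motion(rough, path=[(0, 0, 0), (1, 0, 0)])
    clip = refine(rough, prompt="noir alley, rain, handheld dolly-in")
    print(len(clip), clip[0].shape)
```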
Related papers
- CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation [76.72787726497343]
We present CineMaster, a framework for 3D-aware and controllable text-to-video generation. Our goal is to empower users with controllability comparable to that of professional film directors.
arXiv Detail & Related papers (2025-02-12T18:55:36Z)
- Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes [49.26872036160368]
We propose a method for animating parts of high-quality 3D scenes in a Gaussian Splatting representation. We find that, in contrast to prior work, this enables realistic animations of complex, pre-existing 3D scenes.
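As a toy illustration of the idea (not the paper's method), the following numpy sketch animates one subset of a splatted scene's Gaussian centers with a time-varying rigid transform while the rest of the scene stays static:

```python
# Toy illustration: animate part of a Gaussian Splatting scene by applying a
# time-varying rigid transform to a selected subset of Gaussian centers.
import numpy as np


def rotate_z(theta: float) -> np.ndarray:
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


centers = np.random.randn(10_000, 3)     # Gaussian means of a pretrained scene (toy data)
animated = centers[:, 0] > 0.5           # select the part of the scene to animate

frames = []
for t in np.linspace(0.0, 1.0, 24):      # one second at 24 fps
    out = centers.copy()
    out[animated] = out[animated] @ rotate_z(0.5 * t).T + np.array([0.0, 0.2 * t, 0.0])
    frames.append(out)                   # each frame would then be splatted to pixels

print(len(frames), frames[-1].shape)
```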
arXiv Detail & Related papers (2024-11-28T16:01:58Z)
- ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis [63.169364481672915]
We propose ViewCrafter, a novel method for synthesizing high-fidelity novel views of generic scenes from single or sparse images.
Our method takes advantage of the powerful generation capabilities of video diffusion models and the coarse 3D clues offered by a point-based representation to generate high-quality video frames.
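A minimal sketch of how such coarse 3D clues can be produced, assuming a plain pinhole camera and synthetic placeholder data: project the point cloud into a novel view to obtain a rough conditioning image that a diffusion model would then refine.

```python
# Splat a point cloud into a novel camera to get a rough conditioning image.
# Camera parameters and point data are synthetic placeholders.
import numpy as np

H, W, f = 240, 320, 300.0                                   # image size, focal length
K = np.array([[f, 0, W / 2], [0, f, H / 2], [0, 0, 1.0]])   # pinhole intrinsics

pts = np.random.uniform([-1, -1, 2], [1, 1, 4], size=(5_000, 3))   # world points
colors = np.random.rand(5_000, 3)

R, t = np.eye(3), np.array([0.1, 0.0, 0.0])   # novel view: small lateral shift
cam = (R @ pts.T).T + t                       # world -> camera coordinates
uv = (K @ cam.T).T
uv = uv[:, :2] / uv[:, 2:3]                   # perspective divide

img = np.zeros((H, W, 3))
u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
ok = (u >= 0) & (u < W) & (v >= 0) & (v < H)
img[v[ok], u[ok]] = colors[ok]                # nearest-pixel splat, no z-buffer
print(f"coarse render covers {ok.mean():.0%} of the points")
```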
arXiv Detail & Related papers (2024-09-03T16:53:19Z)
- DreamCinema: Cinematic Transfer with Free Camera and 3D Character [51.56284525225804]
We propose DreamCinema, a new framework designed for user-friendly, 3D-space-based film creation with generative models. We decompose 3D film creation into four key elements: 3D character, driven motion, camera movement, and environment. To seamlessly recombine these elements and ensure smooth film creation, we propose structure-guided character animation, shape-aware camera movement optimization, and environment-aware generative refinement.
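The four-element decomposition can be written out as a plain data structure; the schema below is illustrative only and not prescribed by the paper:

```python
# Illustrative schema for the four-element decomposition; field names invented.
from dataclasses import dataclass


@dataclass
class CharacterAsset:
    mesh_path: str            # 3D character (e.g., a generated or scanned mesh)


@dataclass
class FilmShot:
    character: CharacterAsset
    motion: list              # driven motion: per-frame pose parameters
    camera_path: list         # camera movement: per-frame extrinsics
    environment: str          # environment: background description or asset id


shot = FilmShot(
    character=CharacterAsset(mesh_path="hero.glb"),
    motion=[{"frame": 0, "pose": "T-pose"}],
    camera_path=[{"frame": 0, "position": (0, 1.6, 3)}],
    environment="rainy rooftop at night",
)
print(shot.environment)
```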
arXiv Detail & Related papers (2024-08-22T17:59:44Z)
- Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models [40.71940056121056]
We present a novel approach that combines the controllability of dynamic 3D meshes with the expressivity and editability of emerging diffusion models.
We demonstrate our approach on various examples where motion can be obtained by animating rigged assets or changing the camera path.
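One of those two motion sources, sketched concretely under simple assumptions: a camera orbit expressed as per-frame look-at extrinsics, from which a guided pipeline would render per-frame conditioning maps.

```python
# A camera orbit as per-frame look-at poses (camera-to-world, OpenGL convention).
import numpy as np


def look_at(eye: np.ndarray, target: np.ndarray) -> np.ndarray:
    fwd = target - eye
    fwd = fwd / np.linalg.norm(fwd)
    right = np.cross(fwd, np.array([0.0, 1.0, 0.0]))
    right = right / np.linalg.norm(right)
    up = np.cross(right, fwd)
    pose = np.eye(4)
    pose[:3, :3] = np.stack([right, up, -fwd], axis=1)   # rotation columns
    pose[:3, 3] = eye
    return pose


target = np.zeros(3)
poses = [
    look_at(np.array([3 * np.cos(a), 1.5, 3 * np.sin(a)]), target)
    for a in np.linspace(0.0, 2 * np.pi, 48, endpoint=False)
]
print(len(poses), poses[0].shape)   # 48 camera poses for a two-second orbit
```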
arXiv Detail & Related papers (2023-12-03T14:17:11Z)
- Cinematic Behavior Transfer via NeRF-based Differentiable Filming [63.1622492808519]
Existing SLAM methods face limitations in dynamic scenes, and human pose estimation often focuses on 2D projections.
We first introduce a reverse filming behavior estimation technique.
We then introduce a cinematic transfer pipeline that is able to transfer various shot types to a new 2D video or a 3D virtual environment.
arXiv Detail & Related papers (2023-11-29T15:56:58Z)
- 3D-Aware Video Generation [149.5230191060692]
We explore 4D generative adversarial networks (GANs) that learn to generate 3D-aware videos.
By combining neural implicit representations with a time-aware discriminator, we develop a GAN framework that synthesizes 3D video supervised only with monocular videos.
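A hedged sketch of what "time-aware" can mean here (not the paper's exact architecture): a PyTorch discriminator that scores a pair of frames together with the time gap between them, so temporal dynamics become part of the real/fake signal.

```python
# Sketch of a time-aware discriminator: scores a frame pair plus their time gap.
import torch
import torch.nn as nn


class TimeAwareDiscriminator(nn.Module):
    def __init__(self, channels: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2 * channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.time_embed = nn.Linear(1, 128)         # embed the frame-time difference
        self.head = nn.Linear(128, 1)

    def forward(self, frame_a, frame_b, dt):
        h = self.encoder(torch.cat([frame_a, frame_b], dim=1))
        return self.head(h + self.time_embed(dt))   # fuse content and time gap


d = TimeAwareDiscriminator()
a, b = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
score = d(a, b, dt=torch.tensor([[0.1], [0.5]]))
print(score.shape)   # torch.Size([2, 1])
```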
arXiv Detail & Related papers (2022-06-29T17:56:03Z)
- DuctTake: Spatiotemporal Video Compositing [28.154654576394112]
Our method instead composites shots together by finding optimal seams using motion-compensated cuts.
We validate our approach by presenting a wide variety of examples and by comparing quality and creation time to composites made by professional artists.
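As a toy analogue of seam-based compositing, the following dynamic program finds the lowest-cost vertical cut through a per-pixel difference map between two aligned takes; real systems, including DuctTake, operate on motion-compensated spatiotemporal volumes rather than a single frame.

```python
# Toy dynamic-programming seam: cheapest vertical cut through a cost map.
import numpy as np

cost = np.random.rand(120, 160)     # |take A - take B| per pixel (toy data)
H, W = cost.shape

acc = cost.copy()
for y in range(1, H):               # accumulate best path cost per pixel
    left = np.roll(acc[y - 1], 1)
    left[0] = np.inf
    right = np.roll(acc[y - 1], -1)
    right[-1] = np.inf
    acc[y] += np.minimum(np.minimum(left, acc[y - 1]), right)

seam = np.empty(H, dtype=int)       # backtrack from the cheapest endpoint
seam[-1] = int(np.argmin(acc[-1]))
for y in range(H - 2, -1, -1):
    x = seam[y + 1]
    lo, hi = max(x - 1, 0), min(x + 2, W)
    seam[y] = lo + int(np.argmin(acc[y, lo:hi]))

print("seam cost:", float(acc[-1, seam[-1]]))
```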
arXiv Detail & Related papers (2021-01-12T21:58:47Z)