Dynamic Storyboard Generation in an Engine-based Virtual Environment for
Video Production
- URL: http://arxiv.org/abs/2301.12688v3
- Date: Fri, 21 Jul 2023 18:13:10 GMT
- Title: Dynamic Storyboard Generation in an Engine-based Virtual Environment for
Video Production
- Authors: Anyi Rao, Xuekun Jiang, Yuwei Guo, Linning Xu, Lei Yang, Libiao Jin,
Dahua Lin, Bo Dai
- Abstract summary: We present Virtual Dynamic Storyboard (VDS), which lets users storyboard shots in virtual environments.
VDS runs in a "propose-simulate-discriminate" mode: given a formatted story script and a camera script as input, it generates several character-animation and camera-movement proposals.
To select the top-quality dynamic storyboard from the candidates, VDS is equipped with a shot-ranking discriminator based on shot-quality criteria learned from professionally created data.
- Score: 92.14891282042764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Amateurs working on mini-films and short-form videos usually spend a great
deal of time and effort on the complicated, multi-round process of setting and
adjusting scenes, plots, and cameras to deliver satisfying video shots. We
present Virtual Dynamic Storyboard (VDS) to allow users to storyboard shots in
virtual environments, where the filming staff can easily test shot settings
before the actual filming. VDS runs in a "propose-simulate-discriminate" mode:
given a formatted story script and a camera script as input, it generates
several character-animation and camera-movement proposals following predefined
story and cinematic rules, which an off-the-shelf simulation engine then
renders into videos. To select the top-quality dynamic storyboard from the
candidates, we equip VDS with a shot-ranking discriminator based on
shot-quality criteria learned from professionally created data. VDS is
comprehensively validated via extensive experiments and user studies,
demonstrating its efficiency, effectiveness, and great potential for assisting
amateur video production.
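To make the "propose-simulate-discriminate" mode concrete, here is a minimal sketch of the loop as described in the abstract. All names and the stub logic (propose_shots, render_in_engine, score_shot) are illustrative assumptions, not the authors' actual implementation.

```python
"""Minimal sketch of a "propose-simulate-discriminate" loop for dynamic
storyboarding. Every function here is a hypothetical stand-in."""
import random
from dataclasses import dataclass


@dataclass
class ShotProposal:
    character_action: str   # e.g. "walk_to_door"
    camera_move: str        # e.g. "dolly_in", "pan_left"


def propose_shots(story_script: str, camera_script: str, n: int):
    # Stand-in for rule-based proposal generation: in the real system,
    # candidates must satisfy predefined story and cinematic rules.
    actions = ["walk_to_door", "sit_down", "turn_head"]
    moves = ["dolly_in", "pan_left", "static_close_up"]
    return [ShotProposal(random.choice(actions), random.choice(moves))
            for _ in range(n)]


def render_in_engine(proposal: ShotProposal) -> str:
    # Stand-in for rendering with an off-the-shelf simulation engine;
    # returns a handle to the rendered clip.
    return f"clip({proposal.character_action}, {proposal.camera_move})"


def score_shot(clip: str) -> float:
    # Stand-in for the learned shot-ranking discriminator.
    return random.random()


def generate_storyboard(story_script: str, camera_script: str, n: int = 8):
    proposals = propose_shots(story_script, camera_script, n)   # propose
    clips = [render_in_engine(p) for p in proposals]            # simulate
    scores = [score_shot(c) for c in clips]                     # discriminate
    best = max(range(n), key=scores.__getitem__)
    return proposals[best], clips[best], scores[best]


if __name__ == "__main__":
    print(generate_storyboard("INT. ROOM - DAY ...", "medium shot, slow dolly"))
```

In the described system, proposal and rendering are handled by predefined rules plus a simulation engine, and the scorer is a discriminator trained on professionally created footage; the stubs above only show where each piece plugs into the loop.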
Related papers
- Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation [4.147294190096431]
We introduce an automatic synthetic video generation pipeline based on Vision Large Language Model (VLM) agent collaborations.
Given a natural language description of a video, multiple VLM agents auto-direct various processes of the generation pipeline.
Our generated videos outperform commercial video generation models on five metrics of video quality and instruction-following performance.
arXiv Detail & Related papers (2024-08-19T23:31:02Z) - WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models [132.77237314239025]
Video virtual try-on aims to generate realistic sequences that maintain garment identity and adapt to a person's pose and body shape in source videos.
Traditional image-based methods, relying on warping and blending, struggle with complex human movements and occlusions.
We reconceptualize video try-on as a process of generating videos conditioned on garment descriptions and human motion.
Our solution, WildVidFit, employs image-based controlled diffusion models for a streamlined, one-stage approach.
arXiv Detail & Related papers (2024-07-15T11:21:03Z) - Image Conductor: Precision Control for Interactive Video Synthesis [90.2353794019393]
Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements.
Image Conductor is a method for precise control of camera transitions and object movements to generate video assets from a single image.
arXiv Detail & Related papers (2024-06-21T17:55:05Z) - Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis [43.02778060969546]
We propose a controllable monocular dynamic view synthesis pipeline.
Our model does not require depth as input, and does not explicitly model 3D scene geometry.
We believe our framework can potentially unlock powerful applications in rich dynamic scene understanding, perception for robotics, and interactive 3D video viewing experiences for virtual reality.
arXiv Detail & Related papers (2024-05-23T17:59:52Z) - Cinematic Behavior Transfer via NeRF-based Differentiable Filming [63.1622492808519]
Existing SLAM methods face limitations in dynamic scenes, and human pose estimation often focuses on 2D projections.
We first introduce a reverse filming behavior estimation technique.
We then introduce a cinematic transfer pipeline that is able to transfer various shot types to a new 2D video or a 3D virtual environment.
arXiv Detail & Related papers (2023-11-29T15:56:58Z) - MovieFactory: Automatic Movie Creation from Text using Large Generative
Models for Language and Images [92.13079696503803]
We present MovieFactory, a framework for generating cinematic-picture (3072×1280), film-style (multi-scene), and multi-modality (with sound) movies.
Our approach empowers users to create captivating movies with smooth transitions using simple text inputs.
arXiv Detail & Related papers (2023-06-12T17:31:23Z) - Sampling Based Scene-Space Video Processing [89.49726406622842]
We present a novel, sampling-based framework for processing video.
It enables high-quality scene-space video effects in the presence of inevitable errors in depth and camera pose estimation.
We present results for various casually captured, hand-held, moving, compressed, monocular videos.
arXiv Detail & Related papers (2021-02-05T05:55:04Z) - Batteries, camera, action! Learning a semantic control space for
expressive robot cinematography [15.895161373307378]
We develop a data-driven framework that enables editing of complex camera positioning parameters in a semantic space.
First, we generate a database of video clips with a diverse range of shots in a photo-realistic simulator.
We use hundreds of participants in a crowd-sourcing framework to obtain scores for a set of semantic descriptors for each clip.
arXiv Detail & Related papers (2020-11-19T21:56:53Z)