DreamCinema: Cinematic Transfer with Free Camera and 3D Character
- URL: http://arxiv.org/abs/2408.12601v2
- Date: Wed, 02 Jul 2025 06:39:01 GMT
- Title: DreamCinema: Cinematic Transfer with Free Camera and 3D Character
- Authors: Weiliang Chen, Fangfu Liu, Diankun Wu, Haowen Sun, Jiwen Lu, Yueqi Duan
- Abstract summary: We propose a new framework for film creation, DreamCinema, which is designed for user-friendly, 3D space-based film creation with generative models. We decompose 3D film creation into four key elements: 3D character, driven motion, camera movement, and environment. To seamlessly recombine these elements and ensure smooth film creation, we propose structure-guided character animation, shape-aware camera movement optimization, and environment-aware generative refinement.
- Score: 51.56284525225804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We are living in a flourishing era of digital media, where everyone has the potential to become a personal filmmaker. Current research on video generation suggests a promising avenue for controllable film creation in pixel space using diffusion models. However, the reliance on overly verbose prompts and insufficient focus on cinematic elements (e.g., camera movement) results in videos that lack cinematic quality. Furthermore, the absence of 3D modeling often leads to failures in video generation, such as inconsistent character models at different frames, ultimately hindering the immersive experience for viewers. In this paper, we propose a new framework for film creation, DreamCinema, which is designed for user-friendly, 3D space-based film creation with generative models. Specifically, we decompose 3D film creation into four key elements: 3D character, driven motion, camera movement, and environment. We extract the latter three elements from user-specified film shots and generate the 3D character using a generative model based on a provided image. To seamlessly recombine these elements and ensure smooth film creation, we propose structure-guided character animation, shape-aware camera movement optimization, and environment-aware generative refinement. Extensive experiments demonstrate the effectiveness of our method in generating high-quality films with free camera and 3D characters.
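The decomposition described in the abstract lends itself to a simple pipeline picture. The following is a minimal, purely illustrative Python sketch of that four-element framing; every class, function, and file name in it is a hypothetical placeholder and not DreamCinema's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the four elements named in the abstract.
@dataclass
class FilmElements:
    character: object    # 3D character generated from a user-provided image
    motion: object       # driven motion extracted from the reference film shot
    camera: object       # camera movement extracted from the reference film shot
    environment: object  # environment extracted from the reference film shot


def extract_from_shot(reference_shot):
    """Stub for extracting motion, camera movement, and environment from a user-specified shot."""
    return {"motion": "driven motion", "camera": "camera path", "environment": "scene"}


def generate_character(character_image):
    """Stub for image-conditioned 3D character generation."""
    return f"3D character generated from {character_image}"


def create_film(reference_shot, character_image):
    """Illustrative recombination of the four elements into a film."""
    extracted = extract_from_shot(reference_shot)
    elements = FilmElements(
        character=generate_character(character_image),
        motion=extracted["motion"],
        camera=extracted["camera"],
        environment=extracted["environment"],
    )
    # The abstract names three recombination steps; here they are reduced to labels:
    # structure-guided character animation, shape-aware camera movement optimization,
    # and environment-aware generative refinement.
    return elements


if __name__ == "__main__":
    print(create_film("reference_shot.mp4", "character.png"))
```

The point of the sketch is only the separation of concerns: the character comes from a single image, while motion, camera movement, and environment come from the reference shot.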
Related papers
- GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography [98.28272367169465]
We introduce an auto-regressive model inspired by the expertise of Directors of Photography to generate artistic and expressive camera trajectories.
Thanks to the comprehensive and diverse database, we train an auto-regressive, decoder-only Transformer for high-quality, context-aware camera movement generation.
Experiments demonstrate that compared to existing methods, GenDoP offers better controllability, finer-grained trajectory adjustments, and higher motion stability. (A minimal sketch of a decoder-only, auto-regressive trajectory model appears after this entry.)
arXiv Detail & Related papers (2025-04-09T17:56:01Z)
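Since GenDoP is described as a decoder-only, auto-regressive Transformer over camera trajectories, a minimal sketch of that kind of model is given below. It assumes camera poses have already been quantized into discrete tokens; the vocabulary size, model dimensions, and the absence of text or scene conditioning are assumptions made for illustration, not GenDoP's actual design.

```python
import torch
import torch.nn as nn

class CameraTrajectoryLM(nn.Module):
    """Decoder-only, auto-regressive model over quantized camera-pose tokens (illustrative)."""

    def __init__(self, vocab_size=1024, d_model=256, n_heads=8, n_layers=6, max_len=512):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # Encoder blocks used decoder-style: a causal mask restricts attention to the past.
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq) integer ids for quantized camera poses
        seq_len = tokens.size(1)
        pos = torch.arange(seq_len, device=tokens.device)
        x = self.token_emb(tokens) + self.pos_emb(pos)
        causal = nn.Transformer.generate_square_subsequent_mask(seq_len).to(tokens.device)
        h = self.blocks(x, mask=causal)
        return self.head(h)  # next-token logits at every position


@torch.no_grad()
def sample_trajectory(model, start_token=0, steps=64):
    """Greedy auto-regressive decoding of a pose-token sequence (illustrative only)."""
    seq = torch.tensor([[start_token]])
    for _ in range(steps):
        logits = model(seq)[:, -1]                                # logits for the next pose token
        seq = torch.cat([seq, logits.argmax(-1, keepdim=True)], dim=1)
    return seq
```

Greedy decoding is used here only to keep the sketch short; a trajectory generator of this kind would typically sample stochastically and condition on a prompt or scene context.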
- CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation [76.72787726497343]
We present CineMaster, a framework for 3D-aware and controllable text-to-video generation.
Our goal is to empower users with comparable controllability as professional film directors.
arXiv Detail & Related papers (2025-02-12T18:55:36Z)
- Deblur-Avatar: Animatable Avatars from Motion-Blurred Monocular Videos [64.10307207290039]
We introduce a novel framework for modeling high-fidelity, animatable 3D human avatars from motion-blurred monocular video inputs.
By explicitly modeling human motion trajectories during exposure time, we jointly optimize the trajectories and 3D Gaussians to reconstruct sharp, high-quality human avatars. (A toy blur-formation sketch appears after this entry.)
arXiv Detail & Related papers (2025-01-23T02:31:57Z)
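The Deblur-Avatar summary above rests on a standard blur-formation idea: a motion-blurred frame can be approximated by averaging sharp renders at poses sampled across the exposure interval. The sketch below illustrates only that idea; `render`, the linear pose interpolation, and the sample count are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def interpolate_pose(pose_start, pose_end, t):
    """Linear pose interpolation over the exposure window (a simplification)."""
    return (1.0 - t) * pose_start + t * pose_end

def simulate_blurred_frame(render, pose_start, pose_end, n_samples=8):
    """Average sharp renders over the exposure time to form a blurred frame."""
    samples = [
        render(interpolate_pose(pose_start, pose_end, t))
        for t in np.linspace(0.0, 1.0, n_samples)
    ]
    return np.mean(samples, axis=0)
```

A reconstruction method in this spirit would optimize the sharp avatar and the per-frame trajectory so that the simulated blurred frame matches the observed blurry input.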
- Can video generation replace cinematographers? Research on the cinematic language of generated video [31.0131670022777]
We propose a threefold approach to improve cinematic control in text-to-video (T2V) models.
First, we introduce a meticulously annotated cinematic language dataset with twenty subcategories, covering shot framing, shot angles, and camera movements.
Second, we present CameraDiff, which employs LoRA for precise and stable cinematic control, ensuring flexible shot generation.
Third, we propose CameraCLIP, designed to evaluate cinematic alignment and guide multi-shot composition.
arXiv Detail & Related papers (2024-12-16T09:02:24Z)
- Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes [49.26872036160368]
We propose a method for animating parts of high-quality 3D scenes in a Gaussian Splatting representation.
We find that, in contrast to prior work, this enables realistic animations of complex, pre-existing 3D scenes.
arXiv Detail & Related papers (2024-11-28T16:01:58Z)
- Motion Diffusion-Guided 3D Global HMR from a Dynamic Camera [3.6948631725065355]
We present DiffOpt, a novel 3D global HMR method using Diffusion Optimization.
Our key insight is that recent advances in human motion generation, such as the motion diffusion model (MDM), contain a strong prior of coherent human motion.
We validate DiffOpt with video sequences from the Electromagnetic Database of Global 3D Human Pose and Shape in the Wild.
arXiv Detail & Related papers (2024-11-15T21:09:40Z)
- ChatCam: Empowering Camera Control through Conversational AI [67.31920821192323]
ChatCam is a system that navigates camera movements through conversations with users.
To achieve this, we propose CineGPT, a GPT-based autoregressive model for text-conditioned camera trajectory generation.
We also develop an Anchor Determinator to ensure precise camera trajectory placement.
arXiv Detail & Related papers (2024-09-25T20:13:41Z)
- Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models [40.71940056121056]
We present a novel approach that combines the controllability of dynamic 3D meshes with the expressivity and editability of emerging diffusion models.
We demonstrate our approach on various examples where motion can be obtained by animating rigged assets or changing the camera path.
arXiv Detail & Related papers (2023-12-03T14:17:11Z)
- Cinematic Behavior Transfer via NeRF-based Differentiable Filming [63.1622492808519]
Existing SLAM methods face limitations in dynamic scenes and human pose estimation often focuses on 2D projections.
We first introduce a reverse filming behavior estimation technique.
We then introduce a cinematic transfer pipeline that is able to transfer various shot types to a new 2D video or a 3D virtual environment.
arXiv Detail & Related papers (2023-11-29T15:56:58Z)
- Automatic Camera Trajectory Control with Enhanced Immersion for Virtual Cinematography [23.070207691087827]
Real-world cinematographic rules show that directors can create immersion by comprehensively synchronizing the camera with the actor.
Inspired by this strategy, we propose a deep camera control framework that enables actor-camera synchronization in three aspects.
Our proposed method yields immersive cinematic videos of high quality, both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-03-29T22:02:15Z)
- AgileGAN3D: Few-Shot 3D Portrait Stylization by Augmented Transfer Learning [80.67196184480754]
We propose a novel framework, AgileGAN3D, that can produce 3D artistically appealing portraits with detailed geometry.
New stylization can be obtained with just a few (around 20) unpaired 2D exemplars.
Our pipeline demonstrates strong capability in turning user photos into a diverse range of 3D artistic portraits.
arXiv Detail & Related papers (2023-03-24T23:04:20Z)
- 3D Cinemagraphy from a Single Image [73.09720823592092]
We present 3D Cinemagraphy, a new technique that marries 2D image animation with 3D photography.
Given a single still image as input, our goal is to generate a video that contains both visual content animation and camera motion.
arXiv Detail & Related papers (2023-03-10T06:08:23Z)
- Decoupling Human and Camera Motion from Videos in the Wild [67.39432972193929]
We propose a method to reconstruct global human trajectories from videos in the wild.
Our method decouples the camera and human motion, which allows us to place people in the same world coordinate frame.
arXiv Detail & Related papers (2023-02-24T18:59:15Z)
- PV3D: A 3D Generative Model for Portrait Video Generation [94.96025739097922]
We propose PV3D, the first generative framework that can synthesize multi-view consistent portrait videos.
PV3D is able to support many downstream applications such as animating static portraits and view-consistent video motion editing.
arXiv Detail & Related papers (2022-12-13T05:42:44Z)
- 3D-Aware Video Generation [149.5230191060692]
We explore 4D generative adversarial networks (GANs) that learn generation of 3D-aware videos.
By combining neural implicit representations with a time-aware discriminator, we develop a GAN framework that synthesizes 3D videos supervised only with monocular videos.
arXiv Detail & Related papers (2022-06-29T17:56:03Z)
- Action2video: Generating Videos of Human 3D Actions [31.665831044217363]
We aim to tackle the interesting yet challenging problem of generating videos of diverse and natural human motions from prescribed action categories.
The key issue lies in the ability to synthesize multiple distinct motion sequences that are realistic in their visual appearance.
Action2motion generates plausible 3D pose sequences of a prescribed action category, which are then processed and rendered by motion2video to form 2D videos.
arXiv Detail & Related papers (2021-11-12T20:20:37Z)
- Learning Motion Priors for 4D Human Body Capture in 3D Scenes [81.54377747405812]
We propose LEMO: LEarning human MOtion priors for 4D human body capture.
We introduce a novel motion prior, which reduces the jitters exhibited by poses recovered over a sequence.
We also design a contact friction term and a contact-aware motion infiller obtained via per-instance self-supervised training.
With our pipeline, we demonstrate high-quality 4D human body capture, reconstructing smooth motions and physically plausible body-scene interactions.
arXiv Detail & Related papers (2021-08-23T20:47:09Z)