Diffusion Priors for Dynamic View Synthesis from Monocular Videos
- URL: http://arxiv.org/abs/2401.05583v1
- Date: Wed, 10 Jan 2024 23:26:41 GMT
- Title: Diffusion Priors for Dynamic View Synthesis from Monocular Videos
- Authors: Chaoyang Wang, Peiye Zhuang, Aliaksandr Siarohin, Junli Cao, Guocheng
Qian, Hsin-Ying Lee, Sergey Tulyakov
- Abstract summary: Dynamic novel view synthesis aims to capture the temporal evolution of visual content within videos.
We first finetune a pretrained RGB-D diffusion model on the video frames using a customization technique.
We distill the knowledge from the finetuned model to a 4D representation encompassing both dynamic and static Neural Radiance Fields.
- Score: 59.42406064983643
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dynamic novel view synthesis aims to capture the temporal evolution of visual
content within videos. Existing methods struggle to distinguish between
motion and structure, particularly in scenarios where camera poses are either
unknown or constrained compared to object motion. Furthermore, with information
solely from reference images, it is extremely challenging to hallucinate unseen
regions that are occluded or partially observed in the given videos. To address
these issues, we first finetune a pretrained RGB-D diffusion model on the video
frames using a customization technique. Subsequently, we distill the knowledge
from the finetuned model to a 4D representation encompassing both dynamic and
static Neural Radiance Fields (NeRF) components. The proposed pipeline achieves
geometric consistency while preserving the scene identity. We perform thorough
experiments to evaluate the efficacy of the proposed method qualitatively and
quantitatively. Our results demonstrate the robustness and utility of our
approach in challenging cases, further advancing dynamic novel view synthesis.
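The two-stage pipeline above (finetune a diffusion prior, then distill its knowledge into a static plus dynamic representation) can be sketched at a toy scale. The snippet below is a minimal, hedged illustration only: `prior_score`, the flat feature vectors standing in for NeRF branches, and the `target` mode are all hypothetical simplifications, not the paper's actual RGB-D diffusion model or NeRF renderers.

```python
import numpy as np

rng = np.random.default_rng(0)

T, D = 4, 8                        # timesteps, feature dimension
static = rng.normal(size=D)        # toy stand-in for the static NeRF branch
dynamic = rng.normal(size=(T, D))  # toy stand-in for the dynamic NeRF branch

target = np.ones(D)                # the mode the mock prior pulls toward

def render(t):
    """Compose static and dynamic parts into one 'rendered' feature."""
    return static + dynamic[t]

def prior_score(x):
    """Mock score from the 'finetuned' prior: direction toward `target`.
    Real score distillation would noise `x` and query the diffusion
    model's predicted noise instead of this closed-form pull."""
    return target - x

lr = 0.1
for sweep in range(400):
    for t in range(T):                    # distill at every timestep
        grad = -prior_score(render(t))    # treat the score as a gradient
        static -= lr * 0.5 * grad         # split the update across both
        dynamic[t] -= lr * 0.5 * grad     # branches of the representation

# After optimization, renders at every timestep approach the prior's mode.
err = max(np.abs(render(t) - target).max() for t in range(T))
```

Because the static branch receives gradients from every timestep while each dynamic residual sees only its own, shared structure accumulates in `static` and per-frame deviations in `dynamic`, loosely mirroring the static/dynamic decomposition the abstract describes.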
Related papers
- DyBluRF: Dynamic Neural Radiance Fields from Blurry Monocular Video [18.424138608823267]
We propose DyBluRF, a dynamic radiance field approach that synthesizes sharp novel views from a monocular video affected by motion blur.
To account for motion blur in input images, we simultaneously capture the camera trajectory and object Discrete Cosine Transform (DCT) trajectories within the scene.
arXiv Detail & Related papers (2024-03-15T08:48:37Z)
- DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video [25.964642223641057]
We propose a novel dynamic deblurring framework for blurry monocular video, called DyBluRF.
Our DyBluRF is the first that handles the novel view synthesis for blurry monocular video with a novel two-stage framework.
arXiv Detail & Related papers (2023-12-21T02:01:19Z)
- Consistent View Synthesis with Pose-Guided Diffusion Models [51.37925069307313]
Novel view synthesis from a single image has been a cornerstone problem for many Virtual Reality applications.
We propose a pose-guided diffusion model to generate a consistent long-term video of novel views from a single image.
arXiv Detail & Related papers (2023-03-30T17:59:22Z)
- Dance In the Wild: Monocular Human Animation with Neural Dynamic Appearance Synthesis [56.550999933048075]
We propose a video based synthesis method that tackles challenges and demonstrates high quality results for in-the-wild videos.
We introduce a novel motion signature that is used to modulate the generator weights to capture dynamic appearance changes.
We evaluate our method on a set of challenging videos and show that our approach achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-11-10T20:18:57Z)
- Dynamic View Synthesis from Dynamic Monocular Video [69.80425724448344]
We present an algorithm for generating views at arbitrary viewpoints and any input time step given a monocular video of a dynamic scene.
We show extensive quantitative and qualitative results of dynamic view synthesis from casually captured videos.
arXiv Detail & Related papers (2021-05-13T17:59:50Z)
- Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video [76.19076002661157]
Non-Rigid Neural Radiance Fields (NR-NeRF) is a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes.
We show that even a single consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views.
arXiv Detail & Related papers (2020-12-22T18:46:12Z)
- Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes [70.76742458931935]
We introduce a new representation that models the dynamic scene as a time-variant continuous function of appearance, geometry, and 3D scene motion.
Our representation is optimized through a neural network to fit the observed input views.
We show that our representation can be used for complex dynamic scenes, including thin structures, view-dependent effects, and natural degrees of motion.
arXiv Detail & Related papers (2020-11-26T01:23:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.