Deep 3D Mask Volume for View Synthesis of Dynamic Scenes
- URL: http://arxiv.org/abs/2108.13408v1
- Date: Mon, 30 Aug 2021 17:55:28 GMT
- Title: Deep 3D Mask Volume for View Synthesis of Dynamic Scenes
- Authors: Kai-En Lin, Lei Xiao, Feng Liu, Guowei Yang, and Ravi Ramamoorthi
- Abstract summary: We introduce a multi-view video dataset, captured with a custom 10-camera rig at 120 FPS.
The dataset contains 96 high-quality scenes showing various visual effects and human interactions in outdoor scenes.
We develop a new algorithm, Deep 3D Mask Volume, which enables temporally-stable view extrapolation from binocular videos of dynamic scenes, captured by static cameras.
- Score: 49.45028543279115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image view synthesis has seen great success in reconstructing photorealistic
visuals, thanks to deep learning and various novel representations. The next
key step in immersive virtual experiences is view synthesis of dynamic scenes.
However, several challenges exist due to the lack of high-quality training
datasets, and the additional time dimension for videos of dynamic scenes. To
address these issues, we introduce a multi-view video dataset, captured with a
custom 10-camera rig at 120 FPS. The dataset contains 96 high-quality scenes
showing various visual effects and human interactions in outdoor scenes. We
develop a new algorithm, Deep 3D Mask Volume, which enables temporally-stable
view extrapolation from binocular videos of dynamic scenes, captured by static
cameras. Our algorithm addresses the temporal inconsistency of disocclusions by
identifying the error-prone areas with a 3D mask volume, and replaces them with
static background observed throughout the video. Our method enables
manipulation in 3D space as opposed to simple 2D masks, and we demonstrate better
temporal stability than frame-by-frame static view synthesis methods or those
that use 2D masks. The resulting view synthesis videos show minimal flickering
artifacts and allow for larger translational movements.
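For intuition, the compositing step described in the abstract can be sketched as follows, assuming an MPI-style plane-sweep representation: a per-frame RGBA plane volume, a static background volume accumulated over the video, and a 3D mask volume that downweights the error-prone disoccluded regions. The representation choice, the array names and shapes, and the NumPy code are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def composite_with_mask(dynamic_planes, static_planes, mask_volume):
    """Blend per-frame planes with a static background, then render.

    dynamic_planes, static_planes: (D, H, W, 4) RGBA plane sweeps, front to back.
    mask_volume: (D, H, W, 1) in [0, 1]; 1 trusts the per-frame prediction,
    0 falls back to the static background (error-prone disoccluded areas).
    Illustrative sketch only, not the paper's implementation.
    """
    blended = mask_volume * dynamic_planes + (1.0 - mask_volume) * static_planes

    # Standard front-to-back over-compositing of the blended planes.
    height_width = blended.shape[1:3]
    rgb = np.zeros(height_width + (3,))
    transmittance = np.ones(height_width + (1,))
    for plane in blended:  # front to back
        color, alpha = plane[..., :3], plane[..., 3:4]
        rgb += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    return rgb
```

Because the mask lives in the plane-sweep volume rather than in image space, the fallback to the static background respects depth ordering, which is the contrast the abstract draws with simple 2D masks.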
Related papers
- Shape of Motion: 4D Reconstruction from a Single Video [51.04575075620677]
We introduce a method capable of reconstructing generic dynamic scenes, featuring explicit, full-sequence-long 3D motion.
We exploit the low-dimensional structure of 3D motion by representing scene motion with a compact set of SE3 motion bases (a minimal sketch of this idea follows the entry).
Our method achieves state-of-the-art performance for both long-range 3D/2D motion estimation and novel view synthesis on dynamic scenes.
arXiv Detail & Related papers (2024-07-18T17:59:08Z)
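As referenced above, the compact SE3 motion bases can be pictured as a skinning-style blend: each scene point carries a small weight vector over a shared set of per-frame rigid transforms. The function below is only an illustrative sketch of that reading; the linear blending of transformed points and all names and shapes are assumptions, not the paper's exact formulation.

```python
import numpy as np

def blend_se3_bases(point, rotations, translations, weights):
    """Move one canonical 3D point to a target time by blending motion bases.

    point:        (3,)      canonical position of the point.
    rotations:    (B, 3, 3) rotation of each basis at the target time.
    translations: (B, 3)    translation of each basis at the target time.
    weights:      (B,)      per-point coefficients, assumed to sum to 1.
    Illustrative skinning-style blend, not the paper's exact formulation.
    """
    transformed = rotations @ point + translations        # (B, 3)
    return (weights[:, None] * transformed).sum(axis=0)   # weighted blend
```

Tracking a point across the whole sequence then only needs the per-frame basis transforms plus one weight vector per point, which is the low-dimensional structure the summary refers to.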
- Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis [43.02778060969546]
We propose a controllable monocular dynamic view synthesis pipeline.
Our model does not require depth as input, and does not explicitly model 3D scene geometry.
We believe our framework can potentially unlock powerful applications in rich dynamic scene understanding, perception for robotics, and interactive 3D video viewing experiences for virtual reality.
arXiv Detail & Related papers (2024-05-23T17:59:52Z)
- Fast View Synthesis of Casual Videos with Soup-of-Planes [24.35962788109883]
Novel view synthesis from an in-the-wild video is difficult due to challenges like scene dynamics and lack of parallax.
This paper revisits explicit video representations to synthesize high-quality novel views from a monocular video efficiently.
Our method can render high-quality novel views from an in-the-wild video with comparable quality to state-of-the-art methods while being 100x faster in training and enabling real-time rendering.
arXiv Detail & Related papers (2023-12-04T18:55:48Z)
- Decoupling Dynamic Monocular Videos for Dynamic View Synthesis [50.93409250217699]
We tackle the challenge of dynamic view synthesis from dynamic monocular videos in an unsupervised fashion.
Specifically, we decouple the observed motion of dynamic objects into object motion and camera motion, regularized respectively by proposed unsupervised surface consistency and patch-based multi-view constraints.
arXiv Detail & Related papers (2023-04-04T11:25:44Z)
- Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via a neural feature (the unprojection and reprojection steps are sketched after this entry).
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
arXiv Detail & Related papers (2022-04-22T03:17:35Z)
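As a rough picture of the point-cloud pipeline referenced in the RGBD entry above, the sketch below unprojects one RGBD frame into a colored point cloud and naively splats it into a novel camera. The paper additionally inpaints missing depth and renders with a learned module, so the camera conventions, names, and nearest-pixel splatting here are illustrative assumptions only.

```python
import numpy as np

def rgbd_to_points(depth, rgb, K):
    """Unproject an RGBD frame into a colored 3D point cloud.

    depth: (H, W) metric depth, rgb: (H, W, 3) colors, K: (3, 3) intrinsics.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)  # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                                  # camera-space rays
    points = rays * depth.reshape(-1, 1)                             # back-project by depth
    valid = depth.reshape(-1) > 0                                    # drop missing depth
    return points[valid], rgb.reshape(-1, 3)[valid]

def splat_to_view(points, colors, K, R, t, H, W):
    """Naively splat the colored points into a novel view; [R | t] maps
    source-camera coordinates to the novel camera."""
    cam = points @ R.T + t
    front = cam[:, 2] > 1e-6
    cam, colors = cam[front], colors[front]
    pix = cam @ K.T
    u = np.round(pix[:, 0] / pix[:, 2]).astype(int)
    v = np.round(pix[:, 1] / pix[:, 2]).astype(int)
    image = np.zeros((H, W, 3))
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    image[v[inside], u[inside]] = colors[inside]   # last point wins; no z-buffer
    return image
```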
- Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video [76.19076002661157]
Non-Rigid Neural Radiance Fields (NR-NeRF) is a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes.
We show that even a single consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views.
arXiv Detail & Related papers (2020-12-22T18:46:12Z)
- Neural Radiance Flow for 4D View Synthesis and Video Processing [59.9116932930108]
We present a method to learn a 4D spatial-temporal representation of a dynamic scene from a set of RGB images.
Key to our approach is the use of a neural implicit representation that learns to capture the 3D occupancy, radiance, and dynamics of the scene (a toy version of such a field is sketched after this entry).
arXiv Detail & Related papers (2020-12-17T17:54:32Z)
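A toy version of the 4D implicit field described in the entry above: a single MLP maps a space-time query (x, y, z, t) to density, color, and 3D scene flow. The layer sizes, the missing positional encoding, and the shared trunk are simplifications for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RadianceFlowField(nn.Module):
    """Toy 4D field: (x, y, z, t) -> (density, rgb, scene flow). Illustrative only."""

    def __init__(self, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density = nn.Linear(hidden, 1)  # volume density / occupancy
        self.rgb = nn.Linear(hidden, 3)      # emitted color
        self.flow = nn.Linear(hidden, 3)     # 3D motion toward the next time step

    def forward(self, xyzt):                 # xyzt: (N, 4) space-time queries
        h = self.trunk(xyzt)
        return (torch.relu(self.density(h)),
                torch.sigmoid(self.rgb(h)),
                self.flow(h))
```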