Mixed Neural Voxels for Fast Multi-view Video Synthesis
- URL: http://arxiv.org/abs/2212.00190v2
- Date: Fri, 6 Oct 2023 22:11:43 GMT
- Title: Mixed Neural Voxels for Fast Multi-view Video Synthesis
- Authors: Feng Wang, Sinan Tan, Xinghang Li, Zeyue Tian, Yafei Song and Huaping Liu
- Abstract summary: We present a novel method named MixVoxels to better represent dynamic scenes with fast training speed and competitive rendering quality.
The proposed MixVoxels represents the 4D dynamic scenes as a mixture of static and dynamic voxels and processes them with different networks.
With 15 minutes of training on 300-frame videos of dynamic scenes, MixVoxels achieves better PSNR than previous methods.
- Score: 16.25013978657888
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Synthesizing high-fidelity videos from real-world multi-view input is
challenging because of the complexities of real-world environments and highly
dynamic motions. Previous works based on neural radiance fields have
demonstrated high-quality reconstructions of dynamic scenes. However, training
such models on real-world scenes is time-consuming, usually taking days or
weeks. In this paper, we present a novel method named MixVoxels to better
represent dynamic scenes with fast training speed and competitive rendering
quality. The proposed MixVoxels represents the 4D dynamic scenes as a mixture
of static and dynamic voxels and processes them with different networks. In
this way, the computation of the required modalities for static voxels can be
processed by a lightweight model, which essentially reduces the amount of
computation, especially for many daily dynamic scenes dominated by the static
background. To separate the two kinds of voxels, we propose a novel variation
field to estimate the temporal variance of each voxel. For the dynamic voxels,
we design an inner-product time query method to efficiently query multiple time
steps, which is essential to recover the high-dynamic motions. As a result,
with 15 minutes of training on 300-frame videos of dynamic scenes, MixVoxels
achieves better PSNR than previous methods. Code and trained models
are available at https://github.com/fengres/mixvoxels
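
To make the two mechanisms above concrete, here is a minimal PyTorch sketch of (a) a variation field that separates static from dynamic voxels by their temporal variance and (b) the inner-product time query that answers all time steps with a single matrix product. All tensor names, shapes, and the threshold value are illustrative assumptions rather than the authors' implementation; see the repository above for the real code.

```python
import torch

# A minimal sketch of the MixVoxels ideas described above. All names,
# shapes, and the threshold are illustrative assumptions, not the
# authors' implementation (https://github.com/fengres/mixvoxels).

T, D = 300, 64      # video frames, feature channels
N = 32              # tiny voxel grid for the demo (N**3 voxels)

# --- Variation field: per-voxel temporal variance ---------------------
# Assume `per_frame_color` holds a coarse per-voxel color for every
# frame, e.g. unprojected from the input multi-view videos.
per_frame_color = torch.rand(T, N, N, N, 3)            # (T, N, N, N, 3)

# Temporal standard deviation per voxel, averaged over RGB channels.
variation = per_frame_color.std(dim=0).mean(dim=-1)    # (N, N, N)

tau = 0.05                                             # assumed threshold
dynamic_mask = variation > tau                         # high variance -> dynamic
static_mask = ~dynamic_mask

# Static voxels can go through a lightweight, time-independent branch;
# only the dynamic voxels pay the cost of time-dependent queries.

# --- Inner-product time query -----------------------------------------
# Each dynamic voxel stores a feature vector; a shared matrix of
# per-frame time embeddings lets one matrix product answer all T time
# steps at once instead of running T separate queries.
num_dynamic = int(dynamic_mask.sum())
voxel_feat = torch.randn(num_dynamic, D)               # per-voxel features
time_embed = torch.randn(T, D)                         # one embedding per frame

all_time_values = voxel_feat @ time_embed.t()          # (num_dynamic, T)
print(all_time_values.shape)
```

The payoff of the inner product is that querying all 300 frames of a dynamic voxel costs one matrix multiplication rather than 300 separate network evaluations, which mirrors the efficiency argument in the abstract.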
Related papers
- Multi-Level Neural Scene Graphs for Dynamic Urban Environments [64.26401304233843]
We present a novel, decomposable radiance field approach for dynamic urban environments.
We propose a multi-level neural scene graph representation that scales to thousands of images from dozens of sequences with hundreds of fast-moving objects.
arXiv Detail & Related papers (2024-03-29T21:52:01Z)
- DynMF: Neural Motion Factorization for Real-time Dynamic View Synthesis with 3D Gaussian Splatting [35.69069478773709]
We argue that the per-point motions of a dynamic scene can be decomposed into a small set of explicit or learned trajectories (a minimal sketch of this idea appears after the list below).
Our representation is interpretable, efficient, and expressive enough to offer real-time view synthesis of complex dynamic scene motions.
arXiv Detail & Related papers (2023-11-30T18:59:11Z)
- OD-NeRF: Efficient Training of On-the-Fly Dynamic Neural Radiance Fields [63.04781030984006]
Dynamic neural radiance fields (dynamic NeRFs) have demonstrated impressive results in novel view synthesis on 3D dynamic scenes.
We propose OD-NeRF, which efficiently trains and renders dynamic NeRFs on the fly and is thereby capable of streaming the dynamic scene.
Our algorithm achieves an interactive speed of 6 FPS for on-the-fly training and rendering on synthetic dynamic scenes, and a significant speed-up over the state of the art on real-world dynamic scenes.
arXiv Detail & Related papers (2023-05-24T07:36:47Z)
- Learning Neural Volumetric Representations of Dynamic Humans in Minutes [49.10057060558854]
We propose a novel method for learning neural volumetric videos of dynamic humans from sparse-view videos in minutes with competitive visual quality.
Specifically, we define a novel part-based voxelized human representation to better distribute the representational power of the network to different human parts.
Experiments demonstrate that our model can be trained 100 times faster than prior per-scene optimization methods.
arXiv Detail & Related papers (2023-02-23T18:57:01Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- Neural Deformable Voxel Grid for Fast Optimization of Dynamic View Synthesis [63.25919018001152]
We propose a fast deformable radiance field method to handle dynamic scenes.
Our method achieves comparable performance to D-NeRF with only 20 minutes of training.
arXiv Detail & Related papers (2022-06-15T17:49:08Z)
- DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes [27.37830742693236]
We present DeVRF, a novel representation to accelerate learning dynamic radiance fields.
Experiments demonstrate that DeVRF achieves a two-orders-of-magnitude speedup with on-par high-fidelity results.
arXiv Detail & Related papers (2022-05-31T12:13:54Z)
- Fast Dynamic Radiance Fields with Time-Aware Neural Voxels [106.69049089979433]
We propose a radiance field framework, named TiNeuVox, that represents scenes with time-aware voxel features.
Our framework accelerates the optimization of dynamic radiance fields while maintaining high rendering quality.
Our TiNeuVox completes training in only 8 minutes with an 8 MB storage cost, while showing rendering performance similar to or even better than previous dynamic NeRF methods.
arXiv Detail & Related papers (2022-05-30T17:47:31Z)
- Neural 3D Video Synthesis [18.116032726623608]
We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings of a dynamic real-world scene.
Our approach takes the high quality and compactness of static neural radiance fields in a new direction: to a model-free, dynamic setting.
We demonstrate that our method can render high-fidelity wide-angle novel views at over 1K resolution, even for highly complex and dynamic scenes.
arXiv Detail & Related papers (2021-03-03T18:47:40Z)
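
As an aside, the motion factorization summarized in the DynMF entry above admits a compact sketch: each point's trajectory is a learned mixture over a small shared basis of trajectories. The basis size, the sinusoidal basis, and every name below are hypothetical illustration choices, not the paper's actual parameterization.

```python
import torch

# Hypothetical sketch of the trajectory factorization summarized in
# the DynMF entry above: each point's motion is a weighted combination
# of a small shared basis of trajectories. The basis size, sinusoidal
# parameterization, and all names are assumptions for illustration.

P, B, T = 10_000, 16, 300        # points, basis trajectories, time steps

t = torch.linspace(0.0, 1.0, T)                        # (T,)
freqs = torch.arange(1, B + 1, dtype=torch.float32)    # (B,)
basis = torch.sin(2 * torch.pi * freqs[:, None] * t)   # (B, T)
basis = basis[:, :, None].expand(B, T, 3)              # (B, T, 3) offsets

# Per-point mixing weights; in the real method these would be learned.
weights = torch.softmax(torch.randn(P, B), dim=-1)     # (P, B)

# Compose every point's trajectory from the shared basis.
trajectories = torch.einsum('pb,btc->ptc', weights, basis)  # (P, T, 3)

canonical_xyz = torch.randn(P, 3)                      # static positions
positions = canonical_xyz[:, None, :] + trajectories   # (P, T, 3)
print(positions.shape)                                 # [10000, 300, 3]
```

Because the basis is shared across all points, the per-point state is just a small weight vector, which is what makes real-time view synthesis of complex motion plausible.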