Self-supervised novel 2D view synthesis of large-scale scenes with efficient multi-scale voxel carving
- URL: http://arxiv.org/abs/2306.14709v1
- Date: Mon, 26 Jun 2023 13:57:05 GMT
- Title: Self-supervised novel 2D view synthesis of large-scale scenes with efficient multi-scale voxel carving
- Authors: Alexandra Budisteanu, Dragos Costea, Alina Marcu and Marius Leordeanu
- Abstract summary: We introduce an efficient multi-scale voxel carving method to generate novel views of real scenes.
Our final high-resolution output is efficiently self-trained on data automatically generated by the voxel carving module.
We demonstrate the effectiveness of our method on highly complex and large-scale scenes in real environments.
- Score: 77.07589573960436
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of generating novel views of real scenes is increasingly important now that AI models are able to create realistic new worlds. In many practical applications, it is important for novel view synthesis methods to stay grounded in the physical world as much as possible, while also being able to imagine it from previously unseen views. While most current methods are developed and tested in virtual environments with small scenes and no errors in pose and depth information, we push the boundaries to the real-world domain of large-scale scenes in the new context of UAVs. Our algorithmic contributions are twofold. First, we stay anchored in the real 3D world by introducing an efficient multi-scale voxel carving method, which accommodates significant noise in pose and depth as well as illumination variations, while still reconstructing the world from drastically different poses at test time. Second, our final high-resolution output is efficiently self-trained on data automatically generated by the voxel carving module, which gives it the flexibility to adapt efficiently to any scene. We demonstrate the effectiveness of our method on highly complex and large-scale scenes in real environments while outperforming the current state of the art. Our code is publicly available: https://github.com/onorabil/MSVC.
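The abstract describes the carving step only at a high level. As a rough illustration, here is a minimal sketch of depth-based, coarse-to-fine voxel carving of the kind described: a voxel survives only if no view observes free space in front of it, within a noise tolerance that shrinks with the cell size. All names here (carve, multiscale_carve), the pinhole projection with intrinsics K, and the tolerance eps are illustrative assumptions, not the authors' API; see the linked repository for the actual implementation.

```python
import numpy as np

def carve(centers, poses, depths, K, eps):
    """Keep voxels consistent with every depth map; carve those that
    project into observed free space by more than eps (in meters)."""
    keep = np.ones(len(centers), dtype=bool)
    for pose, depth in zip(poses, depths):
        R, t = pose[:3, :3], pose[:3, 3]           # world -> camera
        cam = centers @ R.T + t
        z = cam[:, 2]
        front = z > 1e-6
        zs = np.where(front, z, 1.0)               # avoid divide-by-zero
        u = np.round(K[0, 0] * cam[:, 0] / zs + K[0, 2]).astype(int)
        v = np.round(K[1, 1] * cam[:, 1] / zs + K[1, 2]).astype(int)
        h, w = depth.shape
        vis = front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        obs = np.full(len(centers), np.inf)
        obs[vis] = depth[v[vis], u[vis]]
        keep &= ~(vis & (z < obs - eps))           # voxel lies in free space
    return keep

def multiscale_carve(lo, hi, poses, depths, K, levels=3, res=32):
    """Coarse-to-fine carving: carve a coarse grid, keep the survivors,
    then split each into 8 children at half the size. The tolerance eps
    shrinks with the cell size, so noisy poses and depths are absorbed
    at coarse levels instead of deleting real geometry."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    cell = (hi - lo) / res
    axes = [lo[d] + (np.arange(res) + 0.5) * cell[d] for d in range(3)]
    centers = np.stack(np.meshgrid(*axes, indexing="ij"), -1).reshape(-1, 3)
    for level in range(levels):
        centers = centers[carve(centers, poses, depths, K, eps=cell.max())]
        if level + 1 < levels:
            cell = cell / 2
            offs = np.stack(
                np.meshgrid(*([[-0.5, 0.5]] * 3), indexing="ij"), -1
            ).reshape(-1, 3) * cell
            centers = (centers[:, None, :] + offs[None, :, :]).reshape(-1, 3)
    return centers, cell
```

Rendering the surviving voxels from new camera poses would then yield the automatically generated image data on which, per the abstract, the final high-resolution network is self-trained.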
Related papers
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
The object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- Closing the Visual Sim-to-Real Gap with Object-Composable NeRFs [59.12526668734703]
We introduce Composable Object Volume NeRF (COV-NeRF), an object-composable NeRF model that is the centerpiece of a real-to-sim pipeline.
COV-NeRF extracts objects from real images and composes them into new scenes, generating photorealistic renderings and many types of 2D and 3D supervision.
arXiv Detail & Related papers (2024-03-07T00:00:02Z)
- Real-Time Neural Rasterization for Large Scenes [39.198327570559684]
We propose a new method for realistic real-time novel-view synthesis of large scenes.
Existing neural rendering methods generate realistic results but primarily work for small-scale scenes.
Our work is the first to enable real-time rendering of large real-world scenes.
arXiv Detail & Related papers (2023-11-09T18:59:10Z)
- NSLF-OL: Online Learning of Neural Surface Light Fields alongside Real-time Incremental 3D Reconstruction [0.76146285961466]
The paper proposes a novel Neural Surface Light Fields model that copes with small ranges of view directions while producing good results for unseen directions.
Our model learns the Neural Surface Light Fields (NSLF) online, alongside real-time 3D reconstruction, with a sequential data stream as the shared input.
In addition to online training, our model also provides real-time rendering after completing the data stream for visualization.
arXiv Detail & Related papers (2023-04-29T15:41:15Z)
- NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models [85.20004959780132]
We introduce NeuralField-LDM, a generative model capable of synthesizing complex 3D environments.
We show how NeuralField-LDM can be used for a variety of 3D content creation applications, including conditional scene generation, scene inpainting and scene style manipulation.
arXiv Detail & Related papers (2023-04-19T16:13:21Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes [27.37830742693236]
We present DeVRF, a novel representation that accelerates the learning of dynamic radiance fields.
Experiments demonstrate that DeVRF achieves a two-orders-of-magnitude speedup with on-par, high-fidelity results.
arXiv Detail & Related papers (2022-05-31T12:13:54Z)
- Evaluating Continual Learning Algorithms by Generating 3D Virtual Environments [66.83839051693695]
Continual learning refers to the ability of humans and animals to incrementally learn over time in a given environment.
We propose to leverage recent advances in 3D virtual environments in order to approach the automatic generation of potentially life-long dynamic scenes with photo-realistic appearance.
A novel element of this paper is that scenes are described in a parametric way, thus allowing the user to fully control the visual complexity of the input stream the agent perceives.
arXiv Detail & Related papers (2021-09-16T10:37:21Z)
- Learning Compositional Radiance Fields of Dynamic Human Heads [13.272666180264485]
We propose a novel compositional 3D representation that combines the best of previous methods to produce both higher-resolution and faster results.
Differentiable volume rendering is employed to compute photo-realistic novel views of the human head and upper body.
Our approach achieves state-of-the-art results for synthesizing novel views of dynamic human heads and the upper body.
arXiv Detail & Related papers (2020-12-17T22:19:27Z)