fMPI: Fast Novel View Synthesis in the Wild with Layered Scene
Representations
- URL: http://arxiv.org/abs/2312.16109v1
- Date: Tue, 26 Dec 2023 16:24:08 GMT
- Title: fMPI: Fast Novel View Synthesis in the Wild with Layered Scene
Representations
- Authors: Jonas Kohler, Nicolas Griffiths Sanchez, Luca Cavalli, Catherine
Herold, Albert Pumarola, Alberto Garcia Garcia, Ali Thabet
- Abstract summary: We propose two novel input processing paradigms for novel view synthesis (NVS) methods.
Our approach identifies and mitigates the two most time-consuming aspects of traditional pipelines.
We demonstrate that our proposed paradigms enable the design of an NVS method that achieves state-of-the-art results on public benchmarks.
- Score: 9.75588035624177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we propose two novel input processing paradigms for novel view
synthesis (NVS) methods based on layered scene representations that
significantly improve their runtime without compromising quality. Our approach
identifies and mitigates the two most time-consuming aspects of traditional
pipelines: building and processing the so-called plane sweep volume (PSV),
which is a high-dimensional tensor of planar re-projections of the input camera
views. In particular, we propose processing this tensor in parallel groups for
improved compute efficiency, as well as super-sampling adjacent input planes to
generate a denser, and hence more accurate, scene representation. The proposed
enhancements offer significant flexibility, allowing for a balance between
rendering quality and speed, thus making substantial steps toward real-time
applications. Furthermore, they are very general in the sense that any
PSV-based method can make use of them, including methods that employ multiplane
images, multisphere images, and layered depth images. In a comprehensive set of
experiments, we demonstrate that our proposed paradigms enable the design of an
NVS method that achieves state-of-the-art quality on public benchmarks while
being up to $50\times$ faster than existing state-of-the-art methods. It also
beats the current fastest method by over $3\times$, while achieving
significantly better rendering quality.
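The two bottlenecks and the two proposed remedies can be made concrete with a toy sketch. The NumPy code below builds a small plane sweep volume by homography-warping each input view onto a stack of fronto-parallel depth planes, then (a) super-samples adjacent planes into a denser stack and (b) processes the plane stack in independent groups. All function names, shapes, and the stub "network" are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch: PSV construction plus the two paradigms from the
# abstract (grouped processing, plane super-sampling). Not the paper's code.
import numpy as np

def plane_homography(K_src, K_ref, R, t, depth):
    """Homography from reference pixels to a source view, induced by a
    fronto-parallel plane at the given depth (plane normal n = [0, 0, 1])."""
    n = np.array([0.0, 0.0, 1.0])
    return K_src @ (R - np.outer(t, n) / depth) @ np.linalg.inv(K_ref)

def warp_to_plane(src_img, H, out_hw):
    """Inverse-warp a source image onto the reference plane (nearest neighbor)."""
    h, w = out_hw
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # homogeneous pixels
    p = H @ pix
    u = np.clip((p[0] / p[2]).round().astype(int), 0, src_img.shape[1] - 1)
    v = np.clip((p[1] / p[2]).round().astype(int), 0, src_img.shape[0] - 1)
    return src_img[v, u].reshape(h, w, -1)

def build_psv(views, K, poses, depths, out_hw):
    """PSV: one planar re-projection per (depth plane, input view).
    Result shape: (num_planes, num_views, H, W, C)."""
    psv = []
    for d in depths:
        layer = [warp_to_plane(img, plane_homography(K, K, R, t, d), out_hw)
                 for img, (R, t) in zip(views, poses)]
        psv.append(np.stack(layer))
    return np.stack(psv)

def supersample_planes(psv, factor=2):
    """Super-sample adjacent planes: interpolate between neighboring depth
    planes to obtain a denser stack without extra warping passes."""
    dense = [psv[0]]
    for a, b in zip(psv[:-1], psv[1:]):
        for k in range(1, factor + 1):
            w = k / factor
            dense.append((1 - w) * a + w * b)
    return np.stack(dense)

def process_in_groups(psv, group_size=4):
    """Process the plane stack in independent groups; each group could be
    dispatched to its own GPU stream or batch entry. Here a stub 'network'
    is mapped over the groups sequentially."""
    net = lambda g: g.mean(axis=1)  # placeholder for a learned per-group model
    groups = [psv[i:i + group_size] for i in range(0, len(psv), group_size)]
    return np.concatenate([net(g) for g in groups])

# Toy usage: 2 views, 8 planes, super-sampled to a denser 15-plane stack.
H_img, W_img = 32, 32
views = [np.random.rand(H_img, W_img, 3) for _ in range(2)]
K = np.array([[30.0, 0, W_img / 2], [0, 30.0, H_img / 2], [0, 0, 1]])
poses = [(np.eye(3), np.zeros(3)), (np.eye(3), np.array([0.1, 0.0, 0.0]))]
depths = 1.0 / np.linspace(1 / 10.0, 1 / 1.0, 8)  # inverse-depth spacing
psv = build_psv(views, K, poses, depths, (H_img, W_img))
dense = supersample_planes(psv, factor=2)
out = process_in_groups(dense, group_size=4)
print(psv.shape, dense.shape, out.shape)
```

The inverse-depth plane spacing used above is the common convention for plane sweep stereo; the grouping and interpolation factors are the knobs that would trade quality against speed.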
Related papers
- Lightweight Multiplane Images Network for Real-Time Stereoscopic Conversion from Planar Video [29.199113565852645]
This paper proposes a real-time stereoscopic conversion network based on multi-plane images (MPI).
It employs a lightweight depth-semantic branch to extract depth-aware features implicitly.
It achieves performance comparable to state-of-the-art (SOTA) models and supports real-time inference at 2K resolution.
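For background, the MPI representation these methods share is a stack of fronto-parallel RGBA layers rendered by back-to-front alpha compositing. Below is a minimal generic sketch of the textbook "over" operator, not this paper's network:

```python
# Minimal sketch of rendering a multiplane image (MPI): back-to-front
# alpha compositing of fronto-parallel RGBA layers.
import numpy as np

def composite_mpi(rgba_layers):
    """rgba_layers: (D, H, W, 4), ordered back (far) to front (near).
    Returns the composited (H, W, 3) image via the 'over' operator."""
    out = np.zeros(rgba_layers.shape[1:3] + (3,))
    for layer in rgba_layers:  # far to near
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        out = alpha * rgb + (1.0 - alpha) * out
    return out

# Toy usage: 16 random layers composited into one view.
layers = np.random.rand(16, 24, 24, 4)
print(composite_mpi(layers).shape)  # (24, 24, 3)
```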
arXiv Detail & Related papers (2024-12-04T08:04:14Z)
- DGTR: Distributed Gaussian Turbo-Reconstruction for Sparse-View Vast Scenes [81.56206845824572]
Novel-view synthesis (NVS) approaches play a critical role in vast scene reconstruction.
Few-shot methods often struggle with poor reconstruction quality in vast environments.
This paper presents DGTR, a novel distributed framework for efficient Gaussian reconstruction for sparse-view vast scenes.
arXiv Detail & Related papers (2024-11-19T07:51:44Z)
- Efficient Depth-Guided Urban View Synthesis [52.841803876653465]
We introduce Efficient Depth-Guided Urban View Synthesis (EDUS) for fast feed-forward inference and efficient per-scene fine-tuning.
EDUS exploits noisy predicted geometric priors as guidance to enable generalizable urban view synthesis from sparse input images.
Our results indicate that EDUS achieves state-of-the-art performance in sparse view settings when combined with fast test-time optimization.
arXiv Detail & Related papers (2024-07-17T08:16:25Z)
- OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control [66.03885917320189]
OrientDream is a camera-orientation-conditioned framework for efficient and multi-view-consistent 3D generation from textual prompts.
Our strategy incorporates an explicit camera-orientation-conditioned feature into the pre-training of a 2D text-to-image diffusion module.
Our experiments reveal that our method not only produces high-quality NeRF models with consistent multi-view properties but also optimizes significantly faster than existing methods.
arXiv Detail & Related papers (2024-06-14T13:16:18Z)
- PASTA: Towards Flexible and Efficient HDR Imaging Via Progressively Aggregated Spatio-Temporal Alignment [91.38256332633544]
PASTA is a Progressively Aggregated Spatio-Temporal Alignment framework for HDR deghosting.
Our approach achieves both effectiveness and efficiency by harnessing hierarchical representations during feature disentanglement.
Experimental results showcase PASTA's superiority over current SOTA methods in both visual quality and performance metrics.
arXiv Detail & Related papers (2024-03-15T15:05:29Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique that solves image-to-3D reconstruction from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
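For background on the SDF representation mentioned above: a signed distance function returns the distance to the nearest surface (negative inside), and can be rendered by sphere tracing, stepping along each ray by the SDF value until it crosses the surface. A minimal generic sketch follows, with an analytic unit-sphere SDF standing in for Hyper-VolTran's learned network; all names are illustrative.

```python
# Minimal sketch of sphere tracing a signed distance function (SDF).
# Uses an analytic unit-sphere SDF as a stand-in for a learned one.
import numpy as np

def sdf_sphere(p, radius=1.0):
    return np.linalg.norm(p) - radius  # >0 outside, <0 inside

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-4):
    """March along the ray, stepping by the SDF value (always a safe step)."""
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:
            return t  # hit: distance along the ray to the surface
        t += d
    return None  # miss

hit = sphere_trace(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]), sdf_sphere)
print(hit)  # ~2.0 for a unit sphere centered at the origin
```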
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- Fine Dense Alignment of Image Bursts through Camera Pose and Depth Estimation [45.11207941777178]
This paper introduces a novel approach to the fine alignment of images in a burst captured by a handheld camera.
The proposed algorithm establishes dense correspondences by optimizing both the camera motion and surface depth and orientation at every pixel.
arXiv Detail & Related papers (2023-12-08T17:22:04Z)
- Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day [63.96075838322437]
We propose a framework to learn a single-image novel-view synthesizer.
Our framework is able to reduce the total training time from 10 days to less than 1 day.
arXiv Detail & Related papers (2023-10-04T17:57:07Z)
- Adaptive Multi-NeRF: Exploit Efficient Parallelism in Adaptive Multiple Scale Neural Radiance Field Rendering [3.8200916793910973]
Recent advances in Neural Radiance Fields (NeRF) have demonstrated significant potential for representing 3D scene appearances as implicit neural networks.
However, the lengthy training and rendering process hinders the widespread adoption of this promising technique for real-time rendering applications.
We present an effective adaptive multi-NeRF method designed to accelerate the neural rendering process for large scenes.
arXiv Detail & Related papers (2023-10-03T08:34:49Z)
- Multi-Plane Neural Radiance Fields for Novel View Synthesis [5.478764356647437]
Novel view synthesis is a long-standing problem that revolves around rendering frames of scenes from novel camera viewpoints.
In this work, we examine the performance, generalization, and efficiency of single-view multi-plane neural radiance fields.
We propose a new multiplane NeRF architecture that accepts multiple views to improve the synthesis results and expand the viewing range.
arXiv Detail & Related papers (2023-03-03T06:32:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.