360FusionNeRF: Panoramic Neural Radiance Fields with Joint Guidance
- URL: http://arxiv.org/abs/2209.14265v1
- Date: Wed, 28 Sep 2022 17:30:53 GMT
- Title: 360FusionNeRF: Panoramic Neural Radiance Fields with Joint Guidance
- Authors: Shreyas Kulkarni, Peng Yin, and Sebastian Scherer
- Abstract summary: We present a method to synthesize novel views from a single $360^\circ$ panorama image based on the neural radiance field (NeRF).
We propose 360FusionNeRF, a semi-supervised learning framework where we introduce geometric supervision and semantic consistency to guide the training process.
- Score: 6.528382036284374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a method to synthesize novel views from a single $360^\circ$
panorama image based on the neural radiance field (NeRF). Prior studies in a
similar setting rely on the neighborhood interpolation capability of
multi-layer perceptrons to complete missing regions caused by occlusion, which
leads to artifacts in their predictions. We propose 360FusionNeRF, a
semi-supervised learning framework where we introduce geometric supervision and
semantic consistency to guide the progressive training process. Firstly, the
input image is re-projected to $360^\circ$ images, and auxiliary depth maps are
extracted at other camera positions. The depth supervision, in addition to the
NeRF color guidance, improves the geometry of the synthesized views.
Additionally, we introduce a semantic consistency loss that encourages
realistic renderings of novel views. We extract these semantic features using a
pre-trained visual encoder such as CLIP, a Vision Transformer trained on
hundreds of millions of diverse 2D photographs mined from the web with natural
language supervision. Experiments indicate that our proposed method can produce
plausible completions of unobserved regions while preserving the features of
the scene. When trained across various scenes, 360FusionNeRF consistently
achieves state-of-the-art performance when transferring to the synthetic
Structured3D dataset (PSNR ~5%, SSIM ~3%, LPIPS ~13%), the real-world
Matterport3D dataset (PSNR ~3%, SSIM ~3%, LPIPS ~9%), and the Replica360
dataset (PSNR ~8%, SSIM ~2%, LPIPS ~18%).
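To make the joint guidance concrete, below is a minimal PyTorch sketch of a combined objective in the spirit of the abstract: a photometric NeRF loss, depth supervision from the auxiliary depth maps, and a CLIP-space semantic consistency term. The function name, the loss weights, and the assumption that CLIP features are precomputed are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def joint_guidance_loss(rgb_pred, rgb_gt, depth_pred, depth_gt,
                        feat_render, feat_ref, w_depth=0.1, w_sem=0.1):
    # Photometric loss: standard NeRF MSE between rendered and target colors.
    loss_color = F.mse_loss(rgb_pred, rgb_gt)
    # Geometric supervision: L1 against auxiliary depth maps extracted
    # at re-projected camera positions.
    loss_depth = F.l1_loss(depth_pred, depth_gt)
    # Semantic consistency: cosine distance between frozen-encoder (e.g. CLIP)
    # embeddings of a rendered novel view and a reference (input) view.
    f_r = F.normalize(feat_render, dim=-1)
    f_s = F.normalize(feat_ref, dim=-1)
    loss_sem = 1.0 - (f_r * f_s).sum(dim=-1).mean()
    return loss_color + w_depth * loss_depth + w_sem * loss_sem
```

In practice, `feat_render` and `feat_ref` would come from a frozen image encoder such as CLIP's ViT applied to rendered image patches rather than individual rays, since semantic features are only meaningful over image regions. The re-projection step can likewise be sketched: each panorama pixel is lifted to a 3D point along its spherical ray using the depth map, shifted to a new camera position, and splatted back onto an equirectangular grid. This is a bare-bones illustration (nearest-pixel splatting, no occlusion handling), not the paper's pipeline:

```python
import numpy as np

def reproject_panorama(rgb, depth, t):
    # Spherical ray directions for every pixel of an equirectangular image.
    H, W, _ = rgb.shape
    v, u = np.mgrid[0:H, 0:W]
    theta = (u + 0.5) / W * 2 * np.pi - np.pi   # longitude in [-pi, pi)
    phi = (v + 0.5) / H * np.pi - np.pi / 2     # latitude in [-pi/2, pi/2)
    dirs = np.stack([np.cos(phi) * np.sin(theta),
                     np.sin(phi),
                     np.cos(phi) * np.cos(theta)], axis=-1)
    # Lift to 3D with the depth map, then express points relative to a
    # camera translated by the 3-vector t.
    pts = dirs * depth[..., None] - t
    r = np.maximum(np.linalg.norm(pts, axis=-1), 1e-8)
    theta2 = np.arctan2(pts[..., 0], pts[..., 2])
    phi2 = np.arcsin(np.clip(pts[..., 1] / r, -1.0, 1.0))
    # Map back to pixel coordinates and splat colors (nearest neighbor).
    u2 = ((theta2 + np.pi) / (2 * np.pi) * W).astype(int) % W
    v2 = np.clip(((phi2 + np.pi / 2) / np.pi * H).astype(int), 0, H - 1)
    out = np.zeros_like(rgb)
    out[v2, u2] = rgb[v, u]
    return out
```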
Related papers
- CUBE360: Learning Cubic Field Representation for Monocular 360 Depth Estimation for Virtual Reality [32.023283261191104]
CUBE360 learns a cubic field composed of multiple MPIs from a single panoramic image for depth estimation at any view direction.
Experiments on both synthetic and real-world datasets demonstrate the superior performance of CUBE360 compared to prior SSL methods.
arXiv Detail & Related papers (2024-10-08T06:52:46Z)
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
arXiv Detail & Related papers (2024-06-17T21:15:13Z)
- ReconFusion: 3D Reconstruction with Diffusion Priors [104.73604630145847]
We present ReconFusion to reconstruct real-world scenes using only a few photos.
Our approach leverages a diffusion prior for novel view synthesis, trained on synthetic and multiview datasets.
Our method synthesizes realistic geometry and texture in underconstrained regions while preserving the appearance of observed regions.
arXiv Detail & Related papers (2023-12-05T18:59:58Z)
- rpcPRF: Generalizable MPI Neural Radiance Field for Satellite Camera [0.76146285961466]
This paper presents rpcPRF, a Multiplane Image (MPI)-based planar neural radiance field for the Rational Polynomial Camera (RPC) model.
We propose to use reprojection supervision to induce the predicted MPI to learn the correct geometry between the 3D coordinates and the images.
We remove the stringent requirement of dense depth supervision from deep multiview-stereo-based methods by introducing rendering techniques of radiance fields.
arXiv Detail & Related papers (2023-10-11T04:05:11Z)
- ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis [99.06490355990354]
We propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels.
Our approach can considerably enhance model performance in sparse view conditions, achieving improvements of up to 94% in PSNR and 31% in LPIPS, along with gains in SSIM.
arXiv Detail & Related papers (2023-05-18T15:18:01Z)
- Learning Neural Duplex Radiance Fields for Real-Time View Synthesis [33.54507228895688]
We propose a novel approach to distill and bake NeRFs into highly efficient mesh-based neural representations.
We demonstrate the effectiveness and superiority of our approach via extensive experiments on a range of standard datasets.
arXiv Detail & Related papers (2023-04-20T17:59:52Z)
- CompNVS: Novel View Synthesis with Scene Completion [83.19663671794596]
We propose a generative pipeline performing on a sparse grid-based neural scene representation to complete unobserved scene parts.
We process encoded image features in 3D space with a geometry completion network and a subsequent texture inpainting network to extrapolate the missing area.
Photorealistic image sequences can finally be obtained via consistency-relevant differentiable rendering.
arXiv Detail & Related papers (2022-07-23T09:03:13Z)
- Enhancement of Novel View Synthesis Using Omnidirectional Image Completion [61.78187618370681]
We present a method for synthesizing novel views from a single 360-degree RGB-D image based on the neural radiance field (NeRF).
Experiments demonstrated that the proposed method can synthesize plausible novel views while preserving the features of the scene for both artificial and real-world data.
arXiv Detail & Related papers (2022-03-18T13:49:25Z)
- Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis [86.38901313994734]
We present DietNeRF, a 3D neural scene representation estimated from a few images.
NeRF learns a continuous volumetric representation of a scene through multi-view consistency.
We introduce an auxiliary semantic consistency loss that encourages realistic renderings at novel poses.
arXiv Detail & Related papers (2021-04-01T17:59:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.