Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
- URL: http://arxiv.org/abs/2104.00677v1
- Date: Thu, 1 Apr 2021 17:59:31 GMT
- Title: Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
- Authors: Ajay Jain and Matthew Tancik and Pieter Abbeel
- Abstract summary: We present DietNeRF, a 3D neural scene representation estimated from a few images.
NeRF learns a continuous volumetric representation of a scene through multi-view consistency.
We introduce an auxiliary semantic consistency loss that encourages realistic renderings at novel poses.
- Score: 86.38901313994734
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present DietNeRF, a 3D neural scene representation estimated from a few
images. Neural Radiance Fields (NeRF) learn a continuous volumetric
representation of a scene through multi-view consistency, and can be rendered
from novel viewpoints by ray casting. While NeRF has an impressive ability to
reconstruct geometry and fine details given many images, up to 100 for
challenging 360° scenes, it often finds a degenerate solution to its image
reconstruction objective when only a few input views are available. To improve
few-shot quality, we propose DietNeRF. We introduce an auxiliary semantic
consistency loss that encourages realistic renderings at novel poses. DietNeRF
is trained on individual scenes to (1) correctly render given input views from
the same pose, and (2) match high-level semantic attributes across different,
random poses. Our semantic loss allows us to supervise DietNeRF from arbitrary
poses. We extract these semantics using a pre-trained visual encoder such as
CLIP, a Vision Transformer trained on hundreds of millions of diverse
single-view, 2D photographs mined from the web with natural language
supervision. In experiments, DietNeRF improves the perceptual quality of
few-shot view synthesis when learned from scratch, can render novel views with
as few as one observed image when pre-trained on a multi-view dataset, and
produces plausible completions of completely unobserved regions.
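A minimal PyTorch sketch of this semantic consistency idea is given below. It is an illustration only, not the authors' code: the `nerf.render_rays`/`nerf.render_image` helpers, the batch keys, and the weight `lambda_sc` are placeholder assumptions, and images are assumed to already be resized and normalized to CLIP's expected 224x224 input.

```python
import torch
import torch.nn.functional as F

def semantic_consistency_loss(clip_model, rendered, observed):
    """Cosine distance between CLIP embeddings of a rendering at an arbitrary
    pose and an observed training image (both [1, 3, 224, 224], preprocessed
    for CLIP)."""
    z_r = F.normalize(clip_model.encode_image(rendered), dim=-1)
    z_o = F.normalize(clip_model.encode_image(observed), dim=-1)
    return 1.0 - (z_r * z_o).sum(dim=-1).mean()

def training_step(nerf, clip_model, batch, random_pose, lambda_sc=0.1):
    # (1) Usual NeRF photometric loss on rays from the observed input views.
    rgb_pred = nerf.render_rays(batch["rays"])               # placeholder API
    mse = F.mse_loss(rgb_pred, batch["rgb"])
    # (2) Semantic loss: render a full (low-resolution) image at a random pose
    #     and match its CLIP embedding to that of an observed image.
    rendered = nerf.render_image(random_pose)                # [1, 3, 224, 224]
    observed = batch["full_image"]                           # [1, 3, 224, 224]
    return mse + lambda_sc * semantic_consistency_loss(clip_model, rendered, observed)
```

Because the CLIP embedding discards pose, the rendering at a random pose is only pushed toward the scene's high-level appearance rather than toward any particular input view, which is what lets arbitrary poses be supervised.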
Related papers
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
arXiv Detail & Related papers (2024-06-17T21:15:13Z)
- SPARF: Neural Radiance Fields from Sparse and Noisy Poses [58.528358231885846]
We introduce Sparse Pose Adjusting Radiance Field (SPARF) to address the challenge of novel-view synthesis from only a few input images with noisy camera poses.
Our approach exploits multi-view geometry constraints in order to jointly learn the NeRF and refine the camera poses.
arXiv Detail & Related papers (2022-11-21T18:57:47Z)
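For SPARF above, a rough sketch of the joint-optimization idea (not SPARF's actual multi-view geometry objectives) is to treat per-view pose corrections as learnable parameters optimized together with the NeRF weights; the axis-angle parameterization and helper names below are assumptions for illustration.

```python
import torch

def axis_angle_to_matrix(v, eps=1e-8):
    """Rodrigues' formula: axis-angle vector (3,) -> rotation matrix (3, 3)."""
    theta = torch.sqrt((v * v).sum() + eps)   # eps keeps the gradient finite at v = 0
    k = v / theta
    K = torch.zeros(3, 3, device=v.device, dtype=v.dtype)
    K[0, 1], K[0, 2] = -k[2], k[1]
    K[1, 0], K[1, 2] = k[2], -k[0]
    K[2, 0], K[2, 1] = -k[1], k[0]
    I = torch.eye(3, device=v.device, dtype=v.dtype)
    return I + torch.sin(theta) * K + (1.0 - torch.cos(theta)) * (K @ K)

class PoseRefiner(torch.nn.Module):
    """Learnable per-view pose corrections, optimized jointly with the NeRF."""
    def __init__(self, num_views):
        super().__init__()
        # Zero-initialized corrections: training starts from the noisy input poses.
        self.rot = torch.nn.Parameter(torch.zeros(num_views, 3))
        self.trans = torch.nn.Parameter(torch.zeros(num_views, 3))

    def forward(self, view_idx, init_c2w):
        """Apply the learned correction to a 4x4 camera-to-world matrix."""
        out = init_c2w.clone()
        out[:3, :3] = axis_angle_to_matrix(self.rot[view_idx]) @ init_c2w[:3, :3]
        out[:3, 3] = init_c2w[:3, 3] + self.trans[view_idx]
        return out

# Joint optimization (sketch): both the NeRF and the pose corrections receive
# gradients from the rendering loss, e.g.
# optimizer = torch.optim.Adam(list(nerf.parameters()) + list(pose_refiner.parameters()))
```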
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
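For the Vision Transformer single-image approach above, the conditioning can be sketched as an MLP that takes a 3D sample point and view direction together with a global (image-level) feature and a local (pixel-aligned) feature, and predicts density and color for volume rendering; layer sizes and feature dimensions below are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class ConditionedNeRFMLP(nn.Module):
    """Predicts (density, rgb) for a 3D point, conditioned on global and local image features."""
    def __init__(self, d_global=768, d_local=256, d_hidden=256):
        super().__init__()
        # Input: xyz (3) + view direction (3) + global ViT feature + local feature.
        self.net = nn.Sequential(
            nn.Linear(3 + 3 + d_global + d_local, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, 4),  # sigma + rgb
        )

    def forward(self, xyz, viewdir, global_feat, local_feat):
        h = self.net(torch.cat([xyz, viewdir, global_feat, local_feat], dim=-1))
        sigma = torch.relu(h[..., :1])    # non-negative density
        rgb = torch.sigmoid(h[..., 1:])   # colors in [0, 1]
        return sigma, rgb
```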
- Ray Priors through Reprojection: Improving Neural Radiance Fields for Novel View Extrapolation [35.47411859184933]
We study the novel view extrapolation setting, in which (1) the training images describe an object well, and (2) there is a notable discrepancy between the distributions of training and test viewpoints.
We propose a random ray casting policy that allows unseen views to be trained using seen views.
A ray atlas pre-computed from the observed rays' viewing directions could further enhance the rendering quality for extrapolated views.
arXiv Detail & Related papers (2022-05-12T07:21:17Z)
- Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs [54.41204057689033]
We explore how to leverage neural fields (NeRFs) to build interactive 3D environments from large-scale visual captures spanning buildings or even multiple city blocks collected primarily from drone data.
In contrast to the single object scenes against which NeRFs have been traditionally evaluated, this setting poses multiple challenges.
We introduce a simple clustering algorithm that partitions training images (or rather pixels) into different NeRF submodules that can be trained in parallel.
arXiv Detail & Related papers (2021-12-20T17:40:48Z)
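For Mega-NeRF above, a minimal sketch of the partitioning idea (the specific geometric rule is an assumption, not necessarily Mega-NeRF's exact one): assign each training pixel's ray to the spatial cell whose centroid it passes closest to, then train one NeRF submodule per cell in parallel.

```python
import torch

def assign_rays_to_submodules(rays_o, rays_d, centroids):
    """Assign each ray to the submodule whose cell centroid it passes closest to.

    rays_o, rays_d: [N, 3] ray origins and (unit) directions.
    centroids:      [M, 3] centroids of the spatial cells, one per NeRF submodule.
    Returns:        [N] index of the assigned submodule for each ray.
    """
    # Closest point on each ray to each centroid: o + max(0, (c - o) . d) d
    to_c = centroids[None, :, :] - rays_o[:, None, :]          # [N, M, 3]
    t = (to_c * rays_d[:, None, :]).sum(-1).clamp(min=0.0)     # [N, M]
    closest = rays_o[:, None, :] + t[..., None] * rays_d[:, None, :]
    dist = (closest - centroids[None, :, :]).norm(dim=-1)      # [N, M]
    return dist.argmin(dim=-1)

# Submodule k is then trained (in parallel) only on the rays with assignment == k.
```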
- Baking Neural Radiance Fields for Real-Time View Synthesis [41.07052395570522]
We present a method to train a NeRF, then precompute and store (i.e. "bake") it as a novel representation called a Sparse Neural Radiance Grid (SNeRG).
The resulting scene representation retains NeRF's ability to render fine geometric details and view-dependent appearance, is compact, and can be rendered in real-time.
arXiv Detail & Related papers (2021-03-26T17:59:52Z)
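For SNeRG above, a hedged sketch of the baking step: evaluate the trained NeRF on a dense voxel grid and keep only voxels above a density threshold so the grid can be stored sparsely. The real SNeRG also stores view-dependent feature vectors decoded by a tiny per-pixel MLP, which is omitted here; `nerf.query` is an assumed API returning density and diffuse color per point.

```python
import torch

@torch.no_grad()
def bake_to_sparse_grid(nerf, resolution=256, bounds=1.0, sigma_threshold=5.0, chunk=65536):
    """Sample a trained NeRF on a regular grid and keep only occupied voxels
    (a simplified, illustrative 'baking' pass)."""
    lin = torch.linspace(-bounds, bounds, resolution)
    grid = torch.stack(torch.meshgrid(lin, lin, lin, indexing="ij"), dim=-1).reshape(-1, 3)

    sigmas, rgbs = [], []
    for pts in grid.split(chunk):
        sigma, rgb = nerf.query(pts)       # assumed API: density + diffuse color
        sigmas.append(sigma)
        rgbs.append(rgb)
    sigma = torch.cat(sigmas).squeeze(-1)
    rgb = torch.cat(rgbs)

    occupied = sigma > sigma_threshold     # sparsity: store only non-empty voxels
    return {
        "indices": occupied.nonzero(as_tuple=False).squeeze(-1),  # flat voxel indices
        "sigma": sigma[occupied],
        "rgb": rgb[occupied],
        "resolution": resolution,
    }
```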
- pixelNeRF: Neural Radiance Fields from One or Few Images [20.607712035278315]
pixelNeRF is a learning framework that predicts a continuous neural scene representation conditioned on one or few input images.
We conduct experiments on ShapeNet benchmarks for single image novel view synthesis tasks with held-out objects.
In all cases, pixelNeRF outperforms current state-of-the-art baselines for novel view synthesis and single image 3D reconstruction.
arXiv Detail & Related papers (2020-12-03T18:59:54Z)
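For pixelNeRF above, the conditioning mechanism can be sketched as: project each query 3D point into the input view, bilinearly sample a pixel-aligned CNN feature there, and feed that feature (with the point and view direction) to the NeRF MLP. The variable names and camera convention below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def pixel_aligned_features(feat_map, points_cam, intrinsics):
    """Sample image features at the projection of 3D points (camera coordinates).

    feat_map:   [1, C, H, W] CNN feature map of the input image.
    points_cam: [N, 3] query points in the input camera's frame (z > 0 in front).
    intrinsics: [3, 3] pinhole intrinsics K.
    Returns:    [N, C] pixel-aligned features.
    """
    _, _, H, W = feat_map.shape
    uv = (intrinsics @ points_cam.T).T                  # [N, 3]
    uv = uv[:, :2] / uv[:, 2:3].clamp(min=1e-6)         # perspective divide -> pixels
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    uv = torch.stack([2 * uv[:, 0] / (W - 1) - 1,
                      2 * uv[:, 1] / (H - 1) - 1], dim=-1)
    grid = uv.view(1, -1, 1, 2)                         # [1, N, 1, 2]
    sampled = F.grid_sample(feat_map, grid, align_corners=True)  # [1, C, N, 1]
    return sampled[0, :, :, 0].T                        # [N, C]

# The NeRF MLP then takes (point, view direction, pixel-aligned feature) and
# predicts density and color, so one trained model can generalize across scenes.
```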
- D-NeRF: Neural Radiance Fields for Dynamic Scenes [72.75686949608624]
We introduce D-NeRF, a method that extends neural radiance fields to a dynamic domain.
D-NeRF reconstructs images of objects under rigid and non-rigid motions from a camera moving around the scene.
We demonstrate the effectiveness of our approach on scenes with objects under rigid, articulated and non-rigid motions.
arXiv Detail & Related papers (2020-11-27T19:06:50Z)
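D-NeRF handles motion by mapping each ray sample at time t into a canonical space before querying a single static NeRF there. A hedged sketch of such a time-conditioned deformation network is below; layer sizes are illustrative and positional encodings of x and t are omitted.

```python
import torch
import torch.nn as nn

class DeformationField(nn.Module):
    """Time-conditioned deformation: (x, t) -> displacement into canonical space."""
    def __init__(self, d_hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, 3),
        )

    def forward(self, x, t):
        # x: [N, 3] sample points, t: [N, 1] timestamps in [0, 1].
        dx = self.net(torch.cat([x, t], dim=-1))
        return x + dx   # corresponding point in the canonical (static) space

# Rendering at time t (sketch): deform each ray sample into the canonical space,
# then query the static NeRF there for density and color.
```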
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality or accuracy of this information and is not responsible for any consequences of its use.