ProbNeRF: Uncertainty-Aware Inference of 3D Shapes from 2D Images
- URL: http://arxiv.org/abs/2210.17415v1
- Date: Thu, 27 Oct 2022 22:35:24 GMT
- Title: ProbNeRF: Uncertainty-Aware Inference of 3D Shapes from 2D Images
- Authors: Matthew D. Hoffman, Tuan Anh Le, Pavel Sountsov, Christopher Suter,
Ben Lee, Vikash K. Mansinghka, Rif A. Saurous
- Abstract summary: Conditional neural radiance field (NeRF) models can learn to infer good point estimates of 3D models from single 2D images.
ProbNeRF is trained as a variational autoencoder, but at test time we use Hamiltonian Monte Carlo (HMC) for inference.
We show that key to the success of ProbNeRF are (i) a deterministic rendering scheme, (ii) an annealed-HMC strategy, (iii) a hypernetwork-based decoder architecture, and (iv) doing inference over a full set of NeRF weights.
- Score: 19.423108873761972
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The problem of inferring object shape from a single 2D image is
underconstrained. Prior knowledge about what objects are plausible can help,
but even given such prior knowledge there may still be uncertainty about the
shapes of occluded parts of objects. Recently, conditional neural radiance
field (NeRF) models have been developed that can learn to infer good point
estimates of 3D models from single 2D images. The problem of inferring
uncertainty estimates for these models has received less attention. In this
work, we propose probabilistic NeRF (ProbNeRF), a model and inference strategy
for learning probabilistic generative models of 3D objects' shapes and
appearances, and for doing posterior inference to recover those properties from
2D images. ProbNeRF is trained as a variational autoencoder, but at test time
we use Hamiltonian Monte Carlo (HMC) for inference. Given one or a few 2D
images of an object (which may be partially occluded), ProbNeRF is able not
only to accurately model the parts it sees, but also to propose realistic and
diverse hypotheses about the parts it does not see. We show that key to the
success of ProbNeRF are (i) a deterministic rendering scheme, (ii) an
annealed-HMC strategy, (iii) a hypernetwork-based decoder architecture, and
(iv) doing inference over a full set of NeRF weights, rather than just a
low-dimensional code.
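Of the four ingredients above, the annealed-HMC strategy (ii) is the easiest to illustrate in isolation. Below is a minimal NumPy sketch of the general pattern, assuming a toy Gaussian prior and likelihood in place of a NeRF decoder: the likelihood term is tempered by a coefficient beta annealed from 0 to 1, so early transitions explore the prior and later ones concentrate on the posterior. The target, schedule, and step sizes are illustrative assumptions, not ProbNeRF's actual configuration (which runs HMC over the full set of NeRF weights).
```python
import numpy as np

rng = np.random.default_rng(0)

def log_prior(z):
    return -0.5 * np.sum(z ** 2)                      # standard normal prior

def log_lik(z):
    return -0.5 * np.sum((z - 3.0) ** 2) / 0.25       # toy Gaussian likelihood

def log_tempered(z, beta):
    return log_prior(z) + beta * log_lik(z)           # annealed target

def grad_log_tempered(z, beta):
    return -z + beta * (-(z - 3.0) / 0.25)

def hmc_step(z, beta, step_size=0.1, n_leapfrog=10):
    """One HMC transition targeting p(z) * p(x|z)**beta."""
    p = rng.standard_normal(z.shape)
    z_new, p_new = z.copy(), p.copy()
    # Leapfrog integration of Hamiltonian dynamics.
    p_new += 0.5 * step_size * grad_log_tempered(z_new, beta)
    for _ in range(n_leapfrog - 1):
        z_new += step_size * p_new
        p_new += step_size * grad_log_tempered(z_new, beta)
    z_new += step_size * p_new
    p_new += 0.5 * step_size * grad_log_tempered(z_new, beta)
    # Metropolis accept/reject corrects the discretization error.
    log_accept = (log_tempered(z_new, beta) - 0.5 * np.sum(p_new ** 2)
                  - log_tempered(z, beta) + 0.5 * np.sum(p ** 2))
    return z_new if np.log(rng.uniform()) < log_accept else z

z = rng.standard_normal(2)                  # stand-in for the NeRF weights
for beta in np.linspace(0.0, 1.0, 50):      # anneal: prior -> posterior
    for _ in range(5):
        z = hmc_step(z, beta)
print("posterior sample:", z)
```
The motivation for annealing is that a chain started cold at beta = 1 tends to get stuck in one posterior mode, whereas warming up from the prior lets it reach diverse hypotheses, which is the behavior the abstract emphasizes for unseen object parts.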
Related papers
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
arXiv Detail & Related papers (2024-06-17T21:15:13Z)
- Probing the 3D Awareness of Visual Foundation Models [56.68380136809413]
We analyze the 3D awareness of visual foundation models.
We conduct experiments using task-specific probes and zero-shot inference procedures on frozen features.
arXiv Detail & Related papers (2024-04-12T17:58:04Z)
- MultiPlaneNeRF: Neural Radiance Field with Non-Trainable Representation [11.049528513775968]
NeRF is a popular model that efficiently represents 3D objects from 2D images.
We present MultiPlaneNeRF -- a model that simultaneously solves the above problems.
arXiv Detail & Related papers (2023-05-17T21:27:27Z)
- MoDA: Modeling Deformable 3D Objects from Casual Videos [84.29654142118018]
We propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation without skin-collapsing artifacts.
To register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings that encode 3D points within the canonical space.
Our approach can reconstruct 3D models for humans and animals with better qualitative and quantitative performance than state-of-the-art methods.
arXiv Detail & Related papers (2023-04-17T13:49:04Z)
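For context on the MoDA entry above: NeuDBS builds on classic dual quaternion skinning (DQS), in which per-bone rigid transforms are encoded as unit dual quaternions, blended linearly with skinning weights, renormalized, and applied to each point. The NumPy sketch below shows only that textbook construction, not MoDA's neural variant; the bone poses and weights are made up.
```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def dq_from_rt(q, t):
    """Unit dual quaternion (q_r, q_d) for rotation q and translation t."""
    return q, 0.5 * qmul(np.array([0.0, *t]), q)

def dqs_point(point, bones, weights):
    """Deform one point by blending bone dual quaternions (classic DQS)."""
    ref = bones[0][0]
    qr, qd = np.zeros(4), np.zeros(4)
    for (r, d), w in zip(bones, weights):
        s = 1.0 if np.dot(ref, r) >= 0 else -1.0    # keep same hemisphere
        qr += w * s * r
        qd += w * s * d
    n = np.linalg.norm(qr)                           # renormalizing is what
    qr, qd = qr / n, qd / n                          # avoids collapse artifacts
    t = 2.0 * qmul(qd, qconj(qr))[1:]                # recover translation
    p = np.array([0.0, *point])
    return qmul(qmul(qr, p), qconj(qr))[1:] + t      # rotate, then translate

# Two hypothetical bones: identity, and a 90-degree rotation about z.
identity = dq_from_rt(np.array([1.0, 0.0, 0.0, 0.0]), np.zeros(3))
rot_z = dq_from_rt(np.array([np.cos(np.pi/4), 0.0, 0.0, np.sin(np.pi/4)]),
                   np.array([0.0, 1.0, 0.0]))
print(dqs_point(np.array([1.0, 0.0, 0.0]), [identity, rot_z], [0.5, 0.5]))
```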
- Likelihood-Based Generative Radiance Field with Latent Space Energy-Based Model for 3D-Aware Disentangled Image Representation [43.41596483002523]
We propose a likelihood-based top-down 3D-aware 2D image generative model that incorporates 3D representation via Neural Radiance Fields (NeRF) and 2D imaging process via differentiable volume rendering.
Experiments on several benchmark datasets demonstrate that the NeRF-LEBM can infer 3D object structures from 2D images, generate 2D images with novel views and objects, learn from incomplete 2D images, and learn from 2D images with known or unknown camera poses.
arXiv Detail & Related papers (2023-04-16T23:44:41Z)
- HoloDiffusion: Training a 3D Diffusion Model using 2D Images [71.1144397510333]
We introduce a new diffusion setup that can be trained, end-to-end, with only posed 2D images for supervision.
We show that our diffusion models are scalable, train robustly, and are competitive in terms of sample quality and fidelity to existing approaches for 3D generative modeling.
arXiv Detail & Related papers (2023-03-29T07:35:56Z)
- FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models [21.523836478458524]
Recent works on generalizable NeRFs have shown promising results on novel view synthesis from single or few images.
We propose a novel framework named FeatureNeRF to learn generalizable NeRFs by distilling pre-trained vision models.
Our experiments demonstrate the effectiveness of FeatureNeRF as a generalizable 3D semantic feature extractor.
arXiv Detail & Related papers (2023-03-22T17:57:01Z)
- NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion [107.67277084886929]
Novel view synthesis from a single image requires inferring occluded regions of objects and scenes while maintaining semantic and physical consistency with the input.
We propose NerfDiff, which addresses this issue by distilling the knowledge of a 3D-aware conditional diffusion model (CDM) into NeRF through synthesizing and refining a set of virtual views at test time.
We further propose a novel NeRF-guided distillation algorithm that simultaneously generates 3D consistent virtual views from the CDM samples, and finetunes the NeRF based on the improved virtual views.
arXiv Detail & Related papers (2023-02-20T17:12:00Z)
- Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian [58.704089101826774]
We present a 3D-aware image deformation method with minimal restrictions on shape category and deformation type.
We take a supervised learning-based approach to predict the shape Laplacian of the underlying volume of a 3D reconstruction represented as a point cloud.
In the experiments, we present our results of deforming 2D character and clothed human images.
arXiv Detail & Related papers (2022-03-29T04:57:18Z)
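For context on the Pop-Out Motion entry above: a "shape Laplacian" is classically a graph Laplacian L = D - W built over the point cloud. The sketch below shows the standard k-nearest-neighbor construction that the paper's network is trained to approximate; the neighborhood size and Gaussian bandwidth are arbitrary illustrative choices.
```python
import numpy as np
from scipy.spatial import cKDTree

def knn_graph_laplacian(points, k=8, sigma=0.1):
    """Graph Laplacian L = D - W over a point cloud, with Gaussian weights
    on k-nearest-neighbor edges (k and sigma are illustrative choices)."""
    n = len(points)
    dists, idx = cKDTree(points).query(points, k=k + 1)  # [:, 0] is the point itself
    W = np.zeros((n, n))
    for i in range(n):
        for d, j in zip(dists[i, 1:], idx[i, 1:]):
            w = np.exp(-d ** 2 / (2.0 * sigma ** 2))
            W[i, j] = W[j, i] = max(W[i, j], w)          # symmetrize
    return np.diag(W.sum(axis=1)) - W

pts = np.random.default_rng(0).uniform(-1.0, 1.0, size=(200, 3))  # toy cloud
L = knn_graph_laplacian(pts)
print(L.shape, np.allclose(L @ np.ones(len(pts)), 0.0))  # Laplacian rows sum to zero
```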
- NeRF-Pose: A First-Reconstruct-Then-Regress Approach for Weakly-supervised 6D Object Pose Estimation [44.42449011619408]
We present a weakly-supervised reconstruction-based pipeline, named NeRF-Pose, which needs only 2D object segmentation and known relative camera poses during training.
A NeRF-enabled RANSAC algorithm is used to estimate stable and accurate poses from the predicted correspondences.
Experiments on LineMod-Occlusion show that the proposed method achieves state-of-the-art accuracy compared with the best 6D pose estimation methods.
arXiv Detail & Related papers (2022-03-09T15:28:02Z)
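A generic stand-in for the robust pose-fitting step in the NeRF-Pose entry above is OpenCV's RANSAC-wrapped PnP solver, which fits a 6D pose to 2D-3D correspondences while rejecting outliers. The sketch below runs it on synthetic correspondences; the intrinsics, ground-truth pose, and noise are all made up, and this is not the paper's NeRF-enabled algorithm.
```python
import numpy as np
import cv2  # opencv-python

rng = np.random.default_rng(0)
K = np.array([[500.0, 0.0, 320.0],        # made-up pinhole intrinsics
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# Synthetic stand-ins for predicted 2D-3D correspondences: 3D points in the
# object frame, projected under a known ground-truth pose, with outliers.
obj_pts = rng.uniform(-1.0, 1.0, size=(100, 3))
rvec_gt = np.array([0.1, -0.2, 0.3])
tvec_gt = np.array([0.0, 0.0, 4.0])
img_pts, _ = cv2.projectPoints(obj_pts, rvec_gt, tvec_gt, K, None)
img_pts = img_pts.reshape(-1, 2)
img_pts[:20] += rng.uniform(-50.0, 50.0, size=(20, 2))  # corrupt 20 matches

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    obj_pts.astype(np.float32), img_pts.astype(np.float32), K, None,
    reprojectionError=3.0)
print(ok, rvec.ravel(), tvec.ravel(), len(inliers))     # pose near ground truth
```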
- FiG-NeRF: Figure-Ground Neural Radiance Fields for 3D Object Category Modelling [11.432178728985956]
We use Neural Radiance Fields (NeRF) to learn high-quality 3D object category models from collections of input images.
We show that this method can learn accurate 3D object category models using only photometric supervision and casually captured images.
arXiv Detail & Related papers (2021-04-17T01:38:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.