Learning Robust Multi-Scale Representation for Neural Radiance Fields
from Unposed Images
- URL: http://arxiv.org/abs/2311.04521v1
- Date: Wed, 8 Nov 2023 08:18:23 GMT
- Authors: Nishant Jain, Suryansh Kumar, Luc Van Gool
- Abstract summary: We present an improved solution to the neural image-based rendering problem in computer vision.
The proposed approach can synthesize a realistic image of the scene from a novel viewpoint at test time.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce an improved solution to the neural image-based rendering
problem in computer vision. Given a set of images taken from a freely moving
camera at train time, the proposed approach can synthesize a realistic image of
the scene from a novel viewpoint at test time. The key ideas presented in this
paper are: (i) recovering accurate camera parameters from unposed day-to-day
images via a robust pipeline is just as crucial to the neural novel-view
synthesis problem; (ii) it is more practical to model an object's content at
multiple resolutions, since dramatic camera motion is highly likely in
day-to-day unposed captures. To incorporate these ideas, we leverage the
fundamentals of scene rigidity, multi-scale neural scene representation, and
single-image depth prediction. Concretely, the proposed approach makes the
camera parameters learnable within a neural-fields modeling framework. Assuming
per-view depth predictions are available up to scale, we constrain the relative
pose between successive frames. From these relative poses, absolute camera
poses are estimated via graph-neural-network-based multiple motion averaging
inside the multi-scale neural-fields network, yielding a single loss function.
Optimizing this loss recovers camera intrinsics, extrinsics, and image
renderings from unposed images. We demonstrate, with examples, that a unified
framework for accurately modeling a multi-scale neural scene representation
from day-to-day unposed multi-view images must also maintain precise
camera-pose estimates within the scene-representation framework; without
robustness measures in the camera-pose estimation pipeline, modeling
multi-scale aliasing artifacts can be counterproductive. We present extensive
experiments on several benchmark datasets to demonstrate the suitability of our
approach.
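To make the pipeline concrete, below is a minimal PyTorch-style sketch of two of the abstract's ingredients: camera parameters exposed as learnable tensors, and a single joint loss combining a photometric term with an up-to-scale monocular-depth prior. All names (`LearnableCameras`, `joint_loss`, the 0.1 depth weight) are illustrative assumptions, not the authors' implementation; the graph-neural-network motion averaging and the multi-scale representation are omitted.

```python
import torch
import torch.nn as nn


def hat(k: torch.Tensor) -> torch.Tensor:
    """Skew-symmetric matrix of a 3-vector (built differentiably)."""
    zero = k.new_zeros(())
    return torch.stack([
        torch.stack([zero, -k[2], k[1]]),
        torch.stack([k[2], zero, -k[0]]),
        torch.stack([-k[1], k[0], zero]),
    ])


def axis_angle_to_matrix(w: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: axis-angle 3-vector -> 3x3 rotation matrix."""
    theta = w.norm().clamp(min=1e-8)
    K = hat(w / theta)
    I = torch.eye(3, dtype=w.dtype)
    return I + torch.sin(theta) * K + (1.0 - torch.cos(theta)) * (K @ K)


class LearnableCameras(nn.Module):
    """Per-view extrinsics (axis-angle + translation) and a shared focal
    length, optimized jointly with the radiance field."""

    def __init__(self, num_views: int, init_focal: float = 500.0):
        super().__init__()
        self.rot = nn.Parameter(torch.zeros(num_views, 3))    # axis-angle
        self.trans = nn.Parameter(torch.zeros(num_views, 3))
        self.log_focal = nn.Parameter(torch.tensor(init_focal).log())
        # One free scale per view resolves the monocular-depth ambiguity.
        self.log_depth_scale = nn.Parameter(torch.zeros(num_views))

    def pose(self, i: int):
        return axis_angle_to_matrix(self.rot[i]), self.trans[i]


def joint_loss(rgb_pred, rgb_gt, depth_pred, mono_depth, log_depth_scale,
               depth_weight: float = 0.1):
    """Single objective: photometric error plus an up-to-scale depth prior.
    (The paper further refines absolute poses via GNN-based motion
    averaging, which is not sketched here.)"""
    photometric = (rgb_pred - rgb_gt).pow(2).mean()
    depth_prior = (depth_pred - log_depth_scale.exp() * mono_depth).abs().mean()
    return photometric + depth_weight * depth_prior


cams = LearnableCameras(num_views=30)
opt = torch.optim.Adam(cams.parameters(), lr=1e-3)
# Per iteration (with hypothetical rendered/ground-truth data): render rays
# using cams.pose(i), then
#   loss = joint_loss(rgb_pred, rgb_gt, depth_pred, mono_depth,
#                     cams.log_depth_scale[i])
#   loss.backward(); opt.step()
```

Because the camera parameters sit inside the same computation graph as the rendering loss, a single optimizer step updates intrinsics, extrinsics, and the scene representation together, which is the "single loss function" the abstract describes.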
Related papers
- RANRAC: Robust Neural Scene Representations via Random Ray Consensus [12.161889666145127]
RANdom RAy Consensus (RANRAC) is an efficient approach to eliminate the effect of inconsistent data.
We formulate a fuzzy adaptation of the RANSAC paradigm, enabling its application to large-scale models.
Results indicate significant improvements compared to state-of-the-art robust methods for novel-view synthesis.
arXiv Detail & Related papers (2023-12-15T13:33:09Z)
- Inverting the Imaging Process by Learning an Implicit Camera Model [73.81635386829846]
This paper proposes a novel implicit camera model which represents the physical imaging process of a camera as a deep neural network.
We demonstrate the power of this new implicit camera model on two inverse imaging tasks.
arXiv Detail & Related papers (2023-04-25T11:55:03Z)
- MELON: NeRF with Unposed Images in SO(3) [35.093700416540436]
Using a neural network to regularize pose estimation, we show that our method can reconstruct a neural radiance field from unposed images with state-of-the-art accuracy while requiring ten times fewer views than adversarial approaches.
arXiv Detail & Related papers (2023-03-14T17:33:39Z) - Robustifying the Multi-Scale Representation of Neural Radiance Fields [86.69338893753886]
We present a robust multi-scale neural radiance fields representation approach that handles two real-world imaging issues, multi-scale imaging effects and camera-pose estimation errors, with NeRF-inspired approaches.
We demonstrate, with examples, that for an accurate neural representation of an object from day-to-day acquired multi-view images, it is crucial to have precise camera-pose estimates.
arXiv Detail & Related papers (2022-10-09T11:46:45Z) - im2nerf: Image to Neural Radiance Field in the Wild [47.18702901448768]
im2nerf is a learning framework that predicts a continuous neural object representation given a single input image in the wild.
We show that im2nerf achieves state-of-the-art performance for novel-view synthesis from a single unposed image in the wild.
arXiv Detail & Related papers (2022-09-08T23:28:56Z)
- RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild [73.1276968007689]
We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object.
We show that our approach outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories.
arXiv Detail & Related papers (2022-08-11T17:59:59Z)
- DeepMultiCap: Performance Capture of Multiple Characters Using Sparse Multiview Cameras [63.186486240525554]
DeepMultiCap is a novel method for multi-person performance capture using sparse multi-view cameras.
Our method can capture time-varying surface details without the need for pre-scanned template models.
arXiv Detail & Related papers (2021-05-01T14:32:13Z)