Related papers: NeVStereo: A NeRF-Driven NVS-Stereo Architecture for High-Fidelity 3D Tasks

URL: http://arxiv.org/abs/2602.05423v1
Date: Thu, 05 Feb 2026 08:15:06 GMT
Title: NeVStereo: A NeRF-Driven NVS-Stereo Architecture for High-Fidelity 3D Tasks
Authors: Pengcheng Chen, Yue Hu, Wenhao Li, Nicole M Gunderson, Andrew Feng, Zhenglong Sun, Peter Beerel, Eric J Seibel
Abstract summary: We present NeVStereo, a NeRF-driven NVS-stereo architecture that aims to jointly deliver camera poses, multi-view depth, novel view synthesis, and surface reconstruction from RGB-only inputs. NeVStereo achieves consistently strong zero-shot performance, with up to 36% lower depth error, 10.4% improved pose accuracy, 4.5% higher NVS fidelity, and state-of-the-art mesh quality.
Score: 14.861893846625193
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: In modern dense 3D reconstruction, feed-forward systems (e.g., VGGT, pi3) focus on end-to-end matching and geometry prediction but do not explicitly produce novel view synthesis (NVS). Neural rendering-based approaches offer high-fidelity NVS and detailed geometry from posed images, yet they typically assume fixed camera poses and can be sensitive to pose errors. As a result, it remains non-trivial to obtain a single framework that offers accurate poses, reliable depth, high-quality rendering, and accurate 3D surfaces from casually captured views. We present NeVStereo, a NeRF-driven NVS-stereo architecture that aims to jointly deliver camera poses, multi-view depth, novel view synthesis, and surface reconstruction from multi-view RGB-only inputs. NeVStereo combines NeRF-based NVS for stereo-friendly renderings, confidence-guided multi-view depth estimation, NeRF-coupled bundle adjustment for pose refinement, and an iterative refinement stage that updates both depth and the radiance field to improve geometric consistency. This design mitigates common NeRF-based issues such as surface stacking, artifacts, and pose-depth coupling. Across indoor, outdoor, tabletop, and aerial benchmarks, our experiments indicate that NeVStereo achieves consistently strong zero-shot performance, with up to 36% lower depth error, 10.4% improved pose accuracy, 4.5% higher NVS fidelity, and state-of-the-art mesh quality (F1 91.93%, Chamfer 4.35 mm) compared to leading existing methods.
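The abstract reports mesh quality as an F1 score and a Chamfer distance but does not say how they are computed. As a rough illustration only, the sketch below uses one common convention for point sets sampled from the predicted and ground-truth surfaces: a symmetric mean nearest-neighbour Chamfer distance, and an F1 score at a distance threshold. The function names, the threshold `tau`, and the averaging convention are assumptions made for this example, not the paper's evaluation protocol.

```python
import math

def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbour in dst (brute force)."""
    return [min(math.dist(p, q) for q in dst) for p in src]

def chamfer_and_f1(pred, gt, tau):
    """Symmetric Chamfer distance and F1 at threshold tau (hypothetical convention)."""
    d_pred_to_gt = nearest_distances(pred, gt)  # accuracy direction
    d_gt_to_pred = nearest_distances(gt, pred)  # completeness direction
    # Average of the two mean nearest-neighbour distances.
    chamfer = 0.5 * (sum(d_pred_to_gt) / len(pred) + sum(d_gt_to_pred) / len(gt))
    # Precision: fraction of predicted points within tau of the ground truth.
    precision = sum(d <= tau for d in d_pred_to_gt) / len(pred)
    # Recall: fraction of ground-truth points within tau of the prediction.
    recall = sum(d <= tau for d in d_gt_to_pred) / len(gt)
    denom = precision + recall
    f1 = 0.0 if denom == 0 else 2 * precision * recall / denom
    return chamfer, f1

# Toy point sets (made-up coordinates, same unit as tau).
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]
chamfer, f1 = chamfer_and_f1(pred, gt, tau=0.1)
```

The brute-force search is quadratic in the number of points; real evaluations on dense meshes would normally use a spatial index such as a k-d tree, and different benchmarks use different thresholds and Chamfer variants.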

Related papers

FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction [69.63414788486578]
FreeSplatter is a scalable feed-forward framework that generates high-quality 3D Gaussians from uncalibrated sparse-view images. Our approach employs a streamlined transformer architecture where self-attention blocks facilitate information exchange. We develop two specialized variants--for object-centric and scene-level reconstruction--trained on comprehensive datasets.
arXiv Detail & Related papers (2024-12-12T18:52:53Z)
Towards Degradation-Robust Reconstruction in Generalizable NeRF [58.33351079982745]
Generalizable Radiance Field (GNeRF) across scenes has been proven to be an effective way to avoid per-scene optimization. There has been limited research on the robustness of GNeRFs to different types of degradation present in the source images.
arXiv Detail & Related papers (2024-11-18T16:13:47Z)
PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices. Our framework capitalizes on fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z)
NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images [62.752710734332894]
NeRSP is a Neural 3D reconstruction technique for Reflective surfaces with Sparse Polarized images. We derive photometric and geometric cues from the polarimetric image formation model and multiview azimuth consistency. We achieve the state-of-the-art surface reconstruction results with only 6 views as input.
arXiv Detail & Related papers (2024-06-11T09:53:18Z)
NoPose-NeuS: Jointly Optimizing Camera Poses with Neural Implicit Surfaces for Multi-view Reconstruction [0.0]
NoPose-NeuS is a neural implicit surface reconstruction method that extends NeuS to jointly optimize camera poses with the geometry and color networks. We show that the proposed method can estimate relatively accurate camera poses, while maintaining a high surface reconstruction quality with 0.89 mean Chamfer distance.
arXiv Detail & Related papers (2023-12-23T12:18:22Z)
FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis [30.25904672829623]
We propose FlipNeRF, a novel regularization method for few-shot novel view synthesis that utilizes our proposed flipped reflection rays. FlipNeRF is able to estimate more reliable outputs while effectively reducing floating artifacts across different scene structures.
arXiv Detail & Related papers (2023-06-30T15:11:00Z)
PlaNeRF: SVD Unsupervised 3D Plane Regularization for NeRF Large-Scale Scene Reconstruction [2.2369578015657954]
Neural Radiance Fields (NeRF) enable 3D scene reconstruction from 2D images and camera poses for Novel View Synthesis (NVS). NeRF often suffers from overfitting to training views, leading to poor geometry reconstruction. We propose a new method to improve NeRF's 3D structure using only RGB images and semantic maps.
arXiv Detail & Related papers (2023-05-26T13:26:46Z)
ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis [99.06490355990354]
We propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels. Our approach can considerably enhance model performance in sparse view conditions, achieving improvements of up to 94% in PSNR, in SSIM, and 31% in LPIPS.
arXiv Detail & Related papers (2023-05-18T15:18:01Z)
A Comparative Neural Radiance Field (NeRF) 3D Analysis of Camera Poses from HoloLens Trajectories and Structure from Motion [0.0]
We present a workflow for high-resolution 3D reconstructions almost directly from HoloLens data using Neural Radiance Fields (NeRFs). NeRFs are trained using a set of camera poses and associated images as input to estimate density and color values for each position. Results show that the internal camera poses lead to NeRF convergence with a PSNR of 25 dB with a simple rotation around the x-axis and enable a 3D reconstruction.
arXiv Detail & Related papers (2023-04-20T22:17:28Z)
D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry [57.5549733585324]
D3VO is a novel framework for monocular visual odometry that exploits deep networks on three levels -- deep depth, pose and uncertainty estimation. We first propose a novel self-supervised monocular depth estimation network trained on stereo videos without any external supervision. We model the photometric uncertainties of pixels on the input images, which improves the depth estimation accuracy.
arXiv Detail & Related papers (2020-03-02T17:47:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
