Aug3D: Augmenting large scale outdoor datasets for Generalizable Novel View Synthesis
- URL: http://arxiv.org/abs/2501.06431v1
- Date: Sat, 11 Jan 2025 04:13:26 GMT
- Title: Aug3D: Augmenting large scale outdoor datasets for Generalizable Novel View Synthesis
- Authors: Aditya Rauniyar, Omar Alama, Silong Yong, Katia Sycara, Sebastian Scherer
- Abstract summary: We train PixelNeRF, a feed-forward NVS model, on the large-scale UrbanScene3D dataset.
Aug3D generates well-conditioned novel views through grid and semantic sampling to enhance feed-forward NVS model learning.
Our experiments reveal that reducing the number of views per cluster from 20 to 10 improves PSNR by 10%, but the performance remains suboptimal.
- Score: 1.2420608329006513
- License:
- Abstract: Recent photorealistic Novel View Synthesis (NVS) advances have increasingly gained attention. However, these approaches remain constrained to small indoor scenes. While optimization-based NVS models have attempted to address this, generalizable feed-forward methods, offering significant advantages, remain underexplored. In this work, we train PixelNeRF, a feed-forward NVS model, on the large-scale UrbanScene3D dataset. We propose four training strategies to cluster and train on this dataset, highlighting that performance is hindered by limited view overlap. To address this, we introduce Aug3D, an augmentation technique that leverages reconstructed scenes using traditional Structure-from-Motion (SfM). Aug3D generates well-conditioned novel views through grid and semantic sampling to enhance feed-forward NVS model learning. Our experiments reveal that reducing the number of views per cluster from 20 to 10 improves PSNR by 10%, but the performance remains suboptimal. Aug3D further addresses this by combining the newly generated novel views with the original dataset, demonstrating its effectiveness in improving the model's ability to predict novel views.
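To make the grid-sampling idea in the abstract concrete, below is a minimal sketch (not the authors' code) of how candidate novel camera poses could be placed on a grid above an SfM point cloud and filtered for scene support before rendering. The function names, grid resolution, height offset, and support threshold are illustrative assumptions, and the paper's semantic sampling step is not shown.
```python
# Minimal sketch of Aug3D-style grid sampling of novel camera poses from an
# SfM reconstruction. Names and thresholds are hypothetical; the paper's
# actual pipeline (UrbanScene3D + PixelNeRF) is not reproduced here.
import numpy as np

def look_at(eye, target, up=np.array([0.0, 0.0, 1.0])):
    """Build a camera-to-world rotation looking from `eye` toward `target`.
    The degenerate straight-down case (forward parallel to up) is not handled."""
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    # Columns: right, up, -forward (OpenGL-style camera axes).
    return np.stack([right, true_up, -forward], axis=1)

def grid_sample_poses(points, grid_res=5, height_offset=30.0, min_support=100):
    """Place virtual cameras on an XY grid above the SfM point cloud and keep
    only poses whose footprint contains enough reconstructed points, a rough
    proxy for a 'well-conditioned' novel view."""
    xy_min, xy_max = points[:, :2].min(0), points[:, :2].max(0)
    xs = np.linspace(xy_min[0], xy_max[0], grid_res)
    ys = np.linspace(xy_min[1], xy_max[1], grid_res)
    ground_z = np.percentile(points[:, 2], 5)
    cell = (xy_max - xy_min) / grid_res
    poses = []
    for x in xs:
        for y in ys:
            # Count reconstructed points under this grid cell to check scene support.
            mask = (np.abs(points[:, 0] - x) < cell[0]) & (np.abs(points[:, 1] - y) < cell[1])
            if mask.sum() < min_support:
                continue
            eye = np.array([x, y, ground_z + height_offset])
            target = points[mask].mean(0)
            pose = np.eye(4)
            pose[:3, :3] = look_at(eye, target)
            pose[:3, 3] = eye
            poses.append(pose)
    return poses  # 4x4 camera-to-world matrices to render novel views from

# Example: a random sparse point cloud stands in for an SfM reconstruction.
points = np.random.rand(10000, 3) * np.array([100.0, 100.0, 20.0])
print(len(grid_sample_poses(points)), "candidate novel-view poses")
```
The point-support check here is only a rough stand-in for the paper's notion of a well-conditioned view; in practice, visibility and overlap with the existing training views would also need to be considered.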
Related papers
- See In Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization [14.239772421978373]
3D Gaussian Splatting (3DGS) has shown remarkable performance in novel view synthesis.
However, its rendering quality deteriorates with sparse input views, leading to distorted content and reduced details.
We propose a sparse-view 3DGS method for which incorporating prior information is crucial.
Our method outperforms state-of-the-art novel view synthesis approaches, achieving up to 0.4dB improvement in terms of PSNR on the LLFF dataset.
arXiv Detail & Related papers (2025-01-20T14:30:38Z) - Novel View Synthesis with Pixel-Space Diffusion Models [4.844800099745365]
Generative models are increasingly being employed in novel view synthesis (NVS).
We adapt a modern diffusion model architecture for end-to-end NVS in the pixel space.
We introduce a novel NVS training scheme that utilizes single-view datasets, capitalizing on their relative abundance.
arXiv Detail & Related papers (2024-11-12T12:58:33Z) - ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis [63.169364481672915]
We propose ViewCrafter, a novel method for synthesizing high-fidelity novel views of generic scenes from single or sparse images.
Our method takes advantage of the powerful generation capabilities of video diffusion models and the coarse 3D clues offered by point-based representations to generate high-quality video frames.
arXiv Detail & Related papers (2024-09-03T16:53:19Z) - Efficient Depth-Guided Urban View Synthesis [52.841803876653465]
We introduce Efficient Depth-Guided Urban View Synthesis (EDUS) for fast feed-forward inference and efficient per-scene fine-tuning.
EDUS exploits noisy predicted geometric priors as guidance to enable generalizable urban view synthesis from sparse input images.
Our results indicate that EDUS achieves state-of-the-art performance in sparse view settings when combined with fast test-time optimization.
arXiv Detail & Related papers (2024-07-17T08:16:25Z) - Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation [66.3814684757376]
This work presents Zero123-6D, the first to demonstrate the utility of diffusion-model-based novel-view synthesizers in enhancing RGB 6D pose estimation at the category level.
The outlined method reduces data requirements, removes the need for depth information in the zero-shot category-level 6D pose estimation task, and increases performance, as quantitatively demonstrated through experiments on the CO3D dataset.
arXiv Detail & Related papers (2024-03-21T10:38:18Z) - PNeRFLoc: Visual Localization with Point-based Neural Radiance Fields [54.8553158441296]
We propose a novel visual localization framework, i.e., PNeRFLoc, based on a unified point-based representation.
On the one hand, PNeRFLoc supports the initial pose estimation by matching 2D and 3D feature points.
On the other hand, it also enables pose refinement with novel view synthesis using rendering-based optimization.
arXiv Detail & Related papers (2023-12-17T08:30:00Z) - Re-Nerfing: Improving Novel View Synthesis through Novel View Synthesis [80.3686833921072]
Recent neural rendering and reconstruction techniques, such as NeRFs or Gaussian Splatting, have shown remarkable novel view synthesis capabilities.
With fewer images available, these methods start to fail since they can no longer correctly triangulate the underlying 3D geometry.
We propose Re-Nerfing, a simple and general add-on approach that leverages novel view synthesis itself to tackle this problem.
arXiv Detail & Related papers (2023-12-04T18:56:08Z) - Urban Radiance Fields [77.43604458481637]
We perform 3D reconstruction and novel view synthesis from data captured by scanning platforms commonly deployed for world mapping in urban outdoor environments.
Our approach extends Neural Radiance Fields, which has been demonstrated to synthesize realistic novel images for small scenes in controlled settings, with three extensions tailored to outdoor scanning data.
Each of these extensions provides significant performance improvements in experiments on Street View data.
arXiv Detail & Related papers (2021-11-29T15:58:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.