InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds
- URL: http://arxiv.org/abs/2403.20309v3
- Date: Tue, 20 Aug 2024 20:57:47 GMT
- Title: InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds
- Authors: Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang
- Abstract summary: Novel view synthesis (NVS) from a sparse set of images has advanced significantly in 3D computer vision.
It relies on precise initial estimation of camera parameters using Structure-from-Motion (SfM).
In this study, we introduce a novel and efficient framework to enable robust NVS from sparse-view images.
- Score: 91.77050739918037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While novel view synthesis (NVS) from a sparse set of images has advanced significantly in 3D computer vision, it relies on precise initial estimation of camera parameters using Structure-from-Motion (SfM). For instance, the recently developed Gaussian Splatting depends heavily on the accuracy of SfM-derived points and poses. However, SfM processes are time-consuming and often prove unreliable in sparse-view scenarios, where matched features are scarce, leading to accumulated errors and limited generalization capability across datasets. In this study, we introduce a novel and efficient framework to enable robust NVS from sparse-view images. Our framework, InstantSplat, integrates multi-view stereo (MVS) predictions with point-based representations to construct 3D Gaussians of large-scale scenes from sparse-view data within seconds, addressing the aforementioned performance and efficiency issues posed by SfM. Specifically, InstantSplat generates densely populated surface points across all training views and determines the initial camera parameters using pixel-alignment. Nonetheless, the MVS points are not globally accurate, and the pixel-wise prediction from all views results in an excessive Gaussian number, yielding an overparameterized scene representation that compromises both training speed and accuracy. To address this issue, we employ a grid-based, confidence-aware Farthest Point Sampling to strategically position point primitives at representative locations in parallel. Next, we enhance pose accuracy and tune scene parameters through a gradient-based joint optimization framework with self-supervision. By employing this simplified framework, InstantSplat achieves a substantial reduction in training time, from hours to mere seconds, and demonstrates robust performance across various numbers of views in diverse datasets.
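The abstract describes two core steps: a grid-based, confidence-aware Farthest Point Sampling over the dense MVS points, and a gradient-based joint optimization of camera poses and scene parameters driven by a self-supervised photometric loss. The sketch below illustrates these two ideas in PyTorch; it is not the authors' implementation, and the function names, grid resolution, and simplified translation-only pose parameterization are illustrative assumptions.

```python
# Minimal sketch (not the InstantSplat codebase) of two ideas from the abstract:
# (1) grid-based, confidence-aware farthest point sampling over dense MVS points,
# (2) gradient-based joint refinement of camera poses and Gaussian centers
#     against a photometric loss. Names and shapes are assumptions.
import torch

def confidence_aware_fps(points, confidence, grid_size=4, samples_per_cell=32):
    """Keep a representative subset of MVS points.

    The bounding box is split into grid_size^3 cells; each cell is an
    independent subproblem (hence parallelizable), seeded at its most
    confident point so low-confidence, redundant predictions are pruned.
    """
    kept = []
    mins, maxs = points.min(0).values, points.max(0).values
    cell = (points - mins) / (maxs - mins + 1e-8) * grid_size
    cell_id = cell.long().clamp(max=grid_size - 1)
    flat_id = cell_id[:, 0] * grid_size**2 + cell_id[:, 1] * grid_size + cell_id[:, 2]
    for cid in flat_id.unique():
        idx = (flat_id == cid).nonzero(as_tuple=True)[0]
        pts, conf = points[idx], confidence[idx]
        chosen = [conf.argmax().item()]                # seed at highest confidence
        dist = torch.full((len(idx),), float("inf"))
        for _ in range(min(samples_per_cell, len(idx)) - 1):
            dist = torch.minimum(dist, (pts - pts[chosen[-1]]).norm(dim=1))
            chosen.append(dist.argmax().item())        # classic FPS step
        kept.append(idx[torch.tensor(chosen)])
    return torch.cat(kept)

def joint_refine(render_fn, images, init_poses, init_means, steps=200, lr=1e-3):
    """Self-supervised joint refinement: both the (simplified) per-view poses
    and the Gaussian centers receive gradients from a photometric L1 loss
    between renderings and the training images."""
    poses = init_poses.clone().requires_grad_(True)
    means = init_means.clone().requires_grad_(True)
    opt = torch.optim.Adam([poses, means], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = sum((render_fn(p, means) - img).abs().mean()
                   for p, img in zip(poses, images))
        loss.backward()
        opt.step()
    return poses.detach(), means.detach()
```

In this reading, pruning the over-complete MVS point cloud keeps the Gaussian count manageable, while the joint refinement compensates for the fact that the MVS-derived points and poses are only approximately correct.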
Related papers
- GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views [67.34073368933814]
We propose a generalizable Gaussian Splatting approach for high-resolution image rendering under a sparse-view camera setting.
We train our Gaussian parameter regression module on human-only data or human-scene data, jointly with a depth estimation module to lift 2D parameter maps to 3D space.
Experiments on several datasets demonstrate that our method outperforms state-of-the-art methods while achieving superior rendering speed.
arXiv Detail & Related papers (2024-11-18T08:18:44Z) - LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images [7.363332481155945]
This paper presents a vision-based localization pipeline utilizing the 3D Gaussian Splatting (GS) technique as scene representation.
During the mapping phase, structure-from-motion (SfM) is applied first, followed by the generation of a GS map.
High-precision pose estimation is achieved through an analysis-by-synthesis approach on the map.
arXiv Detail & Related papers (2024-10-15T11:17:18Z) - MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields [73.49548565633123]
Radiance fields represented by 3D Gaussians excel at synthesizing novel views, offering both high training efficiency and fast rendering.
Existing methods often incorporate depth priors from dense estimation networks but overlook the inherent multi-view consistency in input images.
We propose a view synthesis framework based on 3D Gaussian Splatting, named MCGS, enabling scene reconstruction from sparse input views.
arXiv Detail & Related papers (2024-10-15T08:39:05Z) - LoopSparseGS: Loop Based Sparse-View Friendly Gaussian Splatting [18.682864169561498]
LoopSparseGS is a loop-based 3DGS framework for the sparse novel view synthesis task.
We introduce a novel Sparse-friendly Sampling (SFS) strategy to handle oversized Gaussian ellipsoids leading to large pixel errors.
Experiments on four datasets demonstrate that LoopSparseGS outperforms existing state-of-the-art methods for sparse-input novel view synthesis.
arXiv Detail & Related papers (2024-08-01T03:26:50Z) - MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo [54.00987996368157]
We present MVSGaussian, a new generalizable 3D Gaussian representation approach derived from Multi-View Stereo (MVS).
MVSGaussian achieves real-time rendering with better synthesis quality for each scene.
arXiv Detail & Related papers (2024-05-20T17:59:30Z) - MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images [102.7646120414055]
We introduce MVSplat, an efficient model that, given sparse multi-view images as input, predicts clean feed-forward 3D Gaussians.
On the large-scale RealEstate10K and ACID benchmarks, MVSplat achieves state-of-the-art performance with the fastest feed-forward inference speed (22 fps).
arXiv Detail & Related papers (2024-03-21T17:59:58Z) - FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting [58.41056963451056]
We propose a few-shot view synthesis framework based on 3D Gaussian Splatting.
This framework enables real-time and photo-realistic view synthesis with as few as three training views.
FSGS achieves state-of-the-art performance in both accuracy and rendering efficiency across diverse datasets.
arXiv Detail & Related papers (2023-12-01T09:30:02Z)