Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis
- URL: http://arxiv.org/abs/2411.00144v2
- Date: Fri, 22 Nov 2024 10:39:59 GMT
- Title: Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis
- Authors: Chen Zhao, Xuan Wang, Tong Zhang, Saqib Javed, Mathieu Salzmann
- Abstract summary: 3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness for novel view synthesis (NVS).
However, the 3DGS model tends to overfit when trained with sparse posed views, limiting its generalization ability to novel views.
We present a Self-Ensembling Gaussian Splatting (SE-GS) approach to alleviate the overfitting problem.
Our approach improves NVS quality with few-shot training views, outperforming existing state-of-the-art methods.
- Score: 55.561961365113554
- Abstract: 3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness for novel view synthesis (NVS). However, the 3DGS model tends to overfit when trained with sparse posed views, limiting its generalization ability to novel views. In this paper, we alleviate the overfitting problem, presenting a Self-Ensembling Gaussian Splatting (SE-GS) approach. Our method encompasses a $\mathbf{\Sigma}$-model and a $\mathbf{\Delta}$-model. The $\mathbf{\Sigma}$-model serves as an ensemble of 3DGS models that generates novel-view images during inference. We achieve self-ensembling by introducing an uncertainty-aware perturbation strategy at the training stage. We complement the $\mathbf{\Sigma}$-model with the $\mathbf{\Delta}$-model, which is dynamically perturbed based on the uncertainties of novel-view renderings across different training steps. The perturbation yields diverse temporal samples in the Gaussian parameter space without additional training costs. The geometry of the $\mathbf{\Sigma}$-model is regularized by penalizing discrepancies between the $\mathbf{\Sigma}$-model and these temporal samples. Therefore, our SE-GS conducts an effective and efficient regularization across a large number of 3DGS models, resulting in a robust ensemble, the $\mathbf{\Sigma}$-model. Our experimental results on the LLFF, Mip-NeRF360, DTU, and MVImgNet datasets show that our approach improves NVS quality with few-shot training views, outperforming existing state-of-the-art methods. The code is released at: https://sailor-z.github.io/projects/SEGS.html.
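For intuition, the following is a minimal, hedged sketch of the self-ensembling idea in PyTorch, not the authors' implementation: a flat tensor stands in for the 3DGS parameters, `render_loss` for the photometric objective, and the uncertainty proxy, noise model, and `lambda_reg` weight are all illustrative assumptions.

```python
# Toy sketch of SE-GS-style self-ensembling; NOT the authors' code.
import torch

torch.manual_seed(0)

n_params = 1024      # stand-in for the flattened Gaussian parameters
lambda_reg = 0.1     # weight of the discrepancy penalty (assumed)

sigma_params = torch.randn(n_params, requires_grad=True)   # Sigma-model
optimizer = torch.optim.Adam([sigma_params], lr=1e-2)
target = torch.randn(n_params)   # toy stand-in for the training-view signal

def render_loss(params):
    # Toy photometric loss; a real pipeline would rasterize the Gaussians.
    return torch.mean((params - target) ** 2)

for step in range(1000):
    optimizer.zero_grad()
    photometric = render_loss(sigma_params)

    # Uncertainty-aware perturbation: parameters deemed more uncertain
    # receive larger noise. This uncertainty proxy is made up for the demo.
    with torch.no_grad():
        uncertainty = torch.sigmoid(sigma_params.abs())
        delta_params = sigma_params + uncertainty * torch.randn(n_params)

    # Penalize discrepancies between the Sigma-model and the perturbed
    # temporal sample (Delta-model); no extra training pass is needed.
    discrepancy = torch.mean((sigma_params - delta_params) ** 2)

    (photometric + lambda_reg * discrepancy).backward()
    optimizer.step()
```

The point of the sketch is that the temporal samples cost no additional training passes: they are uncertainty-scaled noisy copies of the current parameters, and the discrepancy penalty favors parameter configurations that remain stable under such perturbations.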
Related papers
- NovelGS: Consistent Novel-view Denoising via Large Gaussian Reconstruction Model [57.92709692193132]
NovelGS is a diffusion model for Gaussian Splatting given sparse-view images.
We leverage novel-view denoising through a transformer-based network to generate 3D Gaussians.
arXiv Detail & Related papers (2024-11-25T07:57:17Z)
- GaussianSpa: An "Optimizing-Sparsifying" Simplification Framework for Compact and High-Quality 3D Gaussian Splatting [12.342660713851227]
3D Gaussian Splatting (3DGS) has emerged as a mainstream approach to novel view synthesis, leveraging continuous aggregations of Gaussian functions.
3DGS suffers from substantial memory requirements to store the multitude of Gaussians, hindering its practicality.
We introduce GaussianSpa, an optimization-based simplification framework for compact and high-quality 3DGS.
arXiv Detail & Related papers (2024-11-09T00:38:06Z)
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- Near-Optimal Streaming Heavy-Tailed Statistical Estimation with Clipped SGD [16.019880089338383]
We show that Clipped-SGD, for smooth and strongly convex objectives, achieves an error of order $\sqrt{\mathsf{Tr}(\Sigma)/T}$ with high probability, up to lower-order and logarithmic terms.
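As a reference point, here is a minimal sketch of the generic Clipped-SGD update on a toy strongly convex quadratic with heavy-tailed gradient noise; the clipping threshold, $1/t$ step size, and noise distribution are assumptions for illustration, not the paper's tuned analysis.

```python
# Generic Clipped-SGD on 0.5 * ||w - w_star||^2 with heavy-tailed noise.
import numpy as np

rng = np.random.default_rng(0)
dim, T = 10, 5000
w_star = rng.normal(size=dim)        # optimum of the toy quadratic
w = np.zeros(dim)
clip_level = 5.0                     # clipping threshold (assumed)

for t in range(1, T + 1):
    # Stochastic gradient corrupted by heavy-tailed (Student-t) noise.
    noise = rng.standard_t(df=2.1, size=dim)
    g = (w - w_star) + noise
    # Clipping step: rescale so the gradient norm never exceeds clip_level.
    g *= min(1.0, clip_level / (np.linalg.norm(g) + 1e-12))
    w -= (1.0 / t) * g               # 1/t schedule, per strong convexity

print("final error:", np.linalg.norm(w - w_star))
```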
arXiv Detail & Related papers (2024-10-26T10:14:17Z)
- MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views [27.47491233656671]
Novel View Synthesis (NVS) is a significant challenge in 3D vision applications.
We propose MVPGS, a few-shot NVS method that excavates multi-view priors based on 3D Gaussian Splatting.
Experiments show that the proposed method achieves state-of-the-art performance with real-time rendering speed.
arXiv Detail & Related papers (2024-09-22T05:07:20Z)
- Dynamic angular synchronization under smoothness constraints [9.196539011582361]
We find non-asymptotic recovery guarantees for the mean-squared error (MSE) under different statistical models.
We show that the MSE converges to zero as $T$ increases under milder conditions than in the static setting.
arXiv Detail & Related papers (2024-06-06T13:36:41Z)
- HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression [55.6351304553003]
3D Gaussian Splatting (3DGS) has emerged as a promising framework for novel view synthesis.
We propose a Hash-grid Assisted Context (HAC) framework for highly compact 3DGS representation.
Our work is the first to explore context-based compression for the 3DGS representation, achieving a remarkable size reduction of over $75\times$ compared to vanilla 3DGS.
arXiv Detail & Related papers (2024-03-21T16:28:58Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression, where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
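As a concrete instance of this setting, here is a minimal sketch of (sub)gradient descent on the empirical square loss of a single ReLU neuron; the planted-teacher data model, noise level, step size, and iteration count are illustrative assumptions, not the paper's setup.

```python
# Gradient descent on the square loss of a single ReLU neuron (toy setup).
import numpy as np

rng = np.random.default_rng(0)
dim, n = 20, 2000
w_teacher = rng.normal(size=dim)                 # planted teacher neuron
X = rng.normal(size=(n, dim))
y = np.maximum(X @ w_teacher, 0.0) + 0.1 * rng.normal(size=n)  # noisy labels

w = 0.1 * rng.normal(size=dim)   # small random init; w = 0 would stall ReLU
lr = 0.05
for _ in range(500):
    z = X @ w
    pred = np.maximum(z, 0.0)                    # ReLU(w . x)
    # (Sub)gradient of the empirical square loss; ReLU'(z) = 1{z > 0}.
    grad = (2.0 / n) * (X.T @ ((pred - y) * (z > 0)))
    w -= lr * grad

print("empirical square loss:", np.mean((np.maximum(X @ w, 0.0) - y) ** 2))
```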
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
- Using Deep Learning to Improve Ensemble Smoother: Applications to Subsurface Characterization [2.4373900721120285]
Ensemble smoother (ES) has been widely used in various research fields.
ES$_{\text{(DL)}}$ is an update scheme for ES in complex data assimilation applications.
We show that the DL-based ES method, that is, ES$_{\text{(DL)}}$, is more general and flexible.
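For context, here is a minimal sketch of the classic ensemble-smoother update that such DL-based variants build on; the toy forward model, ensemble size, and noise covariance are illustrative assumptions, and ES$_{\text{(DL)}}$ itself replaces parts of this pipeline with a deep network.

```python
# Vanilla ensemble-smoother (ES) update on a toy nonlinear forward model.
import numpy as np

rng = np.random.default_rng(0)
n_ens, n_param, n_obs = 100, 5, 3

def forward(m):
    # Toy nonlinear forward model mapping parameters to predicted data.
    return np.stack([m[:, 0] * m[:, 1], m[:, 2] ** 2, m[:, 3] + m[:, 4]], axis=1)

M = rng.normal(size=(n_ens, n_param))        # prior parameter ensemble
d_obs = np.array([1.0, 0.5, -0.3])           # observed data (made up)
R = 0.01 * np.eye(n_obs)                     # observation-error covariance

D = forward(M)                               # predicted data ensemble
Mc = M - M.mean(axis=0)                      # parameter anomalies
Dc = D - D.mean(axis=0)                      # data anomalies
C_md = Mc.T @ Dc / (n_ens - 1)               # param-data cross-covariance
C_dd = Dc.T @ Dc / (n_ens - 1)               # data auto-covariance

# Kalman-like gain and one-shot ensemble update toward perturbed observations.
K = C_md @ np.linalg.inv(C_dd + R)
d_pert = d_obs + rng.multivariate_normal(np.zeros(n_obs), R, size=n_ens)
M_post = M + (d_pert - D) @ K.T
```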
arXiv Detail & Related papers (2020-02-21T02:46:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.