Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs
- URL: http://arxiv.org/abs/2503.05082v1
- Date: Fri, 07 Mar 2025 01:59:05 GMT
- Title: Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs
- Authors: Yingji Zhong, Zhihao Li, Dave Zhenyu Chen, Lanqing Hong, Dan Xu,
- Abstract summary: We propose a reconstruction by generation pipeline that leverages learned priors from video diffusion models to provide plausible interpretations for regions outside the field of view or occluded.<n>We introduce a novel scene-grounding guidance based on rendered sequences from an optimized 3DGS, which tames the diffusion model to generate consistent sequences.<n>Our method significantly improves upon the baseline and achieves state-of-the-art performance on challenging benchmarks.
- Score: 28.381287866505637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite recent successes in novel view synthesis using 3D Gaussian Splatting (3DGS), modeling scenes with sparse inputs remains a challenge. In this work, we address two critical yet overlooked issues in real-world sparse-input modeling: extrapolation and occlusion. To tackle these issues, we propose to use a reconstruction by generation pipeline that leverages learned priors from video diffusion models to provide plausible interpretations for regions outside the field of view or occluded. However, the generated sequences exhibit inconsistencies that do not fully benefit subsequent 3DGS modeling. To address the challenge of inconsistencies, we introduce a novel scene-grounding guidance based on rendered sequences from an optimized 3DGS, which tames the diffusion model to generate consistent sequences. This guidance is training-free and does not require any fine-tuning of the diffusion model. To facilitate holistic scene modeling, we also propose a trajectory initialization method. It effectively identifies regions that are outside the field of view and occluded. We further design a scheme tailored for 3DGS optimization with generated sequences. Experiments demonstrate that our method significantly improves upon the baseline and achieves state-of-the-art performance on challenging benchmarks.
Related papers
- DropoutGS: Dropping Out Gaussians for Better Sparse-view Rendering [45.785618745095164]
3D Gaussian Splatting (3DGS) has demonstrated promising results in novel view synthesis.
As the number of training views decreases, the novel view synthesis task degrades to a highly under-determined problem.
We propose a Random Dropout Regularization (RDR) to exploit the advantages of low-complexity models to alleviate overfitting.
In addition, to remedy the lack of high-frequency details for these models, an Edge-guided Splitting Strategy (ESS) is developed.
arXiv Detail & Related papers (2025-04-13T09:17:21Z) - Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis [22.767866875051013]
We propose GS-Diff, a novel 3DGS framework guided by a multi-view diffusion model to address limitations of current methods.
By generating pseudo-observations conditioned on multi-view inputs, our method transforms under-constrained 3D reconstruction problems into well-posed ones.
Experiments on four benchmarks demonstrate that GS-Diff consistently outperforms state-of-the-art baselines by significant margins.
arXiv Detail & Related papers (2025-04-02T17:59:46Z) - EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization.
We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z) - NovelGS: Consistent Novel-view Denoising via Large Gaussian Reconstruction Model [57.92709692193132]
NovelGS is a diffusion model for Gaussian Splatting given sparse-view images.
We leverage the novel view denoising through a transformer-based network to generate 3D Gaussians.
arXiv Detail & Related papers (2024-11-25T07:57:17Z) - No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z) - LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors [34.91966359570867]
sparse-view reconstruction is inherently ill-posed and under-constrained.
We introduce LM-Gaussian, a method capable of generating high-quality reconstructions from a limited number of images.
Our approach significantly reduces the data acquisition requirements compared to previous 3DGS methods.
arXiv Detail & Related papers (2024-09-05T12:09:02Z) - GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z) - SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior [53.52396082006044]
Current methods struggle to maintain rendering quality at the viewpoint that deviates significantly from the training viewpoints.
This issue stems from the sparse training views captured by a fixed camera on a moving vehicle.
We propose a novel approach that enhances the capacity of 3DGS by leveraging prior from a Diffusion Model.
arXiv Detail & Related papers (2024-03-29T09:20:29Z) - GaussianPro: 3D Gaussian Splatting with Progressive Propagation [49.918797726059545]
3DGS relies heavily on the point cloud produced by Structure-from-Motion (SfM) techniques.
We propose a novel method that applies a progressive propagation strategy to guide the densification of the 3D Gaussians.
Our method significantly surpasses 3DGS on the dataset, exhibiting an improvement of 1.15dB in terms of PSNR.
arXiv Detail & Related papers (2024-02-22T16:00:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.