ProSplat: Improved Feed-Forward 3D Gaussian Splatting for Wide-Baseline Sparse Views
- URL: http://arxiv.org/abs/2506.07670v1
- Date: Mon, 09 Jun 2025 11:45:50 GMT
- Title: ProSplat: Improved Feed-Forward 3D Gaussian Splatting for Wide-Baseline Sparse Views
- Authors: Xiaohan Lu, Jiaye Fu, Jiaqi Zhang, Zetian Song, Chuanmin Jia, Siwei Ma,
- Abstract summary: ProSplat is a two-stage feed-forward framework for high-fidelity rendering under wide-baseline conditions.<n>ProSplat achieves an average improvement of 1 dB in PSNR compared to recent SOTA methods.
- Score: 31.282881864317375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feed-forward 3D Gaussian Splatting (3DGS) has recently demonstrated promising results for novel view synthesis (NVS) from sparse input views, particularly under narrow-baseline conditions. However, its performance significantly degrades in wide-baseline scenarios due to limited texture details and geometric inconsistencies across views. To address these challenges, in this paper, we propose ProSplat, a two-stage feed-forward framework designed for high-fidelity rendering under wide-baseline conditions. The first stage involves generating 3D Gaussian primitives via a 3DGS generator. In the second stage, rendered views from these primitives are enhanced through an improvement model. Specifically, this improvement model is based on a one-step diffusion model, further optimized by our proposed Maximum Overlap Reference view Injection (MORI) and Distance-Weighted Epipolar Attention (DWEA). MORI supplements missing texture and color by strategically selecting a reference view with maximum viewpoint overlap, while DWEA enforces geometric consistency using epipolar constraints. Additionally, we introduce a divide-and-conquer training strategy that aligns data distributions between the two stages through joint optimization. We evaluate ProSplat on the RealEstate10K and DL3DV-10K datasets under wide-baseline settings. Experimental results demonstrate that ProSplat achieves an average improvement of 1 dB in PSNR compared to recent SOTA methods.
Related papers
- Intern-GS: Vision Model Guided Sparse-View 3D Gaussian Splatting [95.61137026932062]
Intern-GS is a novel approach to enhance the process of sparse-view Gaussian splatting.<n>We show that Intern-GS achieves state-of-the-art rendering quality across diverse datasets.
arXiv Detail & Related papers (2025-05-27T05:17:49Z) - Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views [45.125032766506536]
We propose Sparse2DGS, an MVS-d Gaussian Splatting pipeline for complete and accurate reconstruction.<n>Our key insight is to incorporate the geometric-prioritized enhancement schemes, allowing for direct and robust geometric learning under ill-posed conditions.<n>Sparse2DGS outperforms existing methods by notable margins while being $2times$ faster than the NeRF-based fine-tuning approach.
arXiv Detail & Related papers (2025-04-29T02:47:02Z) - EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization.<n>We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z) - Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs [28.381287866505637]
We propose a reconstruction by generation pipeline that leverages learned priors from video diffusion models to provide plausible interpretations for regions outside the field of view or occluded.<n>We introduce a novel scene-grounding guidance based on rendered sequences from an optimized 3DGS, which tames the diffusion model to generate consistent sequences.<n>Our method significantly improves upon the baseline and achieves state-of-the-art performance on challenging benchmarks.
arXiv Detail & Related papers (2025-03-07T01:59:05Z) - CrossView-GS: Cross-view Gaussian Splatting For Large-scale Scene Reconstruction [5.528874948395173]
We propose a novel cross-view Gaussian Splatting method for large-scale scene reconstruction based on multi-branch construction and fusion.<n>Our method achieves superior performance in novel view synthesis compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-01-03T08:24:59Z) - MonoGSDF: Exploring Monocular Geometric Cues for Gaussian Splatting-Guided Implicit Surface Reconstruction [84.07233691641193]
We introduce MonoGSDF, a novel method that couples primitives with a neural Signed Distance Field (SDF) for high-quality reconstruction.<n>To handle arbitrary-scale scenes, we propose a scaling strategy for robust generalization.<n>Experiments on real-world datasets outperforms prior methods while maintaining efficiency.
arXiv Detail & Related papers (2024-11-25T20:07:07Z) - GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views [67.34073368933814]
We propose a generalizable Gaussian Splatting approach for high-resolution image rendering under a sparse-view camera setting.
We train our Gaussian parameter regression module on human-only data or human-scene data, jointly with a depth estimation module to lift 2D parameter maps to 3D space.
Experiments on several datasets demonstrate that our method outperforms state-of-the-art methods while achieving an exceeding rendering speed.
arXiv Detail & Related papers (2024-11-18T08:18:44Z) - CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes [53.107474952492396]
CityGaussianV2 is a novel approach for large-scale scene reconstruction.<n>We implement a decomposed-gradient-based densification and depth regression technique to eliminate blurry artifacts and accelerate convergence.<n>Our method strikes a promising balance between visual quality, geometric accuracy, as well as storage and training costs.
arXiv Detail & Related papers (2024-11-01T17:59:31Z) - Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis [25.924727931514735]
Generalizable 3DGS can reconstruct new scenes from sparse-view observations in a feed-forward inference manner.
Existing methods rely heavily on epipolar priors, which can be unreliable in complex realworld scenes.
We propose eFreeSplat, an efficient feed-forward 3DGS-based model for generalizable novel view synthesis.
arXiv Detail & Related papers (2024-10-30T08:51:29Z) - PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices.
Our framework capitalizes on fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z) - TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers [14.708092244093665]
We develop a strategy that utilizes a predicted depth confidence map to guide accurate local feature matching.
We present a novel G-3DGS method named TranSplat, which obtains the best performance on both the RealEstate10K and ACID benchmarks.
arXiv Detail & Related papers (2024-08-25T08:37:57Z) - GaussianPro: 3D Gaussian Splatting with Progressive Propagation [49.918797726059545]
3DGS relies heavily on the point cloud produced by Structure-from-Motion (SfM) techniques.
We propose a novel method that applies a progressive propagation strategy to guide the densification of the 3D Gaussians.
Our method significantly surpasses 3DGS on the dataset, exhibiting an improvement of 1.15dB in terms of PSNR.
arXiv Detail & Related papers (2024-02-22T16:00:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.