Related papers: G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior

G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior

URL: http://arxiv.org/abs/2510.12099v1
Date: Tue, 14 Oct 2025 03:06:28 GMT
Title: G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior
Authors: Junfeng Ni, Yixin Chen, Zhifei Yang, Yu Liu, Ruijie Lu, Song-Chun Zhu, Siyuan Huang,
Abstract summary: We identify accurate geometry as the fundamental prerequisite for effectively exploiting generative models to enhance 3D scene reconstruction.<n>We incorporate this geometry guidance throughout the generative pipeline to improve visibility mask estimation, guide novel view selection, and enhance multi-view consistency when inpainting with video diffusion models.<n>Our method naturally supports single-view inputs and unposed videos, with strong generalizability in both indoor and outdoor scenarios.
Score: 53.762256749551284
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite recent advances in leveraging generative prior from pre-trained diffusion models for 3D scene reconstruction, existing methods still face two critical limitations. First, due to the lack of reliable geometric supervision, they struggle to produce high-quality reconstructions even in observed regions, let alone in unobserved areas. Second, they lack effective mechanisms to mitigate multi-view inconsistencies in the generated images, leading to severe shape-appearance ambiguities and degraded scene geometry. In this paper, we identify accurate geometry as the fundamental prerequisite for effectively exploiting generative models to enhance 3D scene reconstruction. We first propose to leverage the prevalence of planar structures to derive accurate metric-scale depth maps, providing reliable supervision in both observed and unobserved regions. Furthermore, we incorporate this geometry guidance throughout the generative pipeline to improve visibility mask estimation, guide novel view selection, and enhance multi-view consistency when inpainting with video diffusion models, resulting in accurate and consistent scene completion. Extensive experiments on Replica, ScanNet++, and DeepBlending show that our method consistently outperforms existing baselines in both geometry and appearance reconstruction, particularly for unobserved regions. Moreover, our method naturally supports single-view inputs and unposed videos, with strong generalizability in both indoor and outdoor scenarios with practical real-world applicability. The project page is available at https://dali-jack.github.io/g4splat-web/.

Related papers

Visibility-Aware Densification for 3D Gaussian Splatting in Dynamic Urban Scenes [7.253732091582086]
VAD-GS is a 3DGS framework tailored for geometry recovery in challenging urban scenes.<n>Our method identifies unreliable geometry structures via voxel-based visibility reasoning.<n>It selects informative supporting views through diversity-aware view selection, and recovers missing structures via patch matching-based stereo reconstruction.
arXiv Detail & Related papers (2025-10-10T13:22:12Z)
OracleGS: Grounding Generative Priors for Sparse-View Gaussian Splatting [78.70702961852119]
OracleGS reconciles generative completeness with regressive fidelity for sparse view Gaussian Splatting.<n>Our approach conditions the powerful generative prior on multi-view geometric evidence, filtering hallucinatory artifacts while preserving plausible completions in under-constrained regions.
arXiv Detail & Related papers (2025-09-27T11:19:32Z)
Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image [10.648593818811976]
Existing approaches often rely on fine-tuning pretrained 2D diffusion models or directly generating 3D information through fast network inference.<n>We present a novel method that seamlessly integrates geometry and perception information without requiring additional model training.<n> Experimental results show that we outperform existing methods on novel view synthesis and 3D reconstruction, demonstrating robust and consistent 3D object generation.
arXiv Detail & Related papers (2025-06-26T11:22:06Z)
GausSurf: Geometry-Guided 3D Gaussian Splatting for Surface Reconstruction [79.42244344704154]
GausSurf employs geometry guidance from multi-view consistency in texture-rich areas and normal priors in texture-less areas of a scene.<n>Our method surpasses state-of-the-art methods in terms of reconstruction quality and computation time.
arXiv Detail & Related papers (2024-11-29T03:54:54Z)
MonoGSDF: Exploring Monocular Geometric Cues for Gaussian Splatting-Guided Implicit Surface Reconstruction [86.87464903285208]
We introduce MonoGSDF, a novel method that couples primitives with a neural Signed Distance Field (SDF) for high-quality reconstruction.<n>To handle arbitrary-scale scenes, we propose a scaling strategy for robust generalization.<n>Experiments on real-world datasets outperforms prior methods while maintaining efficiency.
arXiv Detail & Related papers (2024-11-25T20:07:07Z)
GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.<n>Our approach achieves State-Of-The-Art performance on the Occ3D-nuScenes dataset with the least image resolution needed and the most weightless image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images. We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage. We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
Improving Neural Indoor Surface Reconstruction with Mask-Guided Adaptive Consistency Constraints [0.6749750044497732]
We propose a two-stage training process, decouple view-dependent and view-independent colors, and leverage two novel consistency constraints to enhance detail reconstruction performance without requiring extra priors. Experiments on synthetic and real-world datasets show the capability of reducing the interference from prior estimation errors.
arXiv Detail & Related papers (2023-09-18T13:05:23Z)
DiViNeT: 3D Reconstruction from Disparate Views via Neural Template Regularization [7.488962492863031]
We present a volume rendering-based neural surface reconstruction method that takes as few as three disparate RGB images as input. Our key idea is to regularize the reconstruction, which is severely ill-posed and leaving significant gaps between the sparse views. Our approach achieves the best reconstruction quality among existing methods in the presence of such sparse views.
arXiv Detail & Related papers (2023-06-07T18:05:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.