MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models
- URL: http://arxiv.org/abs/2505.15185v1
- Date: Wed, 21 May 2025 07:03:16 GMT
- Title: MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models
- Authors: Yifan Liu, Keyu Fan, Weihao Yu, Chenxin Li, Hao Lu, Yixuan Yuan
- Abstract summary: We introduce MonoSplat, a novel framework that leverages rich visual priors from pre-trained monocular depth foundation models for robust Gaussian reconstruction. Our approach consists of two key components: a Mono-Multi Feature Adapter that transforms monocular features into multi-view representations, and an Integrated Gaussian Prediction module. We demonstrate that MonoSplat achieves superior reconstruction quality and generalization capability compared to existing methods.
- Score: 42.00619438358396
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in generalizable 3D Gaussian Splatting have demonstrated promising results in real-time high-fidelity rendering without per-scene optimization, yet existing approaches still struggle to handle unfamiliar visual content during inference on novel scenes due to limited generalizability. To address this challenge, we introduce MonoSplat, a novel framework that leverages rich visual priors from pre-trained monocular depth foundation models for robust Gaussian reconstruction. Our approach consists of two key components: a Mono-Multi Feature Adapter that transforms monocular features into multi-view representations, coupled with an Integrated Gaussian Prediction module that fuses both feature types for precise Gaussian generation. Through the Adapter's lightweight attention mechanism, features are aligned and aggregated across views while preserving valuable monocular priors, enabling the Prediction module to generate Gaussian primitives with accurate geometry and appearance. In extensive experiments on diverse real-world datasets, MonoSplat achieves superior reconstruction quality and generalization capability compared to existing methods while maintaining computational efficiency with minimal trainable parameters. Code is available at https://github.com/CUHK-AIM-Group/MonoSplat.
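Since the paper's code itself is not reproduced here, a rough sketch of this two-component pipeline may help: a lightweight cross-view attention adapter over frozen monocular features, followed by a head that regresses pixel-aligned Gaussian parameters. Everything below (module names, feature dimensions, the exact parameter layout) is an illustrative assumption, not the released implementation.

```python
# A minimal sketch of the two-component design described in the abstract.
# NOT the authors' implementation: names, dims, and parameter layout are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MonoMultiAdapter(nn.Module):
    """Aligns per-view monocular features across views with lightweight attention."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, V, N, C) -- V views, N tokens per view, e.g. taken from a
        # frozen monocular depth backbone (interface assumed for illustration).
        B, V, N, C = feats.shape
        tokens = feats.reshape(B, V * N, C)           # let every token attend across views
        fused, _ = self.attn(tokens, tokens, tokens)  # cross-view aggregation
        fused = self.norm(tokens + fused)             # residual keeps the monocular priors
        return fused.reshape(B, V, N, C)


class GaussianHead(nn.Module):
    """Regresses pixel-aligned Gaussian parameters from the fused features."""

    def __init__(self, dim: int):
        super().__init__()
        # Assumed layout: depth (1) + opacity (1) + scale (3) + quaternion (4) + RGB (3).
        self.proj = nn.Linear(dim, 12)

    def forward(self, fused: torch.Tensor) -> dict:
        p = self.proj(fused)
        return {
            "depth": p[..., 0:1].sigmoid(),                # normalized depth
            "opacity": p[..., 1:2].sigmoid(),
            "scale": p[..., 2:5].exp(),                    # positive scales
            "rotation": F.normalize(p[..., 5:9], dim=-1),  # unit quaternion
            "rgb": p[..., 9:12].sigmoid(),
        }


if __name__ == "__main__":
    adapter, head = MonoMultiAdapter(dim=256), GaussianHead(dim=256)
    feats = torch.randn(2, 3, 1024, 256)  # 2 scenes, 3 views, 1024 tokens each
    gaussians = head(adapter(feats))
    print({k: v.shape for k, v in gaussians.items()})
```

The residual connection in the adapter mirrors the abstract's claim that cross-view aggregation should preserve the monocular priors rather than overwrite them.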
Related papers
- MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction [13.941042770932794]
We present Multi-Baseline Gaussian Splatting (MuGS), a feed-forward approach for novel view synthesis. MuGS achieves state-of-the-art performance across multiple baseline settings and diverse scenarios.
arXiv Detail & Related papers (2025-08-06T10:34:24Z)
- CrossView-GS: Cross-view Gaussian Splatting For Large-scale Scene Reconstruction [5.528874948395173]
We propose a novel cross-view Gaussian Splatting method for large-scale scene reconstruction based on multi-branch construction and fusion. Our method achieves superior performance in novel view synthesis compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-01-03T08:24:59Z)
- MonoGSDF: Exploring Monocular Geometric Cues for Gaussian Splatting-Guided Implicit Surface Reconstruction [84.07233691641193]
We introduce MonoGSDF, a novel method that couples Gaussian primitives with a neural Signed Distance Field (SDF) for high-quality reconstruction. To handle arbitrary-scale scenes, we propose a scaling strategy for robust generalization. In experiments on real-world datasets, MonoGSDF outperforms prior methods while maintaining efficiency.
arXiv Detail & Related papers (2024-11-25T20:07:07Z)
- GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views [67.34073368933814]
We propose a generalizable Gaussian Splatting approach for high-resolution image rendering under a sparse-view camera setting.
We train our Gaussian parameter regression module on human-only or human-scene data, jointly with a depth estimation module that lifts 2D parameter maps into 3D space.
Experiments on several datasets demonstrate that our method outperforms state-of-the-art methods while achieving markedly faster rendering; a minimal sketch of the depth-based lifting step follows.
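As a rough illustration only (an assumption-laden sketch, not the GPS-Gaussian+ code): given a predicted depth map and camera intrinsics K, each pixel's 2D parameters can be placed in 3D by back-projecting the pixel along its camera ray.

```python
# Hypothetical sketch of lifting a 2D parameter map into 3D with predicted depth;
# the actual GPS-Gaussian+ lifting step may differ in detail.
import torch


def lift_to_3d(depth: torch.Tensor, K: torch.Tensor) -> torch.Tensor:
    """Back-project an (H, W) depth map to camera-space points via 3x3 intrinsics K."""
    H, W = depth.shape
    v, u = torch.meshgrid(
        torch.arange(H, dtype=depth.dtype),
        torch.arange(W, dtype=depth.dtype),
        indexing="ij",
    )
    pix = torch.stack([u, v, torch.ones_like(u)], dim=-1)  # homogeneous pixels (H, W, 3)
    rays = pix @ torch.linalg.inv(K).T                     # K^-1 [u, v, 1]^T per pixel
    return rays * depth.unsqueeze(-1)                      # scale each ray by its depth
```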
arXiv Detail & Related papers (2024-11-18T08:18:44Z)
- MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields [73.49548565633123]
Radiance fields represented by 3D Gaussians excel at synthesizing novel views, offering both high training efficiency and fast rendering.
Existing methods often incorporate depth priors from dense estimation networks but overlook the inherent multi-view consistency in input images.
We propose a view synthesis framework based on 3D Gaussian Splatting, named MCGS, enabling scene reconstruction from sparse input views.
arXiv Detail & Related papers (2024-10-15T08:39:05Z)
- HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction [46.269350101349715]
HiSplat is a novel framework for generalizable 3D Gaussian Splatting.
It generates hierarchical 3D Gaussians via a coarse-to-fine strategy.
It significantly enhances reconstruction quality and cross-dataset generalization.
arXiv Detail & Related papers (2024-10-08T17:59:32Z)
- MVGamba: Unify 3D Content Generation as State Space Sequence Modeling [150.80564081817786]
We introduce MVGamba, a general and lightweight Gaussian reconstruction model featuring a multi-view Gaussian reconstructor. With off-the-shelf multi-view diffusion models integrated, MVGamba unifies 3D generation tasks from a single image, sparse images, or text prompts. Experiments demonstrate that MVGamba outperforms state-of-the-art baselines in all 3D content generation scenarios with approximately only $0.1\times$ the model size.
arXiv Detail & Related papers (2024-06-10T15:26:48Z)
- MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo [54.00987996368157]
We present MVSGaussian, a new generalizable 3D Gaussian representation approach derived from Multi-View Stereo (MVS).
MVSGaussian achieves real-time rendering with better synthesis quality for each scene.
arXiv Detail & Related papers (2024-05-20T17:59:30Z)