MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction
- URL: http://arxiv.org/abs/2508.04297v1
- Date: Wed, 06 Aug 2025 10:34:24 GMT
- Title: MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction
- Authors: Yaopeng Lou, Liao Shen, Tianqi Liu, Jiaqi Li, Zihao Huang, Huiqiang Sun, Zhiguo Cao
- Abstract summary: We present Multi-Baseline Gaussian Splatting (MuGS), a feed-forward approach for novel view synthesis. MuGS achieves state-of-the-art performance across multiple baseline settings and diverse scenarios.
- Score: 13.941042770932794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Multi-Baseline Gaussian Splatting (MuGS), a generalized feed-forward approach for novel view synthesis that effectively handles diverse baseline settings, including sparse input views with both small and large baselines. Specifically, we integrate features from Multi-View Stereo (MVS) and Monocular Depth Estimation (MDE) to enhance feature representations for generalizable reconstruction. Next, we propose a projection-and-sampling mechanism for deep depth fusion, which constructs a fine probability volume to guide the regression of the feature map. Furthermore, we introduce a reference-view loss to improve geometry and optimization efficiency. We leverage 3D Gaussian representations to accelerate training and inference while enhancing rendering quality. MuGS achieves state-of-the-art performance across multiple baseline settings and diverse scenarios, ranging from simple objects (DTU) to complex indoor and outdoor scenes (RealEstate10K). We also demonstrate promising zero-shot performance on the LLFF and Mip-NeRF 360 datasets.
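The "projection-and-sampling" fusion is easiest to see in code. Below is a minimal PyTorch sketch of the generic MVS mechanism the abstract alludes to: source-view features, already projected and sampled at D depth hypotheses, are scored by cross-view feature variance to build a cost volume, and a softmax over the depth axis yields a probability volume whose expectation is a differentiable depth estimate. Function names and tensor shapes are illustrative assumptions, not the MuGS implementation.

```python
import torch
import torch.nn.functional as F

def variance_cost_volume(ref_feat: torch.Tensor,
                         warped_src: torch.Tensor) -> torch.Tensor:
    """ref_feat: (B, C, H, W) reference-view features.
    warped_src: (B, V, C, D, H, W) source-view features after the
    projection-and-sampling step, i.e. warped onto D depth hypotheses.
    Returns a (B, D, H, W) matching-score volume (negative feature
    variance across views: consistent features => low variance)."""
    B, V, C, D, H, W = warped_src.shape
    ref = ref_feat.unsqueeze(1).unsqueeze(3).expand(B, 1, C, D, H, W)
    feats = torch.cat([ref, warped_src], dim=1)      # (B, V+1, C, D, H, W)
    var = feats.var(dim=1, unbiased=False)           # (B, C, D, H, W)
    return -var.mean(dim=1)                          # (B, D, H, W)

def expected_depth(scores: torch.Tensor, hyps: torch.Tensor) -> torch.Tensor:
    """Softmax over the depth axis turns scores into a probability
    volume; depth is its expectation (soft-argmax), so the estimate
    stays differentiable and can guide downstream regression."""
    prob = F.softmax(scores, dim=1)                  # (B, D, H, W)
    return (prob * hyps.view(1, -1, 1, 1)).sum(dim=1)  # (B, H, W)

# Toy usage with random tensors standing in for network features.
B, V, C, D, H, W = 1, 3, 8, 32, 64, 64
depth_hyps = torch.linspace(0.5, 10.0, D)
scores = variance_cost_volume(torch.randn(B, C, H, W),
                              torch.randn(B, V, C, D, H, W))
depth = expected_depth(scores, depth_hyps)           # (1, 64, 64)
```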
Related papers
- MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models [42.00619438358396]
We introduce MonoSplat, a novel framework that leverages rich visual priors from pre-trained monocular depth foundation models for robust Gaussian reconstruction. Our approach consists of two key components: a Mono-Multi Feature Adapter that transforms monocular features into multi-view representations, and an Integrated Gaussian Prediction module. We convincingly demonstrate that MonoSplat achieves superior reconstruction quality and generalization capability compared to existing methods.
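The summary above is terse, so here is a heavily hedged sketch of what a mono-to-multi feature adapter could look like: per-view features from a frozen monocular backbone attend to the other views' features via cross-view attention, so each view's features become multi-view aware. The class name, the use of `nn.MultiheadAttention`, and the shapes are assumptions for illustration; MonoSplat's actual adapter may differ.

```python
import torch
import torch.nn as nn

class MonoToMultiAdapter(nn.Module):
    """Illustrative adapter: lets each view's monocular features attend
    to all other views' features, producing multi-view-aware features.
    Input/output: (B, V, N, C) with N tokens (e.g. flattened patches)."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        B, V, N, C = feats.shape
        tokens = feats.reshape(B, V * N, C)          # pool all views' tokens
        out, _ = self.attn(tokens, tokens, tokens)   # cross-view exchange
        return self.norm(tokens + out).reshape(B, V, N, C)
```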
arXiv Detail & Related papers (2025-05-21T07:03:16Z)
- GBR: Generative Bundle Refinement for High-fidelity Gaussian Splatting and Meshing [27.747748706297497]
We propose GBR: Generative Bundle Refinement, a method for high-fidelity Gaussian splatting and meshing using only 4-6 input views. GBR integrates a neural bundle adjustment module to enhance geometry accuracy and a generative depth refinement module to improve geometry fidelity. GBR demonstrates the ability to reconstruct and render large-scale real-world scenes with remarkable detail using only 6 views.
arXiv Detail & Related papers (2024-12-08T12:00:25Z)
- MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields [73.49548565633123]
Radiance fields represented by 3D Gaussians excel at synthesizing novel views, offering both high training efficiency and fast rendering.
Existing methods often incorporate depth priors from dense estimation networks but overlook the inherent multi-view consistency in input images.
We propose a view synthesis framework based on 3D Gaussian Splatting, named MCGS, enabling scene reconstruction from sparse input views.
arXiv Detail & Related papers (2024-10-15T08:39:05Z)
- MVGamba: Unify 3D Content Generation as State Space Sequence Modeling [150.80564081817786]
We introduce MVGamba, a general and lightweight Gaussian reconstruction model featuring a multi-view Gaussian reconstructor. With off-the-shelf multi-view diffusion models integrated, MVGamba unifies 3D generation tasks from a single image, sparse images, or text prompts. Experiments demonstrate that MVGamba outperforms state-of-the-art baselines in all 3D content generation scenarios with only about $0.1\times$ the model size.
arXiv Detail & Related papers (2024-06-10T15:26:48Z)
- FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes [50.534213038479926]
FreeSplat is capable of reconstructing geometrically consistent 3D scenes from long-sequence input for free-view synthesis.
We propose a simple but effective free-view training strategy that ensures robust view synthesis across a broader view range regardless of the number of views.
arXiv Detail & Related papers (2024-05-28T08:40:14Z)
- MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo [54.00987996368157]
We present MVSGaussian, a new generalizable 3D Gaussian representation approach derived from Multi-View Stereo (MVS).
MVSGaussian achieves real-time rendering with better synthesis quality for each scene.
arXiv Detail & Related papers (2024-05-20T17:59:30Z)
- MuRF: Multi-Baseline Radiance Fields [117.55811938988256]
We present Multi-Baseline Radiance Fields (MuRF), a feed-forward approach to solving sparse view synthesis.
MuRF achieves state-of-the-art performance across multiple baseline settings.
We also show promising zero-shot generalization abilities on the Mip-NeRF 360 dataset.
arXiv Detail & Related papers (2023-12-07T18:59:56Z)
- FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting [58.41056963451056]
We propose a few-shot view synthesis framework based on 3D Gaussian Splatting.
This framework enables real-time and photo-realistic view synthesis with as few as three training views.
FSGS achieves state-of-the-art performance in both accuracy and rendering efficiency across diverse datasets.
arXiv Detail & Related papers (2023-12-01T09:30:02Z)
- AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network [8.127449025802436]
We present a novel recurrent multi-view stereo network based on long short-term memory (LSTM) with adaptive aggregation, namely AA-RMVSNet.
We first introduce an intra-view aggregation module to adaptively extract image features using context-aware convolution and multi-scale aggregation.
We propose an inter-view cost volume aggregation module for adaptive pixel-wise view aggregation, which preserves better-matched pairs among all views; a minimal sketch of this pixel-wise aggregation follows this entry.
arXiv Detail & Related papers (2021-08-09T06:10:48Z)
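As referenced in the AA-RMVSNet entry above, here is a hedged sketch of pixel-wise adaptive view aggregation: each source view contributes its own cost volume, a tiny gating network scores every (view, depth, pixel) location, and a softmax over the view axis weights the fusion so better-matched views dominate at each pixel. The gating design and shapes are illustrative assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn

class AdaptiveViewAggregation(nn.Module):
    """Fuses V per-view cost volumes (B, V, C, D, H, W) into a single
    (B, C, D, H, W) volume with learned pixel-wise view weights, so
    well-matched views dominate at each pixel and depth."""
    def __init__(self, channels: int):
        super().__init__()
        # Tiny 3D conv predicts one logit per view, depth and pixel.
        self.gate = nn.Conv3d(channels, 1, kernel_size=1)

    def forward(self, volumes: torch.Tensor) -> torch.Tensor:
        B, V, C, D, H, W = volumes.shape
        logits = self.gate(volumes.flatten(0, 1))            # (B*V, 1, D, H, W)
        weights = logits.view(B, V, 1, D, H, W).softmax(dim=1)
        return (weights * volumes).sum(dim=1)                # (B, C, D, H, W)

# Toy usage: fuse 3 source-view cost volumes.
agg = AdaptiveViewAggregation(channels=8)
fused = agg(torch.randn(1, 3, 8, 32, 64, 64))                # (1, 8, 32, 64, 64)
```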