SRSplat: Feed-Forward Super-Resolution Gaussian Splatting from Sparse Multi-View Images
- URL: http://arxiv.org/abs/2511.12040v1
- Date: Sat, 15 Nov 2025 05:17:44 GMT
- Title: SRSplat: Feed-Forward Super-Resolution Gaussian Splatting from Sparse Multi-View Images
- Authors: Xinyuan Hu, Changyue Shi, Chuxiao Yang, Minghao Chen, Jiajun Ding, Tao Wei, Chen Wei, Zhou Yu, Min Tan,
- Abstract summary: We propose SRSplat, a feed-forward framework that reconstructs high-resolution 3D scenes from only a few LR views. Our main insight is to compensate for the deficiency of texture information by jointly leveraging external high-quality reference images and internal texture cues.
- Score: 22.87137082795346
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feed-forward 3D reconstruction from sparse, low-resolution (LR) images is a crucial capability for real-world applications, such as autonomous driving and embodied AI. However, existing methods often fail to recover fine texture details. This limitation stems from the inherent lack of high-frequency information in LR inputs. To address this, we propose \textbf{SRSplat}, a feed-forward framework that reconstructs high-resolution 3D scenes from only a few LR views. Our main insight is to compensate for the deficiency of texture information by jointly leveraging external high-quality reference images and internal texture cues. We first construct a scene-specific reference gallery, generated for each scene using Multimodal Large Language Models (MLLMs) and diffusion models. To integrate this external information, we introduce the \textit{Reference-Guided Feature Enhancement (RGFE)} module, which aligns and fuses features from the LR input images and their reference twin image. Subsequently, we train a decoder to predict the Gaussian primitives using the multi-view fused feature obtained from \textit{RGFE}. To further refine predicted Gaussian primitives, we introduce \textit{Texture-Aware Density Control (TADC)}, which adaptively adjusts Gaussian density based on the internal texture richness of the LR inputs. Extensive experiments demonstrate that our SRSplat outperforms existing methods on various datasets, including RealEstate10K, ACID, and DTU, and exhibits strong cross-dataset and cross-resolution generalization capabilities.
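To make the Texture-Aware Density Control idea concrete, here is a minimal, hypothetical sketch of its core principle: allocate more Gaussian primitives where the LR input is texture-rich. The function names, the gradient-magnitude texture score, and the integer budget map are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def texture_richness(img: np.ndarray) -> np.ndarray:
    """Per-pixel texture score from local gradient magnitude (assumed proxy)."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

def adaptive_gaussian_budget(img: np.ndarray, base: int = 1, max_extra: int = 3) -> np.ndarray:
    """Assign more Gaussian primitives to texture-rich pixels.

    Returns an integer map: `base` Gaussians everywhere, plus up to
    `max_extra` in the richest regions (score normalized to [0, 1]).
    """
    score = texture_richness(img)
    span = score.max() - score.min()
    norm = (score - score.min()) / span if span > 0 else np.zeros_like(score)
    return base + np.round(norm * max_extra).astype(int)

# Toy LR "image": a flat left half and a randomly textured right half.
rng = np.random.default_rng(0)
img = np.zeros((8, 8))
img[:, 4:] = rng.random((8, 4))
budget = adaptive_gaussian_budget(img)
```

On this toy input, the flat region keeps the base budget of one Gaussian per pixel while the textured region receives up to four, mirroring the adaptive densification the abstract describes.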
Related papers
- MVGSR: Multi-View Consistent 3D Gaussian Super-Resolution via Epipolar Guidance [13.050002358238793]
We introduce Multi-View Consistent 3D Gaussian Splatting Super-Resolution (MVGSR). MVGSR focuses on integrating multi-view information for 3DGS rendering with high-frequency details and enhanced consistency. Our method achieves state-of-the-art performance on both object-centric and scene-level 3DGS SR benchmarks.
arXiv Detail & Related papers (2025-12-17T03:23:12Z) - SplatSuRe: Selective Super-Resolution for Multi-view Consistent 3D Gaussian Splatting [50.36978600976209]
A natural strategy is to apply super-resolution (SR) to low-resolution (LR) input views, but independently enhancing each image introduces multi-view inconsistencies. We propose SplatSuRe, a method that selectively applies SR content only in undersampled regions lacking high-frequency supervision. Across Tanks & Temples, Deep Blending and Mip-NeRF 360, our approach surpasses baselines in both fidelity and perceptual quality.
arXiv Detail & Related papers (2025-12-01T20:08:39Z) - IntrinsiX: High-Quality PBR Generation using Image Priors [49.90007540430264]
We introduce IntrinsiX, a novel method that generates high-quality intrinsic images from text description. In contrast to existing text-to-image models whose outputs contain baked-in scene lighting, our approach predicts physically-based rendering (PBR) maps.
arXiv Detail & Related papers (2025-04-01T17:47:48Z) - LLGS: Unsupervised Gaussian Splatting for Image Enhancement and Reconstruction in Pure Dark Environment [18.85235185556243]
We propose an unsupervised multi-view stereoscopic system based on 3D Gaussian Splatting. This system aims to enhance images in low-light environments while reconstructing the scene. Experiments conducted on real-world datasets demonstrate that our system outperforms state-of-the-art methods in both low-light enhancement and 3D Gaussian Splatting.
arXiv Detail & Related papers (2025-03-24T13:05:05Z) - FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction [69.63414788486578]
FreeSplatter is a scalable feed-forward framework that generates high-quality 3D Gaussians from uncalibrated sparse-view images. Our approach employs a streamlined transformer architecture where self-attention blocks facilitate information exchange. We develop two specialized variants, for object-centric and scene-level reconstruction, trained on comprehensive datasets.
arXiv Detail & Related papers (2024-12-12T18:52:53Z) - Enhanced Super-Resolution Training via Mimicked Alignment for Real-World Scenes [51.92255321684027]
We propose a novel plug-and-play module designed to mitigate misalignment issues by aligning LR inputs with HR images during training.
Specifically, our approach involves mimicking a novel LR sample that aligns with HR while preserving the characteristics of the original LR samples.
We comprehensively evaluate our method on synthetic and real-world datasets, demonstrating its effectiveness across a spectrum of SR models.
arXiv Detail & Related papers (2024-10-07T18:18:54Z) - MVGamba: Unify 3D Content Generation as State Space Sequence Modeling [150.80564081817786]
We introduce MVGamba, a general and lightweight Gaussian reconstruction model featuring a multi-view Gaussian reconstructor. With off-the-shelf multi-view diffusion models integrated, MVGamba unifies 3D generation tasks from a single image, sparse images, or text prompts. Experiments demonstrate that MVGamba outperforms state-of-the-art baselines in all 3D content generation scenarios with approximately only $0.1\times$ of the model size.
arXiv Detail & Related papers (2024-06-10T15:26:48Z) - SRGS: Super-Resolution 3D Gaussian Splatting [14.26021476067791]
We propose Super-Resolution 3D Gaussian Splatting (SRGS) to perform the optimization in a high-resolution (HR) space.
The sub-pixel constraint is introduced for the increased viewpoints in HR space, exploiting the sub-pixel cross-view information of the multiple low-resolution (LR) views.
Our method achieves high rendering quality on HRNVS only with LR inputs, outperforming state-of-the-art methods on challenging datasets such as Mip-NeRF 360 and Tanks & Temples.
arXiv Detail & Related papers (2024-04-16T06:58:30Z) - Towards Real-World Burst Image Super-Resolution: Benchmark and Method [93.73429028287038]
In this paper, we establish a large-scale real-world burst super-resolution dataset, i.e., RealBSR, to explore the faithful reconstruction of image details from multiple frames.
We also introduce a Federated Burst Affinity network (FBAnet) to investigate non-trivial pixel-wise displacement among images under real-world image degradation.
arXiv Detail & Related papers (2023-09-09T14:11:37Z) - Reference-based Image Super-Resolution with Deformable Attention Transformer [62.71769634254654]
RefSR aims to exploit auxiliary reference (Ref) images to super-resolve low-resolution (LR) images.
This paper proposes a deformable attention Transformer, namely DATSR, with multiple scales.
Experiments demonstrate that our DATSR achieves state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-07-25T07:07:00Z) - Deep Burst Super-Resolution [165.90445859851448]
We propose a novel architecture for the burst super-resolution task.
Our network takes multiple noisy RAW images as input, and generates a denoised, super-resolved RGB image as output.
In order to enable training and evaluation on real-world data, we additionally introduce the BurstSR dataset.
arXiv Detail & Related papers (2021-01-26T18:57:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.