Geometry-Consistent 4D Gaussian Splatting for Sparse-Input Dynamic View Synthesis
- URL: http://arxiv.org/abs/2511.23044v1
- Date: Fri, 28 Nov 2025 10:11:48 GMT
- Title: Geometry-Consistent 4D Gaussian Splatting for Sparse-Input Dynamic View Synthesis
- Authors: Yiwei Li, Jiannong Cao, Penghui Ruan, Divya Saxena, Songye Zhu, Yinfeng Cao,
- Abstract summary: GC-4DGS is a novel framework that infuses geometric consistency into 4D Gaussian Splatting (4DGS)<n>This paper presents GC-4DGS, a novel framework that infuses geometric consistency into 4D Gaussian Splatting (4DGS)
- Score: 17.560425604804305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian Splatting has been considered as a novel way for view synthesis of dynamic scenes, which shows great potential in AIoT applications such as digital twins. However, recent dynamic Gaussian Splatting methods significantly degrade when only sparse input views are available, limiting their applicability in practice. The issue arises from the incoherent learning of 4D geometry as input views decrease. This paper presents GC-4DGS, a novel framework that infuses geometric consistency into 4D Gaussian Splatting (4DGS), offering real-time and high-quality dynamic scene rendering from sparse input views. While learning-based Multi-View Stereo (MVS) and monocular depth estimators (MDEs) provide geometry priors, directly integrating these with 4DGS yields suboptimal results due to the ill-posed nature of sparse-input 4D geometric optimization. To address these problems, we introduce a dynamic consistency checking strategy to reduce estimation uncertainties of MVS across spacetime. Furthermore, we propose a global-local depth regularization approach to distill spatiotemporal-consistent geometric information from monocular depths, thereby enhancing the coherent geometry and appearance learning within the 4D volume. Extensive experiments on the popular N3DV and Technicolor datasets validate the effectiveness of GC-4DGS in rendering quality without sacrificing efficiency. Notably, our method outperforms RF-DeRF, the latest dynamic radiance field tailored for sparse-input dynamic view synthesis, and the original 4DGS by 2.62dB and 1.58dB in PSNR, respectively, with seamless deployability on resource-constrained IoT edge devices.
Related papers
- EVolSplat4D: Efficient Volume-based Gaussian Splatting for 4D Urban Scene Synthesis [43.898895514609286]
EvolSplat4D is a feed-forward framework that moves beyond existing per-pixel paradigms by unifying volume-based and pixel-based Gaussian prediction.<n>We show that EvolSplat4D reconstructs both static and dynamic environments with superior accuracy and consistency, outperforming both per-scene optimization and state-of-the-art feed-forward baselines.
arXiv Detail & Related papers (2026-01-22T13:39:29Z) - SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting [12.927009543427259]
3D Gaussian Splatting (3DGS) has emerged as a powerful explicit representation enabling real-time, high-fidelity 3D reconstruction and novel view synthesis.<n>However, its practical use is hindered by the massive memory and computational demands required to store and render millions of Gaussians.<n>This survey provides the first unified overview of efficient 3D and 4D Gaussian Splatting techniques.
arXiv Detail & Related papers (2025-12-08T06:15:59Z) - Endo-G$^{2}$T: Geometry-Guided & Temporally Aware Time-Embedded 4DGS For Endoscopic Scenes [3.1445273839174095]
Endo-G$2$T is a geometry-guided and temporally aware training scheme for time-embedded 4DGS.<n>It achieves state-of-the-art results among monocular reconstruction baselines in EndoNeRF and StereoMIS-P1 datasets.
arXiv Detail & Related papers (2025-11-26T13:12:21Z) - Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models [79.06910348413861]
We introduce Diff4Splat, a feed-forward method that synthesizes controllable and explicit 4D scenes from a single image.<n>Given a single input image, a camera trajectory, and an optional text prompt, Diff4Splat directly predicts a deformable 3D Gaussian field that encodes appearance, geometry, and motion.
arXiv Detail & Related papers (2025-11-01T11:16:25Z) - P-4DGS: Predictive 4D Gaussian Splatting with 90$\times$ Compression [26.130131551764077]
3D Gaussian Splatting (3DGS) has garnered significant attention due to its superior scene representation fidelity and real-time rendering performance.<n>Despite achieving promising results, most existing algorithms overlook the substantial temporal and spatial redundancies inherent in dynamic scenes.<n>We propose P-4DGS, a novel dynamic 3DGS representation for compact 4D scene modeling.
arXiv Detail & Related papers (2025-10-11T05:19:41Z) - 4D Driving Scene Generation With Stereo Forcing [62.47705572424127]
Current generative models struggle to synthesize dynamic 4D driving scenes that simultaneously support temporal extrapolation and spatial novel view synthesis (NVS) without per-scene optimization.<n>We present PhiGenesis, a unified framework for 4D scene generation that extends video generation techniques with geometric and temporal consistency.
arXiv Detail & Related papers (2025-09-24T15:37:17Z) - Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction [72.54905331756076]
We introduce Geo4D, a method to repurpose video diffusion models for monocular 3D reconstruction of dynamic scenes.<n>By leveraging the strong dynamic priors captured by large-scale pre-trained video models, Geo4D can be trained using only synthetic data.<n>We propose a new multi-modal alignment algorithm to align and fuse these modalities, as well as a sliding window approach at inference time.
arXiv Detail & Related papers (2025-04-10T17:59:55Z) - Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis [22.767866875051013]
We propose GS-Diff, a novel 3DGS framework guided by a multi-view diffusion model to address limitations of current methods.<n>By generating pseudo-observations conditioned on multi-view inputs, our method transforms under-constrained 3D reconstruction problems into well-posed ones.<n> Experiments on four benchmarks demonstrate that GS-Diff consistently outperforms state-of-the-art baselines by significant margins.
arXiv Detail & Related papers (2025-04-02T17:59:46Z) - DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation [50.01520547454224]
Current generative models struggle to synthesize 4D driving scenes that simultaneously support temporal extrapolation and spatial novel view synthesis (NVS)<n>We propose DiST-4D, which disentangles the problem into two diffusion processes: DiST-T, which predicts future metric depth and multi-view RGB sequences directly from past observations, and DiST-S, which enables spatial NVS by training only on existing viewpoints while enforcing cycle consistency.<n>Experiments demonstrate that DiST-4D achieves state-of-the-art performance in both temporal prediction and NVS tasks, while also delivering competitive performance in planning-related evaluations.
arXiv Detail & Related papers (2025-03-19T13:49:48Z) - 4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives [115.67081491747943]
Dynamic 3D scene representation and novel view synthesis are crucial for enabling AR/VR and metaverse applications.<n>We reformulate the reconstruction of a time-varying 3D scene as approximating its underlying 4D volume.<n>We derive several compact variants that effectively reduce the memory footprint to address its storage bottleneck.
arXiv Detail & Related papers (2024-12-30T05:30:26Z) - Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models [116.31344506738816]
We present a novel framework, textbfDiffusion4D, for efficient and scalable 4D content generation.
We develop a 4D-aware video diffusion model capable of synthesizing orbital views of dynamic 3D assets.
Our method surpasses prior state-of-the-art techniques in terms of generation efficiency and 4D geometry consistency.
arXiv Detail & Related papers (2024-05-26T17:47:34Z) - SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer [57.506654943449796]
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance.
Our method surpasses existing methods in both quality and efficiency.
We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.