STDR: Spatio-Temporal Decoupling for Real-Time Dynamic Scene Rendering
- URL: http://arxiv.org/abs/2505.22400v1
- Date: Wed, 28 May 2025 14:26:41 GMT
- Title: STDR: Spatio-Temporal Decoupling for Real-Time Dynamic Scene Rendering
- Authors: Zehao Li, Hao Jiang, Yujun Cai, Jianing Chen, Baolong Bi, Shuqin Gao, Honglong Zhao, Yiwei Wang, Tianlu Mao, Zhaoqi Wang
- Abstract summary: Existing 3DGS-based methods for dynamic reconstruction often suffer from spatio-temporal incoherence during initialization. We propose STDR (Spatio-Temporal Decoupling for Real-time rendering), a plug-and-play module that learns spatio-temporal probability distributions for each Gaussian.
- Score: 15.873329633980015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although dynamic scene reconstruction has long been a fundamental challenge in 3D vision, the recent emergence of 3D Gaussian Splatting (3DGS) offers a promising direction by enabling high-quality, real-time rendering through explicit Gaussian primitives. However, existing 3DGS-based methods for dynamic reconstruction often suffer from spatio-temporal incoherence during initialization, where canonical Gaussians are constructed by aggregating observations from multiple frames without temporal distinction. This results in spatio-temporally entangled representations, making it difficult to model dynamic motion accurately. To overcome this limitation, we propose STDR (Spatio-Temporal Decoupling for Real-time rendering), a plug-and-play module that learns spatio-temporal probability distributions for each Gaussian. STDR introduces a spatio-temporal mask, a separated deformation field, and a consistency regularization to jointly disentangle spatial and temporal patterns. Extensive experiments demonstrate that incorporating our module into existing 3DGS-based dynamic scene reconstruction frameworks leads to notable improvements in both reconstruction quality and spatio-temporal consistency across synthetic and real-world benchmarks.
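The abstract names three components: a per-Gaussian spatio-temporal mask, a separated deformation field, and a consistency regularization. No code is included in this listing, so the PyTorch-style sketch below is only one plausible reading of how such a plug-and-play module could be wired up; the class, parameter, and function names (STDRModule, temporal_logits, spatial_mlp, temporal_mlp, consistency_loss) are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class STDRModule(nn.Module):
    """Illustrative sketch (not the authors' code): a per-Gaussian temporal
    probability mask plus separated spatial/temporal deformation heads."""

    def __init__(self, num_gaussians, num_time_bins, hidden=64):
        super().__init__()
        # Per-Gaussian logits over time bins -> a soft "temporal probability" mask.
        self.temporal_logits = nn.Parameter(torch.zeros(num_gaussians, num_time_bins))
        # Separated deformation fields: one conditioned on position only (spatial),
        # one conditioned on position and time (temporal).
        self.spatial_mlp = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(), nn.Linear(hidden, 3))
        self.temporal_mlp = nn.Sequential(nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 3))

    def forward(self, xyz, t):
        # xyz: (N, 3) canonical Gaussian centers; t: scalar normalized time in [0, 1].
        probs = torch.softmax(self.temporal_logits, dim=-1)      # (N, T)
        t_idx = int(round(t * (probs.shape[1] - 1)))
        mask = probs[:, t_idx].unsqueeze(-1)                     # (N, 1) activity at time t
        t_col = torch.full_like(xyz[:, :1], float(t))
        delta = self.spatial_mlp(xyz) + mask * self.temporal_mlp(torch.cat([xyz, t_col], dim=-1))
        return xyz + delta, probs

def consistency_loss(probs, neighbor_idx):
    # Encourage spatially neighboring Gaussians to share similar temporal profiles;
    # one plausible reading of the "consistency regularization", purely an assumption.
    # probs: (N, T) temporal distributions; neighbor_idx: (N, K) neighbor indices.
    return (probs.unsqueeze(1) - probs[neighbor_idx]).abs().mean()
```

The deformed centers and the temporal mask would then be handed to whatever differentiable Gaussian rasterizer the host 3DGS framework already uses, which is what makes the module "plug-and-play" in spirit.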
Related papers
- HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene [11.906835503107189]
We propose HAIF-GS, a unified framework that enables structured and consistent dynamic modeling through sparse anchor-driven deformation. We show that HAIF-GS significantly outperforms prior dynamic 3DGS methods in rendering quality, temporal coherence, and reconstruction efficiency.
arXiv Detail & Related papers (2025-06-11T08:45:08Z)
- FreeTimeGS: Free Gaussian Primitives at Anytime and Anywhere for Dynamic Scene Reconstruction [64.30050475414947]
FreeTimeGS is a novel 4D representation that allows Gaussian primitives to appear at arbitrary times and locations. Our representation possesses strong flexibility, improving the ability to model dynamic 3D scenes. Experimental results on several datasets show that the rendering quality of our method outperforms recent methods by a large margin.
arXiv Detail & Related papers (2025-06-05T17:59:57Z)
- SHaDe: Compact and Consistent Dynamic 3D Reconstruction via Tri-Plane Deformation and Latent Diffusion [0.0]
We present a novel framework for dynamic 3D scene reconstruction that integrates three key components: an explicit tri-plane deformation field, a view-conditioned canonical field with spherical harmonics (SH) attention, and a temporally-aware latent diffusion prior. Our method encodes 4D scenes using three 2D feature planes that evolve over time, enabling an efficient, compact representation.
arXiv Detail & Related papers (2025-05-22T11:25:38Z)
- EvolvingGS: High-Fidelity Streamable Volumetric Video via Evolving 3D Gaussian Representation [14.402479944396665]
We introduce EvolvingGS, a two-stage strategy that first deforms the Gaussian model to align with the target frame, and then refines it with minimal point addition/subtraction. Owing to the flexibility of the incrementally evolving representation, our method outperforms existing approaches in terms of both per-frame and temporal quality metrics. Our method significantly advances the state of the art in dynamic scene reconstruction, particularly for extended sequences with complex human performances.
arXiv Detail & Related papers (2025-03-07T06:01:07Z)
- UrbanGS: Semantic-Guided Gaussian Splatting for Urban Scene Reconstruction [86.4386398262018]
UrbanGS uses 2D semantic maps and an existing dynamic Gaussian approach to distinguish static objects from the scene. For potentially dynamic objects, we aggregate temporal information using learnable time embeddings. Our approach outperforms state-of-the-art methods in reconstruction quality and efficiency.
arXiv Detail & Related papers (2024-12-04T16:59:49Z)
- T-3DGS: Removing Transient Objects for 3D Scene Reconstruction [83.05271859398779]
Transient objects in video sequences can significantly degrade the quality of 3D scene reconstructions. We propose T-3DGS, a novel framework that robustly filters out transient distractors during 3D reconstruction using Gaussian Splatting.
arXiv Detail & Related papers (2024-11-29T07:45:24Z)
- Event-boosted Deformable 3D Gaussians for Dynamic Scene Reconstruction [50.873820265165975]
We introduce the first approach combining event cameras, which capture high-temporal-resolution, continuous motion data, with deformable 3D-GS for dynamic scene reconstruction. We propose a GS-Threshold Joint Modeling strategy, creating a mutually reinforcing process that greatly improves both 3D reconstruction and threshold modeling. We contribute the first event-inclusive 4D benchmark with synthetic and real-world dynamic scenes, on which our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-11-25T08:23:38Z)
- GAST: Sequential Gaussian Avatars with Hierarchical Spatio-temporal Context [7.6736633105043515]
3D human avatars, through the use of canonical radiance fields and per-frame observed warping, enable high-fidelity rendering and animation.
Existing methods, which rely on either spatial SMPL(-X) poses or temporal embeddings, respectively suffer from coarse quality or limited animation flexibility.
We propose GAST, a framework that unifies 3D human modeling with 3DGS by hierarchically integrating both spatial and temporal information.
arXiv Detail & Related papers (2024-11-25T04:05:19Z)
- DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes [71.61083731844282]
We present DeSiRe-GS, a self-supervised Gaussian splatting representation. It enables effective static-dynamic decomposition and high-fidelity surface reconstruction in complex driving scenarios.
arXiv Detail & Related papers (2024-11-18T05:49:16Z)
- DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering [39.02667348492635]
3D Gaussian Splatting (3DGS) has garnered researchers' attention due to its outstanding rendering quality and real-time speed.
We propose the Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering (DN-4DGS).
Specifically, a Noise Suppression Strategy is introduced to change the distribution of the coordinates of the canonical 3D Gaussians and suppress noise.
arXiv Detail & Related papers (2024-10-17T14:43:07Z)
- Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction [29.83056271799794]
Implicit neural representation has paved the way for new approaches to dynamic scene reconstruction and rendering.
We propose a deformable 3D Gaussian Splatting method that reconstructs scenes using 3D Gaussians and learns them in canonical space; a minimal sketch of this canonical-plus-deformation design appears after this list.
Through a differential Gaussian rasterizer, the deformable 3D Gaussians not only achieve higher rendering quality but also real-time rendering speed.
arXiv Detail & Related papers (2023-09-22T16:04:02Z)
- SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes [75.9110646062442]
We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner.
Our method takes multi-view RGB videos and background images from static cameras with known camera parameters as input.
We show experimentally that, unlike prior work that only handles small motion, our method enables the reconstruction of studio-scale motions.
arXiv Detail & Related papers (2023-08-16T09:50:35Z)
- SST: Real-time End-to-end Monocular 3D Reconstruction via Sparse Spatial-Temporal Guidance [71.3027345302485]
Real-time monocular 3D reconstruction is a challenging problem that remains unsolved.
We propose an end-to-end 3D reconstruction network, SST, which utilizes sparse estimated points from a visual SLAM system.
SST outperforms all state-of-the-art competitors, whilst keeping a high inference speed at 59 FPS.
arXiv Detail & Related papers (2022-12-13T12:17:13Z)
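Several of the dynamic-3DGS papers above (most explicitly the Deformable 3D Gaussians entry) share the same canonical-space pattern: keep one set of canonical Gaussians and learn a deformation field that warps them to each time step. The sketch below is a generic, hedged illustration of that pattern under assumed names (DeformField, hidden sizes, output layout); it is not any specific paper's released code.

```python
import torch
import torch.nn as nn

class DeformField(nn.Module):
    """Generic canonical-space deformation field (illustrative only): maps a
    canonical Gaussian center plus a time value to position/rotation/scale offsets."""

    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3 + 4 + 3),  # delta position, delta rotation (quaternion), delta scale
        )

    def forward(self, xyz, t):
        # xyz: (N, 3) canonical centers; t: scalar normalized time in [0, 1].
        t_col = torch.full_like(xyz[:, :1], float(t))
        out = self.net(torch.cat([xyz, t_col], dim=-1))
        d_xyz, d_rot, d_scale = out.split([3, 4, 3], dim=-1)
        return d_xyz, d_rot, d_scale

# Usage: deform the canonical Gaussians to time t, then pass the deformed
# attributes to whatever differentiable rasterizer the framework already uses.
field = DeformField()
canonical_xyz = torch.randn(1000, 3)
d_xyz, d_rot, d_scale = field(canonical_xyz, t=0.5)
deformed_xyz = canonical_xyz + d_xyz
```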