STD-GS: Exploring Frame-Event Interaction for SpatioTemporal-Disentangled Gaussian Splatting to Reconstruct High-Dynamic Scene
- URL: http://arxiv.org/abs/2506.23157v1
- Date: Sun, 29 Jun 2025 09:32:06 GMT
- Title: STD-GS: Exploring Frame-Event Interaction for SpatioTemporal-Disentangled Gaussian Splatting to Reconstruct High-Dynamic Scene
- Authors: Hanyu Zhou, Haonan Wang, Haoyue Liu, Yuxing Duan, Luxin Yan, Gim Hee Lee
- Abstract summary: Existing methods adopt a unified representation model (e.g., Gaussian) to directly match the spatiotemporal features of a dynamic scene from a frame camera. However, this unified paradigm fails on the potentially discontinuous temporal features of objects caused by frame imaging and on the heterogeneous spatial features between background and objects. In this work, we introduce an event camera to compensate for the frame camera and propose a spatiotemporal-disentangled Gaussian splatting framework for high-dynamic scene reconstruction.
- Score: 54.418259038624406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-dynamic scene reconstruction aims to represent the static background with rigid spatial features and dynamic objects with deformed, continuous spatiotemporal features. Typically, existing methods adopt a unified representation model (e.g., Gaussian) to directly match the spatiotemporal features of a dynamic scene from a frame camera. However, this unified paradigm fails on the potentially discontinuous temporal features of objects caused by frame imaging and on the heterogeneous spatial features between background and objects. To address this issue, we disentangle the spatiotemporal features into various latent representations to alleviate the spatiotemporal mismatch between background and objects. In this work, we introduce an event camera to compensate for the frame camera and propose a spatiotemporal-disentangled Gaussian splatting framework for high-dynamic scene reconstruction. For the dynamic scene, we observe that background and objects show an appearance discrepancy in frame-based spatial features and a motion discrepancy in event-based temporal features, which motivates us to distinguish the spatiotemporal features of background and objects via clustering. For dynamic objects, we find that Gaussian representations and event data share a consistent spatiotemporal characteristic, which can serve as a prior to guide the spatiotemporal disentanglement of object Gaussians. Within the Gaussian splatting framework, the cumulative scene-object disentanglement improves the spatiotemporal discrimination between background and objects and renders the time-continuous dynamic scene. Extensive experiments verify the superiority of the proposed method.
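The listing carries no reference implementation, so the snippet below is only a minimal sketch of the clustering-based scene-object disentanglement described in the abstract: per-Gaussian appearance features from frames are combined with a per-Gaussian event-activity cue and clustered into background and dynamic-object groups. The feature definitions, the `disentangle_gaussians` helper, and the use of k-means are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the scene-object disentanglement idea: cluster Gaussians
# into "background" vs. "dynamic object" groups using frame-based appearance
# features together with event-based temporal (motion) features.
# All shapes, feature choices, and the use of k-means are illustrative
# assumptions, not the authors' implementation.
import numpy as np
from sklearn.cluster import KMeans

def disentangle_gaussians(appearance_feat, event_activity, n_clusters=2):
    """appearance_feat: (N, D) per-Gaussian features from frame images.
    event_activity:  (N,)   per-Gaussian event rate (events accumulated
                             around the Gaussian's projected location).
    Returns a boolean mask marking Gaussians assigned to the dynamic cluster.
    """
    # Normalize both cues so neither dominates the distance metric.
    a = (appearance_feat - appearance_feat.mean(0)) / (appearance_feat.std(0) + 1e-8)
    e = (event_activity - event_activity.mean()) / (event_activity.std() + 1e-8)
    feats = np.concatenate([a, e[:, None]], axis=1)

    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(feats)

    # Heuristic: the cluster with higher mean event activity is treated as
    # the dynamic-object cluster; the remainder is static background.
    dynamic_label = max(range(n_clusters), key=lambda c: event_activity[labels == c].mean())
    return labels == dynamic_label

# Toy usage with random features standing in for real per-Gaussian cues.
rng = np.random.default_rng(0)
mask = disentangle_gaussians(rng.normal(size=(1000, 16)), rng.random(1000))
print("dynamic Gaussians:", int(mask.sum()))
```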
Related papers
- SplitGaussian: Reconstructing Dynamic Scenes via Visual Geometry Decomposition [14.381223353489062]
We propose SplitGaussian, a novel framework that explicitly decomposes scene representations into static and dynamic components. SplitGaussian outperforms prior state-of-the-art methods in rendering quality, geometric stability, and motion separation.
arXiv Detail & Related papers (2025-08-06T09:00:13Z) - FreeTimeGS: Free Gaussian Primitives at Anytime and Anywhere for Dynamic Scene Reconstruction [64.30050475414947]
FreeTimeGS is a novel 4D representation that allows Gaussian primitives to appear at arbitrary times and locations. Our representation possesses strong flexibility, thus improving the ability to model dynamic 3D scenes. Experimental results on several datasets show that the rendering quality of our method outperforms recent methods by a large margin.
arXiv Detail & Related papers (2025-06-05T17:59:57Z) - STDR: Spatio-Temporal Decoupling for Real-Time Dynamic Scene Rendering [15.873329633980015]
Existing 3DGS-based methods for dynamic reconstruction often suffer from spatio-temporal coupling. We propose STDR (Spatio-Temporal Decoupling for Real-time rendering), a plug-and-play module that learns spatio-temporal probability distributions for each scene.
arXiv Detail & Related papers (2025-05-28T14:26:41Z) - Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better [61.381599921020175]
Temporal consistency is critical in video prediction to ensure that outputs are coherent and free of artifacts. Traditional methods, such as temporal attention and 3D convolution, may struggle with significant object motion. We propose the Tracktention Layer, a novel architectural component that explicitly integrates motion information using point tracks.
arXiv Detail & Related papers (2025-03-25T17:58:48Z) - Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories [28.701879490459675]
We aim to learn an implicit motion field parameterized by a neural network to predict the movement of novel points within the same domain.
We exploit the intrinsic regularization provided by SIREN and modify the input layer to produce a spatiotemporally smooth motion field (a minimal SIREN motion-field sketch appears after this list).
Our experiments assess the model's performance in predicting unseen point trajectories and its application in temporal mesh alignment with deformation.
arXiv Detail & Related papers (2024-06-05T21:02:10Z) - Periodic Vibration Gaussian: Dynamic Urban Scene Reconstruction and Real-time Rendering [36.111845416439095]
We present a unified representation model, called Periodic Vibration Gaussian (PVG).
PVG builds upon the efficient 3D Gaussian splatting technique, originally designed for static scene representation.
PVG exhibits a 900-fold acceleration in rendering over the best alternative (a minimal sketch of the periodic-vibration idea appears after this list).
arXiv Detail & Related papers (2023-11-30T13:53:50Z) - SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes [75.9110646062442]
We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner.
Our method takes multi-view RGB videos and background images from static cameras with known camera parameters as input.
We show experimentally that, unlike prior work that only handles small motion, our method enables the reconstruction of studio-scale motions.
arXiv Detail & Related papers (2023-08-16T09:50:35Z) - DynaVol: Unsupervised Learning for Dynamic Scenes through Object-Centric
Voxelization [67.85434518679382]
We present DynaVol, a 3D scene generative model that unifies geometric structures and object-centric learning.
The key idea is to perform object-centric voxelization to capture the 3D nature of the scene.
Voxel features evolve over time through a canonical-space deformation function, forming the basis for global representation learning.
arXiv Detail & Related papers (2023-04-30T05:29:28Z) - Dynamic Scene Novel View Synthesis via Deferred Spatio-temporal
Consistency [18.036582072609882]
Structure from motion (SfM) and novel view synthesis (NVS) are presented.
SfM produces noisy and temporally sparse reconstructed point clouds, resulting in NVS with temporally inconsistent effects.
We demonstrate our algorithm on real-world dynamic scenes against classic and more recent learning-based baseline approaches.
arXiv Detail & Related papers (2021-09-02T15:29:45Z) - Event-based Motion Segmentation with Spatio-Temporal Graph Cuts [51.17064599766138]
We have developed a method to identify independently moving objects in scenes acquired with an event-based camera.
The method performs on par or better than the state of the art without having to predetermine the number of expected moving objects.
arXiv Detail & Related papers (2020-12-16T04:06:02Z) - CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations [72.4716073597902]
We propose a method to learn object-centric Canonical Spatiotemporal Point Cloud Representations of dynamically moving or evolving objects.
We demonstrate the effectiveness of our method on several applications including shape reconstruction, camera pose estimation, continuous spatiotemporal sequence reconstruction, and correspondence estimation.
arXiv Detail & Related papers (2020-08-06T17:58:48Z)
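Referenced from the "Degrees of Freedom Matter" entry above: the sketch below shows one way an implicit SIREN motion field over (x, y, z, t) could look, mapping a point and a timestamp to a displacement. The sine activations and the weight initialization follow the standard SIREN recipe; the layer sizes, the `MotionField` name, and the displacement output are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of a SIREN-style implicit motion field:
# (x, y, z, t) -> (dx, dy, dz) displacement of the point at time t.
import math
import torch
import torch.nn as nn

class SirenLayer(nn.Module):
    def __init__(self, in_dim, out_dim, omega_0=30.0, first=False):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = nn.Linear(in_dim, out_dim)
        with torch.no_grad():
            # Standard SIREN weight initialization.
            bound = 1.0 / in_dim if first else math.sqrt(6.0 / in_dim) / omega_0
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))

class MotionField(nn.Module):
    """Illustrative motion field: hidden sizes and depth are assumptions."""
    def __init__(self, hidden=128, layers=3):
        super().__init__()
        dims = [4] + [hidden] * layers
        self.net = nn.Sequential(
            *[SirenLayer(dims[i], dims[i + 1], first=(i == 0)) for i in range(layers)],
            nn.Linear(hidden, 3),
        )

    def forward(self, xyz, t):
        return self.net(torch.cat([xyz, t], dim=-1))

# Toy usage: displacements for 8 random points at t = 0.5.
field = MotionField()
print(field(torch.rand(8, 3), torch.full((8, 1), 0.5)).shape)  # torch.Size([8, 3])
```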
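Referenced from the "Periodic Vibration Gaussian" entry above: a minimal sketch of the periodic-vibration idea, in which each Gaussian keeps a static mean plus a time-dependent periodic offset and a temporal opacity window, so one representation covers both static and transient content. The exact parameterization and the `pvg_state` helper are illustrative assumptions rather than the paper's precise formulation.

```python
# Minimal numpy sketch of a periodic-vibration Gaussian state at time t.
import numpy as np

def pvg_state(mean, amplitude, life_peak, period, opacity, lifespan, t):
    """Return the time-varying position and opacity of one Gaussian at time t.

    mean, amplitude: (3,) static center and vibration amplitude
    life_peak:       time at which the Gaussian is most visible
    period:          vibration period
    opacity:         peak opacity
    lifespan:        temporal extent of the opacity window
    """
    # Center vibrates periodically around the static mean.
    position = mean + amplitude * np.sin(2.0 * np.pi * (t - life_peak) / period)
    # Visibility decays away from the Gaussian's life peak.
    visibility = opacity * np.exp(-0.5 * ((t - life_peak) / lifespan) ** 2)
    return position, visibility

# A Gaussian centered at the origin, vibrating along x, most visible at t = 0.5.
pos, vis = pvg_state(np.zeros(3), np.array([0.1, 0.0, 0.0]),
                     life_peak=0.5, period=0.2, opacity=0.9, lifespan=0.3, t=0.6)
print(pos, round(float(vis), 3))
```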