Cross-Temporal 3D Gaussian Splatting for Sparse-View Guided Scene Update
- URL: http://arxiv.org/abs/2512.00534v1
- Date: Sat, 29 Nov 2025 16:00:24 GMT
- Title: Cross-Temporal 3D Gaussian Splatting for Sparse-View Guided Scene Update
- Authors: Zeyuan An, Yanghang Xiao, Zhiying Leng, Frederick W. B. Li, Xiaohui Liang
- Abstract summary: Updating 3D scenes from sparse-view observations is crucial for various real-world applications. We propose Cross-Temporal 3D Gaussian Splatting (Cross-Temporal 3DGS), a novel framework for efficiently reconstructing and updating 3D scenes. Experimental results show significant improvements over baseline methods in reconstruction quality and data efficiency.
- Score: 17.581193784542357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Maintaining consistent 3D scene representations over time is a significant challenge in computer vision. Updating 3D scenes from sparse-view observations is crucial for various real-world applications, including urban planning, disaster assessment, and historical site preservation, where dense scans are often unavailable or impractical. In this paper, we propose Cross-Temporal 3D Gaussian Splatting (Cross-Temporal 3DGS), a novel framework for efficiently reconstructing and updating 3D scenes across different time periods, using sparse images and previously captured scene priors. Our approach comprises three stages: 1) Cross-temporal camera alignment for estimating and aligning camera poses across different timestamps; 2) Interference-based confidence initialization to identify unchanged regions between timestamps, thereby guiding updates; and 3) Progressive cross-temporal optimization, which iteratively integrates historical prior information into the 3D scene to enhance reconstruction quality. Our method supports non-continuous capture, enabling not only updating existing scenes with new sparse views, but also recovering past scenes from limited data with the help of current captures. Furthermore, we demonstrate the potential of this approach to archive temporal changes using only sparse images, which can later be reconstructed into detailed 3D representations as needed. Experimental results show significant improvements over baseline methods in reconstruction quality and data efficiency, making this approach a promising solution for scene versioning, cross-temporal digital twins, and long-term spatial documentation.
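To make the three-stage pipeline concrete, the following minimal Python skeleton sketches how the stages could fit together. Every name and data structure here is an illustrative assumption for exposition, not the authors' actual code or API.

```python
# Illustrative skeleton of the three-stage Cross-Temporal 3DGS pipeline
# described in the abstract. All names are assumptions, not the authors' API.

from dataclasses import dataclass, field

@dataclass
class GaussianScene:
    """Stand-in for a 3DGS model: Gaussian parameters plus per-Gaussian confidence."""
    gaussians: list = field(default_factory=list)
    confidence: list = field(default_factory=list)

def align_cameras(prior_scene, new_images):
    """Stage 1: estimate poses for the new sparse views and register them in the
    coordinate frame of the previously captured scene."""
    return [{"image": img, "pose": None} for img in new_images]  # stub poses

def init_confidence(prior_scene, posed_views):
    """Stage 2: compare renders of the prior against the new views; regions that
    agree are marked high-confidence (unchanged), the rest are flagged for update."""
    prior_scene.confidence = [1.0 for _ in prior_scene.gaussians]  # stub: all unchanged
    return prior_scene

def progressive_optimize(scene, posed_views, iterations=3):
    """Stage 3: iteratively re-optimize low-confidence regions against the sparse
    new views while the historical prior regularizes unchanged geometry."""
    for _ in range(iterations):
        pass  # photometric loss + densify/prune would run here
    return scene

def cross_temporal_update(prior_scene, new_images):
    posed = align_cameras(prior_scene, new_images)
    scene = init_confidence(prior_scene, posed)
    return progressive_optimize(scene, posed)
```

In a real implementation each stub would wrap standard 3DGS machinery: pose estimation against renders of the prior for stage 1, render-versus-observation comparison for stage 2, and confidence-weighted photometric optimization for stage 3.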
Related papers
- DePT3R: Joint Dense Point Tracking and 3D Reconstruction of Dynamic Scenes in a Single Forward Pass [2.0487171253259104]
DePT3R is a novel framework that simultaneously performs dense point tracking and 3D reconstruction of dynamic scenes from multiple images. We validate DePT3R on several challenging benchmarks involving dynamic scenes, demonstrating strong performance and significant improvements in memory efficiency.
arXiv Detail & Related papers (2025-12-15T09:21:28Z) - LONG3R: Long Sequence Streaming 3D Reconstruction [29.79885827038617]
LONG3R is a novel model designed for streaming multi-view 3D scene reconstruction over longer sequences. Our model achieves real-time processing by operating recurrently, maintaining and updating memory with each new observation. Experiments demonstrate that LONG3R outperforms state-of-the-art streaming methods, particularly for longer sequences.
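The recurrent, memory-based streaming pattern this summary describes can be sketched in a few lines; the encoder, fusion rule, and eviction policy below are toy assumptions, not LONG3R's architecture.

```python
# Minimal sketch of streaming reconstruction with a bounded, recurrently
# updated memory: each frame both reads from and writes to the memory.

def encode(frame):
    """Stand-in image encoder: returns a list of feature values."""
    return [float(x) for x in frame]

def streaming_reconstruct(frames, memory_size=4):
    memory = []           # bounded memory keeps per-frame cost constant
    outputs = []
    for frame in frames:  # recurrent loop: one forward pass per observation
        tokens = encode(frame)
        pred = sum(tokens) + sum(sum(m) for m in memory)  # read memory (stub fusion)
        outputs.append(pred)                              # e.g. pointmap / pose output
        memory.append(tokens)                             # write the new observation
        if len(memory) > memory_size:                     # evict oldest state so
            memory.pop(0)                                 # long sequences stay real-time
    return outputs

print(streaming_reconstruct([[1.0, 2.0], [3.0], [4.0, 5.0]]))
```

The bounded memory is the key design point: per-frame cost stays constant regardless of sequence length, which is what makes long-sequence streaming feasible in real time.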
arXiv Detail & Related papers (2025-07-24T09:55:20Z) - GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering [54.489285024494855]
Video stabilization is pivotal for video processing, as it removes unwanted shakiness while preserving the original user motion intent. Existing approaches, depending on the domain in which they operate, suffer from several issues that degrade the user experience. We introduce GaVS, a novel 3D-grounded approach that reformulates video stabilization as a 'temporally-consistent local reconstruction and rendering' paradigm.
arXiv Detail & Related papers (2025-06-30T15:24:27Z) - PE3R: Perception-Efficient 3D Reconstruction [54.730257992806116]
Perception-Efficient 3D Reconstruction (PE3R) is a novel framework designed to enhance both accuracy and efficiency. The framework achieves a minimum 9-fold speedup in 3D semantic field reconstruction, along with substantial gains in perception accuracy and reconstruction precision.
arXiv Detail & Related papers (2025-03-10T16:29:10Z) - Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting [5.8452477457633485]
Existing methods have various limitations, such as requiring precise camera poses as input and dense viewpoints for supervision. We propose GraphGS, a novel graph-guided 3D scene reconstruction framework. We demonstrate that GraphGS achieves high-fidelity 3D reconstruction from images, delivering state-of-the-art performance in quantitative and qualitative evaluations across multiple datasets.
arXiv Detail & Related papers (2025-02-24T17:59:08Z) - EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting [87.1077910795879]
Event cameras, inspired by biological vision, record pixel-wise intensity changes asynchronously with high temporal resolution. We propose Event-Aided Free-Trajectory 3DGS, which seamlessly integrates the advantages of event cameras into 3DGS. We evaluate our method on the public Tanks and Temples benchmark and a newly collected real-world dataset, RealEv-DAVIS.
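For readers unfamiliar with event cameras, the first sentence refers to the standard event-generation model: a pixel fires an event whenever its log intensity drifts by a contrast threshold C since its last event. The sketch below implements that textbook model; it is not EF-3DGS-specific code.

```python
# Standard event-camera generation model: per-pixel log-intensity change
# crossing a contrast threshold C emits a signed (polarity) event.

import math

def events_from_intensity(samples, C=0.2):
    """samples: list of (timestamp, intensity) for one pixel.
    Returns a list of (timestamp, polarity) events."""
    events = []
    ref = math.log(samples[0][1])       # reference log intensity at last event
    for t, intensity in samples[1:]:
        log_i = math.log(intensity)
        while abs(log_i - ref) >= C:    # may fire several events per sample
            polarity = 1 if log_i > ref else -1
            ref += polarity * C         # advance reference by one threshold step
            events.append((t, polarity))
    return events

# Brightening then dimming pixel: two positive events, then three negative.
print(events_from_intensity([(0, 1.0), (1, 1.5), (2, 0.8)]))
```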
arXiv Detail & Related papers (2024-10-20T13:44:24Z) - VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction [59.40711222096875]
We present VastGaussian, the first method for high-quality reconstruction and real-time rendering on large scenes based on 3D Gaussian Splatting.
Our approach outperforms existing NeRF-based methods and achieves state-of-the-art results on multiple large scene datasets.
arXiv Detail & Related papers (2024-02-27T11:40:50Z) - Nothing Stands Still: A Spatiotemporal Benchmark on 3D Point Cloud Registration Under Large Geometric and Temporal Change [82.31647863785923]
Building 3D geometric maps of man-made spaces is a fundamental task in computer vision and robotics. The Nothing Stands Still (NSS) benchmark focuses on the spatiotemporal registration of 3D scenes undergoing large spatial and temporal change. As part of NSS, we introduce a dataset of 3D point clouds recurrently captured in large-scale indoor building environments that are under construction or renovation.
arXiv Detail & Related papers (2023-11-15T20:09:29Z) - Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography [54.36608424943729]
We show that in a "long-burst", forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
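A minimal sketch of that test-time optimization pattern, assuming PyTorch and a deliberately toy forward model: depth and per-frame camera motion are optimized jointly against a photometric loss. The shapes, the loss, and the placeholder "rendering" are all assumptions, not the paper's actual model.

```python
# Test-time optimization sketch: jointly fit scene depth and per-frame motion
# to a burst of frames by gradient descent on a photometric loss.

import torch

frames = torch.rand(5, 16, 16)                          # toy stand-in for a RAW burst
depth = torch.full((16, 16), 1.0, requires_grad=True)   # scene depth (optimized)
motion = torch.zeros(5, 2, requires_grad=True)          # per-frame 2D shift (optimized)

opt = torch.optim.Adam([depth, motion], lr=1e-2)
for step in range(100):                                 # the test-time loop
    loss = 0.0
    for i in range(frames.shape[0]):
        # Toy forward model: apparent shift scales with motion over mean depth,
        # echoing how parallax couples camera motion and scene depth.
        shift = motion[i] / depth.mean()
        pred = frames[0] + shift.sum()                  # stand-in for warped rendering
        loss = loss + ((pred - frames[i]) ** 2).mean()  # photometric consistency
    opt.zero_grad()
    loss.backward()
    opt.step()
```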
arXiv Detail & Related papers (2022-12-22T18:54:34Z) - SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion [86.77318031029404]
We propose a framework that performs scene reconstruction and semantic scene completion jointly in an incremental and real-time manner.
Our framework relies on a novel neural architecture designed to process occupancy maps and leverages voxel states to accurately and efficiently fuse semantic completion with the 3D global model.
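The kind of incremental voxel fusion this summary alludes to can be sketched with the standard log-odds occupancy update plus per-voxel semantic votes; this is a generic scheme chosen for illustration, not SCFusion's exact fusion rule.

```python
# Generic incremental voxel fusion: occupancy as log-odds, semantics as votes.

import math
from collections import defaultdict

L_OCC = math.log(0.7 / 0.3)    # log-odds increment for an "occupied" observation
L_FREE = math.log(0.3 / 0.7)   # log-odds decrement for a "free" observation

occupancy = defaultdict(float)                     # voxel -> log-odds of occupancy
semantics = defaultdict(lambda: defaultdict(int))  # voxel -> class -> vote count

def integrate(voxel, occupied, label=None):
    """Incrementally fuse one observation of one voxel into the global model."""
    occupancy[voxel] += L_OCC if occupied else L_FREE
    if occupied and label is not None:
        semantics[voxel][label] += 1               # majority vote over time

def query(voxel):
    """Return (occupancy probability, most-voted semantic label)."""
    p = 1.0 / (1.0 + math.exp(-occupancy[voxel]))
    votes = semantics[voxel]
    label = max(votes, key=votes.get) if votes else None
    return p, label

integrate((0, 0, 0), True, "chair")
integrate((0, 0, 0), True, "chair")
print(query((0, 0, 0)))
```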
arXiv Detail & Related papers (2020-10-26T15:31:52Z) - A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video [7.647599484103065]
We improve the learning of constraints in the human skeleton by modeling local and global spatial information via attention mechanisms.
Our approach effectively mitigates depth ambiguity and self-occlusion, generalizes to half upper body estimation, and achieves competitive performance on 2D-to-3D video pose estimation.
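To illustrate the local/global attention idea in this summary, here is a minimal joint-attention sketch in NumPy: masking the attention matrix with the skeleton adjacency gives the local variant, leaving it unmasked gives the global one. All shapes and the toy adjacency are assumptions.

```python
# Scaled dot-product attention over skeleton joints, with an optional
# adjacency mask that restricts attention to connected (local) joints.

import numpy as np

def joint_attention(X, adjacency=None):
    """X: (J, d) joint features. adjacency: optional (J, J) 0/1 mask.
    Returns (J, d) features mixed by attention weights."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)               # (J, J) pairwise affinities
    if adjacency is not None:                   # local attention: restrict to bones
        scores = np.where(adjacency > 0, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ X

J, d = 17, 8                                    # e.g. 17 COCO-style joints
X = np.random.rand(J, d)
A = np.eye(J)                                   # toy adjacency: self-loops only
local_out = joint_attention(X, A)               # local spatial information
global_out = joint_attention(X)                 # global spatial information
```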
arXiv Detail & Related papers (2020-03-11T14:54:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.