Dynamic Gaussian Scene Reconstruction from Unsynchronized Videos
- URL: http://arxiv.org/abs/2511.11175v1
- Date: Fri, 14 Nov 2025 11:20:43 GMT
- Title: Dynamic Gaussian Scene Reconstruction from Unsynchronized Videos
- Authors: Zhixin Xu, Hengyu Zhou, Yuan Liu, Wenhan Xue, Hao Pan, Wenping Wang, Bin Wang,
- Abstract summary: Multi-view video reconstruction plays a vital role in computer vision, enabling applications in film production, virtual reality, and motion analysis. We propose a novel temporal alignment strategy for high-quality 4DGS reconstruction from unsynchronized multi-view videos.
- Score: 31.54046494140498
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multi-view video reconstruction plays a vital role in computer vision, enabling applications in film production, virtual reality, and motion analysis. While recent advances such as 4D Gaussian Splatting (4DGS) have demonstrated impressive capabilities in dynamic scene reconstruction, they typically rely on the assumption that input video streams are temporally synchronized. In real-world scenarios, however, this assumption often fails due to factors like camera trigger delays or independent recording setups, leading to temporal misalignment across views and reduced reconstruction quality. To address this challenge, we propose a novel temporal alignment strategy for high-quality 4DGS reconstruction from unsynchronized multi-view videos. Our method features a coarse-to-fine alignment module that estimates and compensates for each camera's time shift: it first determines a coarse, frame-level offset and then refines it to sub-frame accuracy. The strategy is designed as a plug-in module that integrates readily into existing 4DGS frameworks, enhancing their robustness when handling asynchronous data. Experiments show that our approach effectively processes temporally misaligned videos and significantly enhances baseline methods.
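The coarse-to-fine idea in the abstract (estimate a frame-level offset per camera, then refine to sub-frame precision) can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the choice of a per-frame motion-energy signal, the cross-correlation against a reference camera, and the parabolic peak interpolation for sub-frame refinement are all assumptions made for this sketch, as are the function names.

```python
import numpy as np

def motion_energy(frames: np.ndarray) -> np.ndarray:
    """Mean absolute frame difference as a 1D alignment signal.
    `frames` has shape (T, H, W); the result has shape (T - 1,)."""
    return np.abs(np.diff(frames.astype(np.float64), axis=0)).mean(axis=(1, 2))

def _xcorr(ref_sig: np.ndarray, cam_sig: np.ndarray) -> np.ndarray:
    """Full cross-correlation of mean-centered signals."""
    ref = ref_sig - ref_sig.mean()
    cam = cam_sig - cam_sig.mean()
    return np.correlate(ref, cam, mode="full")

def coarse_offset(ref_sig: np.ndarray, cam_sig: np.ndarray) -> int:
    """Integer frame shift k such that cam_sig[n] best matches ref_sig[n + k]."""
    corr = _xcorr(ref_sig, cam_sig)
    # In 'full' mode, lag 0 sits at index len(cam_sig) - 1.
    return int(np.argmax(corr) - (len(cam_sig) - 1))

def refine_subframe(ref_sig: np.ndarray, cam_sig: np.ndarray, k: int) -> float:
    """Refine the integer lag k to sub-frame precision by fitting a
    parabola through the correlation peak and its two neighbors."""
    corr = _xcorr(ref_sig, cam_sig)
    i = k + len(cam_sig) - 1
    if i <= 0 or i >= len(corr) - 1:
        return float(k)  # peak at a boundary: no refinement possible
    y0, y1, y2 = corr[i - 1], corr[i], corr[i + 1]
    denom = y0 - 2.0 * y1 + y2
    delta = 0.0 if denom == 0.0 else 0.5 * (y0 - y2) / denom
    return k + float(delta)
```

In practice each camera would be aligned against a chosen reference view, and the recovered sub-frame offset would then be used to resample or re-time that camera's frames before (or jointly with) the 4DGS optimization.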
Related papers
- Split4D: Decomposed 4D Scene Reconstruction Without Video Segmentation [76.21162972133534]
We represent a decomposed 4D scene with Freetime FeatureGS. We design a streaming feature learning strategy to accurately recover it from per-image segmentation maps. Experimental results on several datasets show that the reconstruction quality of our method outperforms recent methods by a large margin.
arXiv Detail & Related papers (2025-12-28T02:37:12Z) - DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images [36.562825380568384]
We introduce Driving Gaussian Grounded Transformer (DGGT), a unified framework for pose-free dynamic scene reconstruction. Our approach jointly predicts per-frame 3D Gaussian maps and camera parameters, and disentangles dynamics with a lightweight dynamic head. A diffusion-based rendering refinement further reduces motion/interpolation artifacts and improves novel-view quality under sparse inputs.
arXiv Detail & Related papers (2025-12-02T18:29:18Z) - STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution [60.06664986365803]
We present STCDiT, a video super-resolution framework built upon a pre-trained video diffusion model. It aims to restore structurally faithful and temporally stable videos from degraded inputs, even under complex camera motions.
arXiv Detail & Related papers (2025-11-24T05:37:23Z) - ShapeGen4D: Towards High Quality 4D Shape Generation from Videos [85.45517487721257]
We introduce a native video-to-4D shape generation framework that synthesizes a single dynamic 3D representation end-to-end from the video. Our method accurately captures non-rigid motion, volume changes, and even topological transitions without per-frame optimization.
arXiv Detail & Related papers (2025-10-07T17:58:11Z) - LVTINO: LAtent Video consisTency INverse sOlver for High Definition Video Restoration [3.2944592608677614]
We propose LVTINO, the first zero-shot or plug-and-play inverse solver for high-definition video restoration with priors encoded by VCMs. Our conditioning mechanism bypasses the need for automatic differentiation and achieves state-of-the-art video reconstruction quality with only a few neural function evaluations.
arXiv Detail & Related papers (2025-10-01T18:10:08Z) - 4D Driving Scene Generation With Stereo Forcing [62.47705572424127]
Current generative models struggle to synthesize dynamic 4D driving scenes that simultaneously support temporal extrapolation and spatial novel view synthesis (NVS) without per-scene optimization. We present PhiGenesis, a unified framework for 4D scene generation that extends video generation techniques with geometric and temporal consistency.
arXiv Detail & Related papers (2025-09-24T15:37:17Z) - 4DGCPro: Efficient Hierarchical 4D Gaussian Compression for Progressive Volumetric Video Streaming [52.76837132019501]
We introduce 4DGCPro, a novel hierarchical 4D compression framework. 4DGCPro facilitates real-time mobile decoding and high-quality rendering via progressive volumetric video streaming. We present an end-to-end entropy-optimized training scheme.
arXiv Detail & Related papers (2025-09-22T08:38:17Z) - Low-Cost Test-Time Adaptation for Robust Video Editing [4.707015344498921]
Video editing is a critical component of content creation that transforms raw footage into coherent works aligned with specific visual and narrative objectives. Existing approaches face two major challenges: temporal inconsistencies due to failure in capturing complex motion patterns, and overfitting to simple prompts arising from limitations in UNet backbone architectures. We present Vid-TTA, a lightweight test-time adaptation framework that personalizes optimization for each test video during inference through self-supervised auxiliary tasks.
arXiv Detail & Related papers (2025-07-29T14:31:17Z) - GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering [54.489285024494855]
Video stabilization is pivotal for video processing, as it removes unwanted shakiness while preserving the original user motion intent. Existing approaches, depending on the domain they operate in, suffer from several issues that degrade the user experience. We introduce GaVS, a novel 3D-grounded approach that reformulates video stabilization as a 'temporally-consistent local reconstruction and rendering' paradigm.
arXiv Detail & Related papers (2025-06-30T15:24:27Z) - Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video
Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame (VFI) and video super-resolution (VSR)
temporalsynthesis and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.