AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction
- URL: http://arxiv.org/abs/2601.00796v1
- Date: Fri, 02 Jan 2026 18:59:55 GMT
- Title: AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction
- Authors: Jiewen Chan, Zhenjun Zhao, Yu-Lun Liu
- Abstract summary: We propose a unified framework addressing both frequency adaptivity and temporal continuity in dynamic scene modeling. Experiments on Tap-Vid DAVIS demonstrate state-of-the-art performance.
- Score: 9.63361043358898
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reconstructing dynamic 3D scenes from monocular videos requires simultaneously capturing high-frequency appearance details and temporally continuous motion. Existing methods using single Gaussian primitives are limited by their low-pass filtering nature, while standard Gabor functions introduce energy instability. Moreover, lack of temporal continuity constraints often leads to motion artifacts during interpolation. We propose AdaGaR, a unified framework addressing both frequency adaptivity and temporal continuity in explicit dynamic scene modeling. We introduce Adaptive Gabor Representation, extending Gaussians through learnable frequency weights and adaptive energy compensation to balance detail capture and stability. For temporal continuity, we employ Cubic Hermite Splines with Temporal Curvature Regularization to ensure smooth motion evolution. An Adaptive Initialization mechanism combining depth estimation, point tracking, and foreground masks establishes stable point cloud distributions in early training. Experiments on Tap-Vid DAVIS demonstrate state-of-the-art performance (PSNR 35.49, SSIM 0.9433, LPIPS 0.0723) and strong generalization across frame interpolation, depth consistency, video editing, and stereo view synthesis. Project page: https://jiewenchan.github.io/AdaGaR/
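The two core ideas in the abstract, Gaussians extended with a learnable frequency component and Cubic Hermite Splines for temporally continuous motion, can be illustrated in one dimension. The sketch below is not the authors' implementation; the function names, the blending weight `w`, and all parameter values are hypothetical, chosen only to show how a frequency-modulated Gaussian reduces to a plain Gaussian when the frequency weight is zero, and how Hermite interpolation yields C1-continuous trajectories between keyframes.

```python
import numpy as np

def gaussian_1d(x, mu=0.0, sigma=1.0):
    """Plain Gaussian kernel: a low-pass primitive."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def gabor_1d(x, mu=0.0, sigma=1.0, freq=2.0, phase=0.0, w=0.5):
    """Gaussian envelope blended with a cosine carrier via weight w.

    w = 0 recovers the plain Gaussian (stable, low-pass);
    w = 1 is a full Gabor atom (high-frequency detail).
    """
    envelope = gaussian_1d(x, mu, sigma)
    carrier = np.cos(2.0 * np.pi * freq * (x - mu) + phase)
    return envelope * ((1.0 - w) + w * carrier)

def cubic_hermite(p0, p1, m0, m1, t):
    """Cubic Hermite interpolation between positions p0 and p1 with
    tangents m0 and m1, for t in [0, 1]. Sharing tangents between
    adjacent segments makes the trajectory C1-continuous."""
    t2, t3 = t * t, t * t * t
    h00 = 2.0 * t3 - 3.0 * t2 + 1.0
    h10 = t3 - 2.0 * t2 + t
    h01 = -2.0 * t3 + 3.0 * t2
    h11 = t3 - t2
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1
```

With zero tangents, `cubic_hermite(1.0, 3.0, 0.0, 0.0, 0.5)` gives the midpoint 2.0; the paper's Temporal Curvature Regularization would additionally penalize the second derivative of such splines, which this sketch does not implement.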
Related papers
- Towards Arbitrary Motion Completing via Hierarchical Continuous Representation [64.6525112550758]
We propose a novel parametric activation-induced hierarchical implicit representation framework, called NAME, based on Implicit Neural Representations (INRs). Our method introduces a hierarchical temporal encoding mechanism that extracts features from motion sequences at multiple temporal scales, enabling effective capture of intricate temporal patterns.
arXiv Detail & Related papers (2025-12-24T14:07:04Z)
- DiTVR: Zero-Shot Diffusion Transformer for Video Restoration [48.97196894658511]
DiTVR is a zero-shot video restoration framework that couples a diffusion transformer with trajectory-aware attention and a flow-consistent sampler. Our attention mechanism aligns tokens along optical flow trajectories, with particular emphasis on vital layers that exhibit the highest sensitivity to temporal dynamics. The flow-guided sampler injects data consistency only into low-frequency bands, preserving high-frequency priors while accelerating sampling.
arXiv Detail & Related papers (2025-08-11T09:54:45Z)
- VDEGaussian: Video Diffusion Enhanced 4D Gaussian Splatting for Dynamic Urban Scenes Modeling [68.65587507038539]
We present a novel video diffusion-enhanced 4D Gaussian Splatting framework for dynamic urban scene modeling. Our key insight is to distill robust, temporally consistent priors from a test-time adapted video diffusion model. Our method significantly enhances dynamic modeling, especially for fast-moving objects, achieving an approximate PSNR gain of 2 dB.
arXiv Detail & Related papers (2025-08-04T07:24:05Z)
- D-FCGS: Feedforward Compression of Dynamic Gaussian Splatting for Free-Viewpoint Videos [12.24209693552492]
Free-viewpoint video (FVV) enables immersive 3D experiences, but efficient compression of dynamic 3D representations remains a major challenge. This paper presents Feedforward Compression of Dynamic Gaussian Splatting (D-FCGS), a novel feedforward framework for compressing temporally correlated Gaussian point cloud sequences. Experiments show that it matches the rate-distortion performance of optimization-based methods, achieving over 40 times compression in under 2 seconds.
arXiv Detail & Related papers (2025-07-08T10:39:32Z)
- STD-GS: Exploring Frame-Event Interaction for SpatioTemporal-Disentangled Gaussian Splatting to Reconstruct High-Dynamic Scene [54.418259038624406]
Existing methods adopt a unified representation model (e.g. Gaussian) to directly match the scene from a frame camera. However, this unified paradigm fails to capture the potential temporal features of objects, owing to the frame features and the discontinuous spatial features between background and objects. In this work, we introduce an event camera to compensate for the frame camera, and propose a spatiotemporal-disentangled Gaussian splatting framework for high-dynamic scene reconstruction.
arXiv Detail & Related papers (2025-06-29T09:32:06Z)
- HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene [24.789092424634536]
We propose HAIF-GS, a unified framework that enables structured and consistent dynamic modeling through sparse anchor-driven deformation. We show that HAIF-GS significantly outperforms prior dynamic 3DGS methods in rendering quality, temporal coherence, and reconstruction efficiency.
arXiv Detail & Related papers (2025-06-11T08:45:08Z) - STDR: Spatio-Temporal Decoupling for Real-Time Dynamic Scene Rendering [15.873329633980015]
Existing 3DGS-based methods for dynamic reconstruction often suffer from spatio-temporal coupling. We propose STDR (Spatio-Temporal Decoupling for Real-time rendering), a plug-and-play module that learns spatio-temporal probability distributions for each scene.
arXiv Detail & Related papers (2025-05-28T14:26:41Z) - NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors [6.7253553166914255]
We present NormalCrafter to leverage the inherent temporal priors of video diffusion models. To secure high-fidelity normal estimation across sequences, we propose Semantic Feature Regularization. We also introduce a two-stage training protocol to preserve spatial accuracy while maintaining long temporal context.
arXiv Detail & Related papers (2025-04-15T17:39:07Z) - EvolvingGS: High-Fidelity Streamable Volumetric Video via Evolving 3D Gaussian Representation [14.402479944396665]
We introduce EvolvingGS, a two-stage strategy that first deforms the Gaussian model to align with the target frame, and then refines it with minimal point addition/subtraction. Owing to the flexibility of the incrementally evolving representation, our method outperforms existing approaches in terms of both per-frame and temporal quality metrics. Our method significantly advances the state-of-the-art in dynamic scene reconstruction, particularly for extended sequences with complex human performances.
arXiv Detail & Related papers (2025-03-07T06:01:07Z) - DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes [71.61083731844282]
We present DeSiRe-GS, a self-supervised Gaussian splatting representation. It enables effective static-dynamic decomposition and high-fidelity surface reconstruction in complex driving scenarios.
arXiv Detail & Related papers (2024-11-18T05:49:16Z) - Periodic Vibration Gaussian: Dynamic Urban Scene Reconstruction and Real-time Rendering [49.36767999382054]
We present a unified representation model, called Periodic Vibration Gaussian (PVG). PVG builds upon the efficient 3D Gaussian splatting technique, originally designed for static scene representation. PVG exhibits 900-fold acceleration in rendering over the best alternative.
arXiv Detail & Related papers (2023-11-30T13:53:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.