Broadening View Synthesis of Dynamic Scenes from Constrained Monocular Videos
- URL: http://arxiv.org/abs/2512.14406v1
- Date: Tue, 16 Dec 2025 13:43:41 GMT
- Title: Broadening View Synthesis of Dynamic Scenes from Constrained Monocular Videos
- Authors: Le Jiang, Shaotong Zhu, Yedi Luo, Shayda Moezzi, Sarah Ostadabbas
- Abstract summary: We introduce Expanded Dynamic NeRF (ExpanDyNeRF) to enable realistic synthesis under large-angle rotations. We also present the first synthetic multiview dataset for dynamic scenes with explicit side-view supervision.
- Score: 10.32344202599476
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In dynamic Neural Radiance Fields (NeRF) systems, state-of-the-art novel view synthesis methods often fail under significant viewpoint deviations, producing unstable and unrealistic renderings. To address this, we introduce Expanded Dynamic NeRF (ExpanDyNeRF), a monocular NeRF framework that leverages Gaussian splatting priors and a pseudo-ground-truth generation strategy to enable realistic synthesis under large-angle rotations. ExpanDyNeRF optimizes density and color features to improve scene reconstruction from challenging perspectives. We also present the Synthetic Dynamic Multiview (SynDM) dataset, the first synthetic multiview dataset for dynamic scenes with explicit side-view supervision, created using a custom GTA V-based rendering pipeline. Quantitative and qualitative results on SynDM and real-world datasets demonstrate that ExpanDyNeRF significantly outperforms existing dynamic NeRF methods in rendering fidelity under extreme viewpoint shifts. Further details are provided in the supplementary materials.
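The abstract describes supervising the NeRF at unseen side views with pseudo ground truth rendered from a Gaussian splatting prior. A minimal sketch of that training step, assuming hypothetical `gaussian_model`, `dynamic_nerf`, and `render` interfaces (not the authors' actual API):

```python
# Minimal sketch of pseudo-ground-truth supervision for large-angle views.
# `gaussian_model`, `dynamic_nerf`, and both render calls are hypothetical
# stand-ins for the paper's components, not the authors' released code.
import torch

def pseudo_gt_step(gaussian_model, dynamic_nerf, side_pose, t, optimizer):
    """One optimization step: supervise the NeRF at an unseen side view
    using a Gaussian-splatting rendering as pseudo ground truth."""
    with torch.no_grad():
        # The Gaussian splatting prior gives a plausible image at the novel pose.
        pseudo_gt = gaussian_model.render(side_pose, time=t)  # (H, W, 3)

    pred = dynamic_nerf.render(side_pose, time=t)             # (H, W, 3)
    loss = torch.nn.functional.mse_loss(pred, pseudo_gt)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```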
Related papers
- CoDe-NeRF: Neural Rendering via Dynamic Coefficient Decomposition [28.96821860867129]
We present a neural rendering framework based on dynamic coefficient decomposition. Our approach decomposes complex appearance into a shared, static neural basis that encodes intrinsic material properties, modulated by dynamic coefficients. We show that our method can produce sharper and more realistic specular highlights than existing techniques.
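A hedged sketch of what a dynamic coefficient decomposition can look like: a static basis network shared across conditions, blended by condition-dependent coefficients. The layer sizes, the `cond` input, and the module name are illustrative assumptions, not the paper's design:

```python
# Appearance features as a static, shared basis weighted by dynamic
# coefficients. Shapes and inputs are illustrative, not CoDe-NeRF's exact design.
import torch
import torch.nn as nn

class CoefficientDecomposedAppearance(nn.Module):
    def __init__(self, pos_dim=3, cond_dim=8, n_basis=16, feat_dim=32):
        super().__init__()
        # Static basis: encodes intrinsic material properties per point.
        self.basis = nn.Sequential(
            nn.Linear(pos_dim, 64), nn.ReLU(),
            nn.Linear(64, n_basis * feat_dim))
        # Dynamic coefficients: depend on a per-frame condition (e.g. lighting).
        self.coeffs = nn.Sequential(
            nn.Linear(cond_dim, 64), nn.ReLU(),
            nn.Linear(64, n_basis))
        self.n_basis, self.feat_dim = n_basis, feat_dim

    def forward(self, x, cond):
        B = self.basis(x).view(-1, self.n_basis, self.feat_dim)
        c = self.coeffs(cond).view(-1, self.n_basis, 1)
        return (c * B).sum(dim=1)  # blended appearance feature per point

feats = CoefficientDecomposedAppearance()(torch.rand(4, 3), torch.rand(4, 8))
```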
arXiv Detail & Related papers (2025-08-08T18:30:02Z)
- Enhanced Velocity Field Modeling for Gaussian Video Reconstruction [21.54297055995746]
High-fidelity 3D video reconstruction is essential for enabling real-time rendering of dynamic scenes with realistic motion in virtual and augmented reality (VR/AR). We propose a flow-empowered velocity field modeling scheme tailored for Gaussian video reconstruction, dubbed FlowGaussian-VR. It consists of two core components: a velocity field rendering (VFR) pipeline which enables optical flow-based optimization, and a flow-assisted adaptive densification (FAD) strategy that adjusts the number and size of Gaussians in dynamic regions.
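A sketch of the optical-flow supervision the summary hints at: project per-Gaussian velocities to screen space, splat them with the same per-pixel weights used for color, and match precomputed optical flow. The tensor layout and finite-difference projection are assumptions, not FlowGaussian-VR's exact pipeline:

```python
import torch

def velocity_flow_loss(means3d, velocities3d, weights2d, K, flow_gt):
    """means3d, velocities3d: (N, 3) Gaussian centers/velocities in camera
    coordinates; K: (3, 3) intrinsics; weights2d: (H*W, N) per-pixel splat
    weights; flow_gt: (H*W, 2) precomputed optical flow."""
    # Finite-difference projection: pixel positions now vs. one frame later.
    p0 = (K @ means3d.T).T
    p1 = (K @ (means3d + velocities3d).T).T
    vel2d = p1[:, :2] / p1[:, 2:3] - p0[:, :2] / p0[:, 2:3]  # (N, 2)
    # Splat screen-space velocities with the same weights used for color.
    flow_pred = weights2d @ vel2d                             # (H*W, 2)
    return torch.nn.functional.l1_loss(flow_pred, flow_gt)
```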
arXiv Detail & Related papers (2025-07-31T16:26:22Z)
- HoliGS: Holistic Gaussian Splatting for Embodied View Synthesis [59.25751939710903]
We propose a novel deformable Gaussian splatting framework that addresses embodied view synthesis (EVS) from long monocular RGB videos. Our method leverages invertible Gaussian splatting deformation networks to reconstruct large-scale, dynamic environments accurately. Results highlight a practical and scalable solution for EVS in real-world scenarios.
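The summary stresses *invertible* deformation networks. One standard way to guarantee exact invertibility is additive coupling (RealNVP-style); the sketch below shows that idea as a generic stand-in, not HoliGS's actual architecture:

```python
# Additive coupling: shift one coordinate subset by an MLP of the rest, so the
# map inverts exactly by subtracting the same shift. Generic sketch only.
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    def __init__(self, t_dim=1):
        super().__init__()
        self.shift = nn.Sequential(nn.Linear(2 + t_dim, 32), nn.ReLU(),
                                   nn.Linear(32, 1))

    def forward(self, xyz, t):
        xy, z = xyz[:, :2], xyz[:, 2:]
        return torch.cat([xy, z + self.shift(torch.cat([xy, t], -1))], -1)

    def inverse(self, xyz, t):
        xy, z = xyz[:, :2], xyz[:, 2:]
        return torch.cat([xy, z - self.shift(torch.cat([xy, t], -1))], -1)

warp = AdditiveCoupling()
x, t = torch.rand(5, 3), torch.rand(5, 1)
assert torch.allclose(warp.inverse(warp(x, t), t), x, atol=1e-6)
```

In practice several coupling layers with permuted coordinate splits would be stacked so every coordinate can move.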
arXiv Detail & Related papers (2025-06-24T03:54:40Z)
- DGS-LRM: Real-Time Deformable 3D Gaussian Reconstruction From Monocular Videos [52.46386528202226]
We introduce the Deformable Gaussian Splats Large Reconstruction Model (DGS-LRM), the first feed-forward method predicting deformable 3D Gaussian splats from a monocular posed video of any dynamic scene. It achieves performance on par with state-of-the-art monocular video 3D tracking methods.
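A loose sketch of a feed-forward predictor in this spirit: a network maps frames directly to per-pixel Gaussian parameters plus a velocity channel for deformation. The tiny CNN and 17-channel layout are assumptions for illustration; DGS-LRM itself is a large transformer-based model:

```python
import torch
import torch.nn as nn

class FeedForwardGaussians(nn.Module):
    # Per pixel: 3 (mean offset) + 3 (scale) + 4 (rotation) + 3 (color)
    # + 1 (opacity) + 3 (velocity for deformation) = 17 channels.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 17, 3, padding=1))

    def forward(self, frames):          # frames: (B, 3, H, W)
        out = self.net(frames)          # one Gaussian per pixel
        mean, scale, rot, color, opacity, vel = torch.split(
            out, [3, 3, 4, 3, 1, 3], dim=1)
        return dict(mean=mean, scale=scale.exp(), rot=rot,
                    color=color.sigmoid(), opacity=opacity.sigmoid(), vel=vel)

params = FeedForwardGaussians()(torch.rand(1, 3, 64, 64))
```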
arXiv Detail & Related papers (2025-06-11T17:59:58Z)
- MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction [44.592566642185425]
MuDG is an innovative framework that integrates a multi-modal diffusion model with Gaussian splatting (GS) for urban scene reconstruction. We show that MuDG outperforms existing methods in both reconstruction and photorealistic synthesis quality.
arXiv Detail & Related papers (2025-03-13T17:48:41Z)
- NeRF-Casting: Improved View-Dependent Appearance with Consistent Reflections [57.63028964831785]
Recent works have improved NeRF's ability to render detailed specular appearance of distant environment illumination, but are unable to synthesize consistent reflections of closer content.
We address these issues with an approach based on ray tracing.
Instead of querying an expensive neural network for the outgoing view-dependent radiance at points along each camera ray, our model casts rays from these points and traces them through the NeRF representation to render feature vectors.
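The geometric core is easy to sketch: reflect the camera direction about the surface normal and volume-render features along the reflected ray. Here `query_field` is a hypothetical density/feature lookup, not NeRF-Casting's actual renderer:

```python
import torch

def reflect(d, n):
    """Reflect ray direction d about unit normal n: r = d - 2(d.n)n."""
    return d - 2.0 * (d * n).sum(-1, keepdim=True) * n

def trace_reflection(query_field, x, d, n, n_samples=64, far=4.0):
    r = reflect(d, n)                                   # (B, 3)
    ts = torch.linspace(1e-3, far, n_samples)           # sample depths
    pts = x[:, None, :] + ts[None, :, None] * r[:, None, :]
    sigma, feat = query_field(pts)                      # (B, S), (B, S, F)
    # Standard volume-rendering weights along the reflected ray.
    dt = ts[1] - ts[0]
    alpha = 1.0 - torch.exp(-sigma * dt)
    trans = torch.cumprod(torch.cat(
        [torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], -1), -1)[:, :-1]
    w = alpha * trans
    return (w[..., None] * feat).sum(dim=1)             # accumulated feature
```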
arXiv Detail & Related papers (2024-05-23T17:59:57Z)
- NeRF-VPT: Learning Novel View Representations with Neural Radiance Fields via View Prompt Tuning [63.39461847093663]
We propose NeRF-VPT, an innovative method for novel view synthesis.
NeRF-VPT employs a cascading view prompt tuning paradigm, wherein RGB information from preceding rendering outcomes serves as instructive visual prompts for subsequent rendering stages.
NeRF-VPT only requires sampling RGB data from previous stage renderings as priors at each training stage, without relying on extra guidance or complex techniques.
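A minimal sketch of the cascade, assuming each stage is a renderer that accepts the previous stage's RGB output as a prompt (the `stages` interface is a hypothetical stand-in):

```python
import torch

def cascaded_render(stages, pose, init_prompt=None):
    """stages: list of callables render(pose, prompt) -> (H, W, 3)."""
    prompt = init_prompt  # the first stage may run without a prompt
    for render in stages:
        rgb = render(pose, prompt)
        prompt = rgb.detach()  # previous output becomes the next stage's prompt
    return rgb
```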
arXiv Detail & Related papers (2024-03-02T22:08:10Z)
- NeVRF: Neural Video-based Radiance Fields for Long-duration Sequences [53.8501224122952]
We propose a novel neural video-based radiance field (NeVRF) representation.
NeVRF marries neural radiance field with image-based rendering to support photo-realistic novel view synthesis on long-duration dynamic inward-looking scenes.
Our experiments demonstrate the effectiveness of NeVRF in enabling long-duration sequence rendering, sequential data reconstruction, and compact data storage.
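A rough sketch of marrying a radiance field with image-based rendering: blend field-predicted color with colors reprojected from nearby source frames. The cosine-similarity weighting and fixed blend factor are simplifying assumptions, not NeVRF's learned scheme:

```python
import torch

def blend_nerf_ibr(c_field, c_warped, ray_dir, src_dirs):
    """c_field: (B, 3) field color; c_warped: (B, V, 3) colors reprojected
    from V nearby video frames; src_dirs: (B, V, 3) unit source-view dirs."""
    # Favor source views whose direction agrees with the target ray.
    sim = (src_dirs * ray_dir[:, None, :]).sum(-1)      # (B, V) cosine
    w = torch.softmax(sim * 10.0, dim=-1)               # sharpened weights
    c_ibr = (w[..., None] * c_warped).sum(dim=1)        # (B, 3)
    return 0.5 * c_field + 0.5 * c_ibr                  # fixed blend for sketch
```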
arXiv Detail & Related papers (2023-12-10T11:14:30Z)
- Anisotropic Neural Representation Learning for High-Quality Neural Rendering [0.0]
We propose an anisotropic neural representation learning method that utilizes learnable view-dependent features to improve scene representation and reconstruction.
Our method is flexible and can be plugged into NeRF-based frameworks.
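One common way to realize learnable view-dependent (anisotropic) features is to predict per-point spherical-harmonic coefficients and evaluate them at the view direction; the degree-1 sketch below is illustrative, not the paper's exact parameterization:

```python
import torch
import torch.nn as nn

class ViewDependentFeatures(nn.Module):
    def __init__(self, feat_dim=16):
        super().__init__()
        # 4 coefficients per feature channel: degree-0 and degree-1 SH.
        self.coef = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                  nn.Linear(64, 4 * feat_dim))
        self.feat_dim = feat_dim

    def forward(self, x, d):            # x: (B, 3) points, d: (B, 3) unit dirs
        c = self.coef(x).view(-1, self.feat_dim, 4)
        # SH basis up to degree 1 (normalization constants folded in).
        basis = torch.stack([torch.ones_like(d[:, 0]),
                             d[:, 0], d[:, 1], d[:, 2]], dim=-1)  # (B, 4)
        return (c * basis[:, None, :]).sum(-1)           # (B, feat_dim)

dirs = torch.nn.functional.normalize(torch.rand(4, 3), dim=-1)
out = ViewDependentFeatures()(torch.rand(4, 3), dirs)
```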
arXiv Detail & Related papers (2023-11-30T07:29:30Z)
- EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision [85.17951804790515]
EmerNeRF is a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes.
It simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping.
Our method achieves state-of-the-art performance in sensor simulation.
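A sketch of the static/dynamic decomposition idea: composite a time-independent field with a time-dependent one and let the split emerge from the reconstruction loss alone. The field interfaces here are placeholders, not EmerNeRF's networks:

```python
import torch

def composite_fields(static_field, dynamic_field, x, t):
    """Combine static and dynamic densities/colors at points x and time t."""
    sigma_s, rgb_s = static_field(x)          # time-independent
    sigma_d, rgb_d = dynamic_field(x, t)      # time-dependent
    sigma = sigma_s + sigma_d
    # Density-weighted color mix; the decomposition emerges from
    # reconstruction alone, without motion or semantic labels.
    w_s = sigma_s / (sigma + 1e-8)
    rgb = w_s[..., None] * rgb_s + (1 - w_s)[..., None] * rgb_d
    return sigma, rgb
```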
arXiv Detail & Related papers (2023-11-03T17:59:55Z)