4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization
- URL: http://arxiv.org/abs/2411.08879v1
- Date: Wed, 13 Nov 2024 18:56:39 GMT
- Title: 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization
- Authors: Mijeong Kim, Jongwoo Lim, Bohyung Han
- Abstract summary: We propose a novel 4D Gaussian Splatting (4DGS) algorithm for dynamic scenes from casually recorded monocular videos.
Our experiments show that the proposed method improves the performance of 4DGS reconstruction from a video captured by a handheld monocular camera.
- Score: 43.81271239333774
- Abstract: Novel view synthesis of dynamic scenes is becoming important in various applications, including augmented and virtual reality. We propose a novel 4D Gaussian Splatting (4DGS) algorithm for dynamic scenes from casually recorded monocular videos. To overcome the overfitting problem of existing work for these real-world videos, we introduce an uncertainty-aware regularization that identifies uncertain regions with few observations and selectively imposes additional priors based on diffusion models and depth smoothness on such regions. This approach improves both the performance of novel view synthesis and the quality of training image reconstruction. We also identify the initialization problem of 4DGS in fast-moving dynamic regions, where the Structure from Motion (SfM) algorithm fails to provide reliable 3D landmarks. To initialize Gaussian primitives in such regions, we present a dynamic region densification method using the estimated depth maps and scene flow. Our experiments show that the proposed method improves the performance of 4DGS reconstruction from a video captured by a handheld monocular camera and also exhibits promising results in few-shot static scene reconstruction.
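The abstract describes the regularization only at a high level, so the following is a minimal sketch under stated assumptions, not the authors' formulation. It assumes that uncertainty can be approximated from a per-pixel observation count and that the depth smoothness prior is a simple neighbour-difference penalty. The function names `uncertainty_from_counts` and `weighted_depth_smoothness` and the mapping `1 / (1 + count)` are hypothetical, and the diffusion-based prior is omitted entirely.

```python
from typing import List

Grid = List[List[float]]


def uncertainty_from_counts(counts: Grid) -> Grid:
    """Hypothetical mapping: fewer observations give higher uncertainty in (0, 1]."""
    return [[1.0 / (1.0 + c) for c in row] for row in counts]


def weighted_depth_smoothness(depth: Grid, uncertainty: Grid) -> float:
    """Mean absolute depth difference between neighbouring pixels,
    weighted by the average uncertainty of each pixel pair.

    Well-observed (low-uncertainty) regions contribute little, so the
    smoothness prior acts mainly where observations are scarce.
    """
    h, w = len(depth), len(depth[0])
    total, pairs = 0.0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal neighbour
                wgt = 0.5 * (uncertainty[y][x] + uncertainty[y][x + 1])
                total += wgt * abs(depth[y][x] - depth[y][x + 1])
                pairs += 1
            if y + 1 < h:  # vertical neighbour
                wgt = 0.5 * (uncertainty[y][x] + uncertainty[y + 1][x])
                total += wgt * abs(depth[y][x] - depth[y + 1][x])
                pairs += 1
    return total / pairs if pairs else 0.0
```

In this sketch the same depth discontinuity is penalized less as the observation count grows, which mirrors the idea of imposing the prior selectively on uncertain regions.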
Related papers
- 4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives [116.2042238179433]
In this paper, we frame dynamic scenes as unconstrained 4D volume learning problems.
We represent a target dynamic scene using a collection of 4D Gaussian primitives with explicit geometry and appearance features.
This approach can capture relevant information in space and time by fitting the underlying spatio-temporal volume.
Notably, our 4DGS model is the first solution that supports real-time rendering of high-resolution, novel views for complex dynamic scenes.
arXiv Detail & Related papers (2024-12-30T05:30:26Z)
- Urban4D: Semantic-Guided 4D Gaussian Splatting for Urban Scene Reconstruction [86.4386398262018]
Urban4D is a semantic-guided decomposition strategy inspired by advances in deep 2D semantic map generation.
Our approach distinguishes potentially dynamic objects through reliable semantic Gaussians.
Experiments on real-world datasets demonstrate that Urban4D achieves comparable or better quality than previous state-of-the-art methods.
arXiv Detail & Related papers (2024-12-04T16:59:49Z)
- Dynamics-Aware Gaussian Splatting Streaming Towards Fast On-the-Fly Training for 4D Reconstruction [12.111389926333592]
Current 3DGS-based streaming methods treat the Gaussian primitives uniformly and constantly renew the densified Gaussians.
We propose a novel three-stage pipeline for iterative streamable 4D dynamic spatial reconstruction.
Our method achieves state-of-the-art performance in online 4D reconstruction, demonstrating a 20% improvement in on-the-fly training speed, superior representation quality, and real-time rendering capability.
arXiv Detail & Related papers (2024-11-22T10:47:47Z)
- SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction [24.33543853742041]
3D Gaussian Splatting (3DGS) has emerged as a practical and scalable reconstruction method.
We propose an optimization strategy that effectively regularizes splat features by modeling them as the outputs of a corresponding implicit neural field.
Our approach effectively handles static and dynamic cases, as demonstrated by extensive testing across different setups and scene complexities.
arXiv Detail & Related papers (2024-09-17T14:04:20Z)
- MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds [27.802537831023347]
We introduce 4D Motion Scaffolds (MoSca), a modern 4D reconstruction system designed to reconstruct and synthesize novel views of dynamic scenes from monocular videos captured casually in the wild.
Experiments demonstrate state-of-the-art performance on dynamic rendering benchmarks and its effectiveness on real videos.
arXiv Detail & Related papers (2024-05-27T17:59:07Z)
- SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer [57.506654943449796]
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance.
Our method surpasses existing methods in both quality and efficiency.
We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z)
- Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction [89.53963284958037]
We propose a novel motion-aware enhancement framework for dynamic scene reconstruction.
Specifically, we first establish a correspondence between 3D Gaussian movements and pixel-level flow.
For the prevalent deformation-based paradigm that presents a harder optimization problem, a transient-aware deformation auxiliary module is proposed.
arXiv Detail & Related papers (2024-03-18T03:46:26Z)
- DRSM: efficient neural 4d decomposition for dynamic reconstruction in stationary monocular cameras [21.07910546072467]
We present a novel framework to tackle the 4D decomposition problem for dynamic scenes captured by monocular cameras.
Our framework utilizes decomposed static and dynamic feature planes to represent 4D scenes and emphasizes the learning of dynamic regions through dense ray casting.
arXiv Detail & Related papers (2024-02-01T16:38:51Z)
- Diffusion Priors for Dynamic View Synthesis from Monocular Videos [59.42406064983643]
Dynamic novel view synthesis aims to capture the temporal evolution of visual content within videos.
We first finetune a pretrained RGB-D diffusion model on the video frames using a customization technique.
We distill the knowledge from the finetuned model to a 4D representation encompassing both dynamic and static Neural Radiance Fields.
arXiv Detail & Related papers (2024-01-10T23:26:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.