RoDyGS: Robust Dynamic Gaussian Splatting for Casual Videos
- URL: http://arxiv.org/abs/2412.03077v1
- Date: Wed, 04 Dec 2024 07:02:49 GMT
- Title: RoDyGS: Robust Dynamic Gaussian Splatting for Casual Videos
- Authors: Yoonwoo Jeong, Junmyeong Lee, Hoseung Choi, Minsu Cho
- Abstract summary: We present RoDyGS, an optimization pipeline for dynamic Gaussian Splatting from casual videos.
It effectively learns motion and underlying geometry of scenes by separating dynamic and static primitives.
We also introduce a comprehensive benchmark, Kubric-MRig, that provides extensive camera and object motion along with simultaneous multi-view captures.
- Score: 39.384910552854926
- Abstract: Dynamic view synthesis (DVS) has advanced remarkably in recent years, achieving high-fidelity rendering while reducing computational costs. Despite the progress, optimizing dynamic neural fields from casual videos remains challenging, as these videos do not provide direct 3D information, such as camera trajectories or the underlying scene geometry. In this work, we present RoDyGS, an optimization pipeline for dynamic Gaussian Splatting from casual videos. It effectively learns motion and underlying geometry of scenes by separating dynamic and static primitives, and ensures that the learned motion and geometry are physically plausible by incorporating motion and geometric regularization terms. We also introduce a comprehensive benchmark, Kubric-MRig, that provides extensive camera and object motion along with simultaneous multi-view captures, features that are absent in previous benchmarks. Experimental results demonstrate that the proposed method significantly outperforms previous pose-free dynamic neural fields and achieves competitive rendering quality compared to existing pose-free static neural fields. The code and data are publicly available at https://rodygs.github.io/.
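The abstract describes two mechanisms, separating dynamic from static primitives and adding motion and geometric regularization, but does not give the loss formulations. The following is a minimal PyTorch sketch of such regularizers under assumed forms (a temporal-smoothness penalty on dynamic Gaussian trajectories and a depth-prior consistency term); the function name, tensor shapes, and weights are illustrative and are not RoDyGS's actual implementation.

```python
import torch

def motion_geometry_regularizer(dyn_xyz_t, rendered_depth, prior_depth,
                                w_motion=0.1, w_geom=0.05):
    """Hypothetical sketch of motion + geometric regularization.

    dyn_xyz_t:      (T, Nd, 3) centers of the *dynamic* Gaussians sampled at
                    T >= 3 timesteps (static Gaussians need no motion term).
    rendered_depth: (H, W) depth rendered from the current Gaussians.
    prior_depth:    (H, W) depth prior (e.g. from a monocular depth network).
    """
    # Motion term: penalize acceleration of dynamic centers so the learned
    # trajectories stay temporally smooth (physically plausible motion).
    velocity = dyn_xyz_t[1:] - dyn_xyz_t[:-1]        # (T-1, Nd, 3)
    acceleration = velocity[1:] - velocity[:-1]      # (T-2, Nd, 3)
    motion_reg = acceleration.norm(dim=-1).mean()

    # Geometric term: keep rendered depth close to a prior so the recovered
    # geometry remains plausible even without known camera poses.
    geom_reg = (rendered_depth - prior_depth).abs().mean()

    return w_motion * motion_reg + w_geom * geom_reg
```

In a full pipeline this term would be added to the photometric rendering loss, with primitives assigned to the dynamic or static set by some motion cue; the abstract does not specify which criterion RoDyGS uses.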
Related papers
- GaussianVideo: Efficient Video Representation via Hierarchical Gaussian Splatting [28.981174430968643]
We introduce a novel neural representation that combines 3D Gaussian splatting with continuous camera motion modeling.
Experimental results show that our hierarchical learning, combined with robust camera motion modeling, captures complex dynamic scenes with strong temporal consistency.
This memory-efficient approach achieves high-quality rendering at impressive speeds.
arXiv Detail & Related papers (2025-01-08T19:01:12Z)
- SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video [26.468480933928458]
We propose SplineGS, a COLMAP-free dynamic 3D Gaussian Splatting framework for high-quality reconstruction and fast rendering from monocular videos.
At its core is a novel Motion-Adaptive Spline (MAS) method, which represents continuous dynamic 3D Gaussian trajectories (see the illustrative spline sketch after this list).
We present a joint optimization strategy for camera parameter estimation and 3D Gaussian attributes, leveraging photometric and geometric consistency.
arXiv Detail & Related papers (2024-12-13T09:09:14Z)
- DynSUP: Dynamic Gaussian Splatting from An Unposed Image Pair [41.78277294238276]
We propose a method that can use only two images without prior poses to fit Gaussians in dynamic environments.
This strategy decomposes dynamic scenes into piece-wise rigid components, and jointly estimates the camera pose and motions of dynamic objects.
Experiments on both synthetic and real-world datasets demonstrate that our method significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2024-12-01T15:25:33Z)
- D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video [53.83936023443193]
This paper contributes to the field by introducing a new method for dynamic novel view synthesis from monocular video, such as smartphone captures.
Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point cloud that encodes local geometry and appearance in separate hash-encoded neural feature grids.
arXiv Detail & Related papers (2024-06-14T14:35:44Z)
- MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos [65.31707882676292]
MoDGS is a new pipeline to render novel views of dynamic scenes from a casually captured monocular video.
Experiments demonstrate MoDGS is able to render high-quality novel view images of dynamic scenes from just a casually captured monocular video.
arXiv Detail & Related papers (2024-06-01T13:20:46Z)
- 4D-Rotor Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes [33.14021987166436]
We introduce 4DRotorGS, a novel method that represents dynamic scenes with anisotropic 4D XYZT Gaussians.
As an explicit spatial-temporal representation, 4DRotorGS demonstrates powerful capabilities for modeling complicated dynamics and fine details.
We further implement our temporal slicing and acceleration framework, achieving real-time rendering speeds of up to 277 FPS on an RTX 3090 GPU and 583 FPS on an RTX 4090 GPU.
arXiv Detail & Related papers (2024-02-05T18:59:04Z)
- Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis [58.5779956899918]
We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements.
We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians.
We demonstrate a large number of downstream applications enabled by our representation, including first-person view synthesis, dynamic compositional scene synthesis, and 4D video editing.
arXiv Detail & Related papers (2023-08-18T17:59:21Z)
- Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via a neural feature renderer.
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
arXiv Detail & Related papers (2022-04-22T03:17:35Z)
- NeuralDiff: Segmenting 3D objects that move in egocentric videos [92.95176458079047]
We study the problem of decomposing the observed 3D scene into a static background and a dynamic foreground.
This task is reminiscent of the classic background subtraction problem, but is significantly harder because all parts of the scene, static and dynamic, generate a large apparent motion.
In particular, we consider egocentric videos and further separate the dynamic component into objects and the actor that observes and moves them.
arXiv Detail & Related papers (2021-10-19T12:51:35Z)
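As noted in the SplineGS entry above, representing each dynamic Gaussian's center as a spline over time lets the trajectory be evaluated at arbitrary timestamps from a few control points. The sketch below uses a uniform Catmull-Rom cubic spline purely as an illustration; SplineGS's actual Motion-Adaptive Spline formulation, and how it places or adapts control points, is not described in the summary, so all names and details here are assumptions.

```python
import torch

def spline_centers(ctrl_pts, t):
    """Evaluate per-Gaussian center trajectories at normalized time t in [0, 1].

    ctrl_pts: (N, K, 3) tensor with K >= 4 control points per dynamic Gaussian.
    t:        scalar (float or 0-dim tensor), normalized time.
    Returns:  (N, 3) interpolated Gaussian centers at time t.
    """
    k = ctrl_pts.shape[1]
    t = torch.as_tensor(t, dtype=ctrl_pts.dtype)
    # Map t onto one of the K-3 interior segments of the spline.
    seg = torch.clamp((t * (k - 3)).floor().long(), 0, k - 4)
    u = t * (k - 3) - seg.to(ctrl_pts.dtype)          # local parameter in [0, 1]
    p0, p1, p2, p3 = (ctrl_pts[:, seg + i] for i in range(4))
    # Uniform Catmull-Rom basis: interpolates from p1 to p2 with C1 continuity.
    return 0.5 * (2 * p1
                  + (-p0 + p2) * u
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * u ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * u ** 3)
```

Parameterizing motion this way keeps the number of optimizable parameters small (K control points per Gaussian rather than one position per frame), which is the usual motivation for spline-based trajectory models; the choice of K is a smoothness-versus-flexibility trade-off.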