MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos
- URL: http://arxiv.org/abs/2406.00434v2
- Date: Tue, 29 Oct 2024 09:50:00 GMT
- Title: MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos
- Authors: Qingming Liu, Yuan Liu, Jiepeng Wang, Xianqiang Lyv, Peng Wang, Wenping Wang, Junhui Hou
- Abstract summary: MoDGS is a new pipeline to render novel views of dynamic scenes from a casually captured monocular video.
Experiments demonstrate MoDGS is able to render high-quality novel view images of dynamic scenes from just a casually captured monocular video.
- Score: 65.31707882676292
- License:
- Abstract: In this paper, we propose MoDGS, a new pipeline to render novel views of dynamic scenes from a casually captured monocular video. Previous monocular dynamic NeRF or Gaussian Splatting methods strongly rely on the rapid movement of input cameras to construct multiview consistency but struggle to reconstruct dynamic scenes on casually captured input videos whose cameras are either static or move slowly. To address this challenging task, MoDGS adopts recent single-view depth estimation methods to guide the learning of the dynamic scene. Then, a novel 3D-aware initialization method is proposed to learn a reasonable deformation field and a new robust depth loss is proposed to guide the learning of dynamic scene geometry. Comprehensive experiments demonstrate that MoDGS is able to render high-quality novel view images of dynamic scenes from just a casually captured monocular video, which outperforms state-of-the-art methods by a significant margin. The code will be publicly available.
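Single-view depth predictions are only defined up to an unknown per-frame scale and shift, so depth-guided supervision of this kind typically aligns the prior to the rendered geometry before penalising residuals. The abstract does not give the exact form of MoDGS's robust depth loss; the snippet below is a minimal PyTorch sketch of one common scale- and shift-invariant formulation, and all names and details are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch: supervising rendered depth with a single-view depth prior.
# MoDGS's actual robust depth loss is not specified in the abstract; this is one
# common scale- and shift-invariant formulation, shown for illustration only.
import torch


def scale_shift_invariant_depth_loss(rendered_depth: torch.Tensor,
                                     mono_depth: torch.Tensor,
                                     eps: float = 1e-6) -> torch.Tensor:
    """L1 depth loss after aligning the monocular prior to the rendered depth.

    Both inputs are (H, W) depth maps. `mono_depth` comes from a single-view
    depth estimator and is only defined up to an affine transform, so it is
    first fitted to the rendered depth with a per-image scale and shift.
    """
    d = mono_depth.flatten()
    r = rendered_depth.flatten().detach()  # fit the prior to the current geometry

    # Closed-form least-squares fit r ≈ s * d + t.
    d_mean, r_mean = d.mean(), r.mean()
    cov = ((d - d_mean) * (r - r_mean)).mean()
    var = (d - d_mean).pow(2).mean()
    s = cov / (var + eps)
    t = r_mean - s * d_mean

    aligned = s * mono_depth + t
    return (rendered_depth - aligned).abs().mean()
```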
Related papers
- Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos [101.48581851337703]
We present BTimer, the first motion-aware feed-forward model for real-time reconstruction and novel view synthesis of dynamic scenes.
Our approach reconstructs the full scene in a 3D Gaussian Splatting representation at a given target ('bullet') timestamp by aggregating information from all the context frames.
Given a casual monocular dynamic video, BTimer reconstructs a bullet-time scene within 150ms while reaching state-of-the-art performance on both static and dynamic scene datasets.
arXiv Detail & Related papers (2024-12-04T18:15:06Z)
- RoDyGS: Robust Dynamic Gaussian Splatting for Casual Videos [39.384910552854926]
We present RoDyGS, an optimization pipeline for dynamic Gaussian Splatting from casual videos.
It effectively learns motion and underlying geometry of scenes by separating dynamic and static primitives.
We also introduce a comprehensive benchmark, Kubric-MRig, that provides extensive camera and object motion along with simultaneous multi-view captures.
arXiv Detail & Related papers (2024-12-04T07:02:49Z)
- Shape of Motion: 4D Reconstruction from a Single Video [51.04575075620677]
We introduce a method capable of reconstructing generic dynamic scenes, featuring explicit, full-sequence-long 3D motion.
We exploit the low-dimensional structure of 3D motion by representing scene motion with a compact set of SE3 motion bases (see the sketch after this list).
Our method achieves state-of-the-art performance for both long-range 3D/2D motion estimation and novel view synthesis on dynamic scenes.
arXiv Detail & Related papers (2024-07-18T17:59:08Z)
- Modeling Ambient Scene Dynamics for Free-view Synthesis [31.233859111566613]
We introduce a novel method for dynamic free-view synthesis of ambient scenes from a monocular capture.
Our method builds upon the recent advancements in 3D Gaussian Splatting (3DGS) that can faithfully reconstruct complex static scenes.
arXiv Detail & Related papers (2024-06-13T17:59:11Z)
- MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds [27.802537831023347]
We introduce 4D Motion Scaffolds (MoSca), a modern 4D reconstruction system designed to reconstruct and synthesize novel views of dynamic scenes from monocular videos captured casually in the wild.
Experiments demonstrate state-of-the-art performance on dynamic rendering benchmarks and its effectiveness on real videos.
arXiv Detail & Related papers (2024-05-27T17:59:07Z)
- Decoupling Dynamic Monocular Videos for Dynamic View Synthesis [50.93409250217699]
We tackle the challenge of dynamic view synthesis from dynamic monocular videos in an unsupervised fashion.
Specifically, we decouple the motion of the dynamic objects into object motion and camera motion, respectively regularized by proposed unsupervised surface consistency and patch-based multi-view constraints.
arXiv Detail & Related papers (2023-04-04T11:25:44Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video [76.19076002661157]
Non-Rigid Neural Radiance Fields (NR-NeRF) is a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes.
We show that even a single consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views.
arXiv Detail & Related papers (2020-12-22T18:46:12Z)
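The "compact set of SE3 motion bases" mentioned in the Shape of Motion entry above suggests a skinning-style parameterisation in which each point's trajectory is a weighted blend of a few time-varying rigid transforms. The sketch below illustrates that idea under this assumption; the paper's actual blending (for example, proper SE(3) interpolation rather than blending transformed positions) may differ, and all names are illustrative.

```python
# Hypothetical sketch of "motion bases": each point's position at time t is a
# convex combination of a small set of rigid transforms (linear blend skinning).
import torch


def blend_motion_bases(points: torch.Tensor,        # (N, 3) canonical positions
                       weights: torch.Tensor,       # (N, B) per-point basis logits
                       rotations: torch.Tensor,     # (B, 3, 3) basis rotations at time t
                       translations: torch.Tensor,  # (B, 3) basis translations at time t
                       ) -> torch.Tensor:
    """Warp canonical points to time t by blending B rigid motions."""
    # Apply every basis transform to every point: result has shape (B, N, 3).
    transformed = torch.einsum('bij,nj->bni', rotations, points) + translations[:, None, :]
    # Normalise the per-point weights and take a convex combination over bases.
    w = torch.softmax(weights, dim=1)                # (N, B)
    return torch.einsum('nb,bni->ni', w, transformed)
```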
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented here and is not responsible for any consequences of its use.